[EMBOSS] Remote Databases and possibly drcat

David Bauer david.bauer at bayer.com
Fri Dec 5 07:50:36 UTC 2014


Hi Bruce,

GenBank is accessed via NCBI Entrez.
With "showserver" you can see where the server configuration file is located (the default is /usr/local/emboss/share/EMBOSS/server.entrez).
That file shows which NCBI databases are configured and which fields can be queried.
So to get a RefSeq entry you can use: "entret entrez:nucleotide:NM_....."
For unknown reasons, on my system it takes about a minute before the entry is returned. This may be related to our firewall.

I therefore use an alternative solution for NCBI.
I have configured this database definition:
--------------------
DB ncbin [
        type: N
        format: genbank
        method: app
        app: "/usr/local/emboss/bin/ncbi_fetch nucleotide:%s"
        comment: "NCBI GenBank Nucleotide" ]
---------------------
ncbi_fetch is a simple Perl script that uses wget to call the NCBI E-utilities:
-------------------------------
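#!/usr/local/bin/perl
# Called by EMBOSS (entret or seqret) with %s expanded, e.g. "nucleotide:<id>".
use strict;
use warnings;
# Drop the "nucleotide:" prefix from the app: line, if present, so only the id remains.
(my $id = $ARGV[0]) =~ s/^nucleotide://;
# efetch selects plain-text output with retmode=text (retformat is not an efetch parameter).
my $entry = `/usr/local/bin/wget --wait=3 --waitretry=3 -q --cache=off --timeout=10 -O - "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=$id&rettype=gb&retmode=text"`;
print $entry;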
--------------------------------
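
To check what such a wrapper would request without touching the network, the mapping from the EMBOSS argument to the efetch URL can be sketched in plain shell. This is only an illustration: build_efetch_url is a hypothetical helper name (not part of EMBOSS or any NCBI tool), EXAMPLE_ID is a made-up placeholder rather than a real accession, and the fallback to the nucleotide database for a bare id is an assumption. It uses retmode=text, which is the E-utilities parameter for plain-text output.

```shell
#!/bin/sh
# Sketch only: print the efetch URL for an EMBOSS-style "db:id" argument.
# build_efetch_url is a hypothetical helper, not part of EMBOSS or NCBI tooling.
build_efetch_url() {
    arg=$1
    case $arg in
        *:*) db=${arg%%:*}; id=${arg#*:} ;;   # "nucleotide:X" -> db=nucleotide, id=X
        *)   db=nucleotide; id=$arg ;;        # bare id: assume the nucleotide database
    esac
    printf '%s\n' "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=${db}&id=${id}&rettype=gb&retmode=text"
}

# EXAMPLE_ID is a placeholder; substitute a real accession.
build_efetch_url "nucleotide:EXAMPLE_ID"
```

Pasting the printed URL into a browser, or passing it to wget by hand, is a quick way to tell whether a slow or failing fetch comes from EMBOSS or from the network and firewall.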

Hope this helps.

Kind regards,
David.

From: emboss-bounces+david.bauer=bayer.com at mailman.open-bio.org [mailto:emboss-bounces+david.bauer=bayer.com at mailman.open-bio.org] On Behalf Of Citron, Bruce A.
Sent: 04 December 2014 20:07
To: EMBOSS at mailman.open-bio.org
Subject: [EMBOSS] Remote Databases and possibly drcat

What do I need to do, or have our IT guy adjust, to be able to grab sequences from ncbi (genbank, refseq, etc.) into my directory on a LINUX mainframe?  This used to work under GCG via net fetch.  I'm using EMBOSS 6.6.0.0, command line.

I can use seqret to grab EMBL sequences (seqret embl:accession), and that works fine; however, none of the synonyms I can think of for GenBank work.

Should our system have access through drcat?  I do see Genbank, and many other remotes listed there.  I'm not sure how that works.

I'm also not sure why it accesses embl sequences because embl isn't listed by showdb.

showdb produces:
Display information on configured databases
# Name          Type     Comment
# ============= ======== =======
taxon           Taxonomy NCBI taxonomy
drcat           Resource Data Resource Catalogue
chebi           Obo      Chemical Entities of Biological Interest
eco             Obo      Evidence code ontology
edam            Obo      EMBRACE Data and Methods ontology
edam_data       Obo      EMBRACE Data and Methods ontology (data)
edam_format     Obo      EMBRACE Data and Methods ontology (formats)
edam_identifier Obo      EMBRACE Data and Methods ontology (identifiers)
edam_operation  Obo      EMBRACE Data and Methods ontology (operations)
edam_topic      Obo      EMBRACE Data and Methods ontology (topics)
go              Obo      Gene Ontology
go_component    Obo      Gene Ontology (cellular components)
go_function     Obo      Gene Ontology (molecular functions)
go_process      Obo      Gene Ontology (biological processes)
pw              Obo      Pathways ontology
ro              Obo      Relations ontology
so              Obo      Sequence ontology
swo             Obo      Software ontology
