network USA

David Mathog mathog at mendel.bio.caltech.edu
Wed Apr 17 19:05:47 UTC 2002


Today I finally realized that NCBI's PmFetch CGI

  http://www.ncbi.nlm.nih.gov:80/entrez/utils/pmfetch_help.html

can be used to retrieve data via gi using a "simple" URL like this:

wget -O dmwhite.genbank \
'http://www.ncbi.nlm.nih.gov/entrez/utils/pmfetch.fcgi?db=Nucleotide&id=10873&report=gen&mode=text'
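
For anyone scripting this, the same URL can be assembled from its parts.
What follows is only a minimal Python sketch: the function name pmfetch_url
is made up for illustration, the parameter names are simply the ones from
the URL above, and nothing is actually fetched.

```python
from urllib.parse import urlencode

# Base address of the PmFetch CGI, copied from the wget example above.
PMFETCH = "http://www.ncbi.nlm.nih.gov/entrez/utils/pmfetch.fcgi"


def pmfetch_url(gi, db="Nucleotide", report="gen", mode="text"):
    """Build a PmFetch URL for one gi number (hypothetical helper)."""
    # urlencode keeps the dict's insertion order, so the query string
    # comes out in the same order as the hand-written URL.
    query = urlencode({"db": db, "id": gi, "report": report, "mode": mode})
    return PMFETCH + "?" + query


if __name__ == "__main__":
    print(pmfetch_url(10873))
```

The printed URL can then be handed to wget, or to any other HTTP client.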

Unfortunately it does not seem to be able to retrieve by either accession
number or locus name. I'm still waiting to hear whether there is some other
NCBI interface for that.

Which is a long way of coming around to considering how a USA (Uniform
Sequence Address) could be used to retrieve remote sequences without
exposing end users to truly hideous constructs.  The semantics of accessing
arbitrary network databases are probably much too complex to include in the
USA itself, but one can imagine burying these details under new types of
"database" entries in the defaults file.  Something like this:

DB gigenbank [
  method: remoteurlbyid
  comment: "GENBANK at NCBI by gi number"
  format: -
  dir: -
  file: -
  type: N
#optional
  target: 'http://www.ncbi.nlm.nih.gov/entrez/utils/pmfetch.fcgi?db=Nucleotide&id=$ID&report=gen&mode=text'
  filter: 'wget -O - $target'
]

Which would then allow something like this to work transparently:

% seqret gigenbank:10873
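
To make the intent concrete, here is a rough Python sketch of the expansion
such a method might perform. It is an illustration of the proposal only, not
how EMBOSS works: "remoteurlbyid", the "target" and "filter" fields, and the
$ID / $target placeholders are the hypothetical ones from the entry above,
and the function name expand_usa is invented here. The sketch only builds
the command line; it does not run it.

```python
import shlex
from string import Template

# Hypothetical database entries, mirroring the proposed defaults-file block.
DATABASES = {
    "gigenbank": {
        "method": "remoteurlbyid",
        "target": "http://www.ncbi.nlm.nih.gov/entrez/utils/pmfetch.fcgi"
                  "?db=Nucleotide&id=$ID&report=gen&mode=text",
        "filter": "wget -O - $target",
    },
}


def expand_usa(usa):
    """Turn 'dbname:id' into the shell command that would fetch the entry."""
    dbname, entry_id = usa.split(":", 1)
    db = DATABASES[dbname]
    # Step 1: substitute the entry id into the target URL.
    url = Template(db["target"]).substitute(ID=entry_id)
    # Step 2: substitute the shell-quoted URL into the filter command,
    # so the '?' and '&' characters are not interpreted by the shell.
    return Template(db["filter"]).substitute(target=shlex.quote(url))


if __name__ == "__main__":
    print(expand_usa("gigenbank:10873"))
```

The point of the two-step substitution is that the user only ever types the
database name and the id; the URL and the fetch command stay in the
defaults file.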

The USA already has the "program" option, but I think that in a situation
like this it's much too complex to actually use.  How many users are going
to be able to successfully negotiate this:

% seqret -sequence=fasta::"wget -O - 'http://www.ncbi.nlm.nih.gov/entrez/utils/pmfetch.fcgi?db=Nucleotide&id=10873&report=fasta&mode=text' |" -filter

Anyway, what I'm proposing is that the database definition be extended
slightly to allow remote access methods.  This would be particularly helpful
for people running EMBOSS on their own PCs or Macs, who tend not to have
large local databases installed.

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech



