network USA

David Martin dmartin at bioinformatics.msiwtb.dundee.ac.uk
Wed Apr 17 19:32:40 UTC 2002


On Wed, 17 Apr 2002, David Mathog wrote:

> Today I finally realized that NCBI's PmFetch CGI
>
>   http://www.ncbi.nlm.nih.gov:80/entrez/utils/pmfetch_help.html
>
> can be used to retrieve data via gi using a "simple" URL like this:
>
> wget -O dmwhite.genbank \
> 'http://www.ncbi.nlm.nih.gov/entrez/utils/pmfetch.fcgi?db=Nucleotide&id=10873&report=gen&mode=text'
>
> Unfortunately it seems not to be able to retrieve by either accession
> number or locus name - I'm still waiting to hear if there is some other
> NCBI interface for that.
>
> Which is a long way of coming around to considering how a USA could be
> used to retrieve remote sequences without exposing end users to truly
> hideous constructs.  The semantics of accessing arbitrary network
> databases are probably much too complex to include in the USA but one
> can imagine burying these details under new types of "database" entries
> in the defaults file. Something like this:

Try 'method: url' and use %s instead of $ID. It has been there since
EMBOSS 0.0.4 to allow retrieval from remote SRS servers (or indeed any
arbitrary web address where the id can be passed in the url).
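
As a rough, untested sketch, the gi case might then be written along
these lines. The "url:" attribute name and the "genbank" format value
are assumptions to check against the admin guide, as is the idea that
pmfetch's report=gen output parses cleanly as GenBank format:

```
# Hypothetical emboss.default entry (untested sketch).
# Assumed: the attribute is named "url:" and "genbank" is the right
# format name for what pmfetch returns with report=gen.
DB gigenbank [
  type: N
  method: url
  format: genbank
  url: "http://www.ncbi.nlm.nih.gov/entrez/utils/pmfetch.fcgi?db=Nucleotide&id=%s&report=gen&mode=text"
  comment: "GENBANK at NCBI by gi number"
]
```

If that holds up, 'seqret gigenbank:10873' should fetch the entry
without the user ever seeing the URL.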

Around page 19-20 in the admin guide.

If it doesn't work then let the guilty parties know.

..d

>
> DB gigenbank [
>   method: remoteurlbyid
>   comment: "GENBANK at NCBI by gi number"
>   format: -
>   dir: -
>   file: -
>   type: N
> #optional
>   target: 'http://www.ncbi.nlm.nih.gov/entrez/utils/pmfetch.fcgi?db=Nucleotide&id=$ID&report=gen&mode=text'
>   filter: 'wget -O - $target'
> ]
>
> Which would then allow something like this to work transparently:
>
> % seqret gigenbank:10873
>
> The USA already has the "program" option but I think in a situation like
> this it's much too complex to actually use.  How many users are going to
> be able to successfully negotiate this:
>
> % seqret -sequence=fasta::"wget -O - 'http://www.ncbi.nlm.nih.gov/entrez/utils/pmfetch.fcgi?db=Nucleotide&id=10873&report=fasta&mode=text' |" -filter
>
> Anyway, what I'm proposing is that the database definition be extended
> slightly to allow remote access methods.  This would be particularly
> helpful for people running EMBOSS on their own PCs or Macs, who tend not
> to have large local databases installed.
>
> Regards,
>
> David Mathog
> mathog at caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech
>

----------------------------------
David Martin PhD
Bioinformatics Scientific Officer
Wellcome Trust Biocentre, Dundee
----------------------------------



