[EMBOSS] db records retrieve

Peter Rice pmr at ebi.ac.uk
Fri Apr 8 11:04:52 UTC 2011


On 08/04/2011 11:27, Alessandro Bruselles wrote:
> Hi all,
> I'm not able to retrieve sequences starting from official gene symbols (e.g.
> BCL2L10),
> what db do I have to use?

You need a database you can search by gene name. If the database is 
accessible through a website, you can use the URL access method: provide 
a URL that returns the sequence in a format that EMBOSS can read.

For UniProt:

DB uniprotgene [
   type: "protein"
   format: "swiss"
   methodquery: "url"
   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz?-e+[uniprot-gen:%s]+-ascii"
]

(the URL should be all in one line)
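
As a rough illustration of what the URL access method does with that 
definition, here is a minimal Python sketch. It assumes that the %s in the 
url value is simply replaced by the identifier given in the query (for 
example the BCL2L10 part of a query such as uniprotgene:BCL2L10). The 
function name build_query_url is hypothetical and is not part of EMBOSS.

```python
# URL template copied from the uniprotgene definition above.
URL_TEMPLATE = (
    "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz?-e+[uniprot-gen:%s]+-ascii"
)


def build_query_url(gene_symbol: str, template: str = URL_TEMPLATE) -> str:
    """Return the template with its %s placeholder replaced by gene_symbol.

    Assumption: the placeholder is substituted verbatim, with no extra
    escaping. str.replace is used instead of % formatting so that any
    other percent signs in a template are left untouched.
    """
    return template.replace("%s", gene_symbol, 1)


if __name__ == "__main__":
    print(build_query_url("BCL2L10"))
```

Opening the resulting URL in a browser is a quick way to check that the 
server returns a Swiss-Prot format entry before adding the definition to 
your EMBOSS configuration.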

We will have other ways to do this in EMBOSS 6.4.0 when we enable gen 
and other database search fields.

> This is the output of whichdb:
>
> $ whichdb BCL2L10 -debug
> Search all sequence databases for an entry and retrieve it
> Output file [outfile.whichdb]: stdout
> Warning: Cannot open division file '<null>' for database 'tsw'
> Warning: seqCdQry failed
> Warning: Cannot open division file '<null>' for database 'emblnew'
> Warning: seqCdQry failed
> Warning: Cannot open division file '<null>' for database 'trembl'
> Warning: seqCdQry failed
> Warning: Cannot open division file '<null>' for database 'nbrf'
>
>     EMBOSS An error in ajseqdb.c at line 5350:
> seqCdQryOpen failed

You need to fix or comment out the tsw, emblnew, trembl and nbrf 
database definitions to avoid those messages. whichdb tries every 
sequence database, so it is a good way to catch databases that are not 
working (servers down, or data or index files not in the expected 
directory).
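
If the list of warnings is long, the database names can be pulled out of 
saved whichdb output with a short script. This is only a sketch: it matches 
the "Cannot open division file" warning exactly as it appears in the output 
quoted above, and other kinds of failure may be reported with different 
messages. The function name failing_databases is hypothetical.

```python
import re
from typing import List

# Matches the warning text shown in the whichdb output quoted above and
# captures the database name between the final pair of single quotes.
WARNING_RE = re.compile(
    r"Cannot open division file '[^']*' for database '([^']+)'"
)


def failing_databases(whichdb_output: str) -> List[str]:
    """Return database names from the warnings, in order, without repeats."""
    names: List[str] = []
    for name in WARNING_RE.findall(whichdb_output):
        if name not in names:
            names.append(name)
    return names


if __name__ == "__main__":
    sample = """\
Warning: Cannot open division file '<null>' for database 'tsw'
Warning: seqCdQry failed
Warning: Cannot open division file '<null>' for database 'emblnew'
Warning: seqCdQry failed
Warning: Cannot open division file '<null>' for database 'trembl'
Warning: seqCdQry failed
Warning: Cannot open division file '<null>' for database 'nbrf'
"""
    print(failing_databases(sample))
```

Each name it reports corresponds to a DB definition that needs to be fixed 
or commented out.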

Hope this helps

Peter


