index fasta DB file..

Peter Rice pmr at ebi.ac.uk
Thu Mar 20 16:15:04 UTC 2003


Vasudevan, Geetha wrote:
> Is it possible to index, using dbifasta, a fasta DB file whose header is like this,
> 
> (>DBID 00001, species followed by description) ?
> 
> And, is it possible to "retrieve" a sequence from this file, given a "DBID number"? 

The syntax must match something dbifasta understands. See the dbifasta 
documentation for more information.

DBID should be some 'standard' fasta identifier. EMBOSS is happy with 
any of the identifier formats in test/data/testids.fasta or 
test/data/testids.ncbi.

For example:

 >dbname:id

or

 >id

In both cases, filename:id will extract that ID
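
As a rough illustration of the two header forms above, this Python sketch 
pulls the ID out of a header line. The function name fasta_id is made up 
for the example, and the parsing is a simplification, not what dbifasta 
does internally.

```python
def fasta_id(header):
    """Return the ID from a '>dbname:id' or '>id' header line (hypothetical helper)."""
    if not header.startswith(">"):
        raise ValueError("not a fasta header line")
    fields = header[1:].split()
    if not fields:
        raise ValueError("header has no identifier")
    token = fields[0]
    # Drop an optional 'dbname:' prefix; a plain '>id' has no colon.
    return token.split(":", 1)[-1]
```

Both fasta_id(">dbname:id") and fasta_id(">id") give "id" under these 
assumptions.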

You can also read the accession number, if it appears as the next text 
on the line:

 >DBID A00001 species followed by description

then you can use filename:a00001

... but this only works if (a) the accession number is a valid 
EMBL/SwissProt accession number and (b) it has white space either side. 
This format of fasta file is (or was) used by ACEDB at the Sanger Centre.
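
Conditions (a) and (b) can be sketched in Python as below. The pattern 
used (one letter followed by five digits) is only an assumption that 
covers the A00001 example; real EMBL/SwissProt accession numbers come in 
more forms, and the check EMBOSS itself applies may differ. The name 
second_field_accession is hypothetical.

```python
import re

# Assumed, simplified accession pattern: one letter then five digits (e.g. A00001).
ACC_PATTERN = re.compile(r"[A-Za-z][0-9]{5}")

def second_field_accession(header):
    """Return the accession from '>ID ACC description', or None if absent."""
    fields = header.lstrip(">").split()
    # Condition (b): splitting on white space means the accession
    # must be a field of its own, right after the ID.
    if len(fields) < 2:
        return None
    candidate = fields[1]
    # Condition (a): the field must look like an accession number.
    if ACC_PATTERN.fullmatch(candidate):
        return candidate
    return None
```

Under this simplified pattern, the header from the original question 
(">DBID 00001, species ...") yields None, because "00001," does not start 
with a letter and carries a trailing comma.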

Hope this helps,

Peter Rice



