index fasta DB file..
Peter Rice
pmr at ebi.ac.uk
Thu Mar 20 16:15:04 UTC 2003
Vasudevan, Geetha wrote:
> Is it possible to index, using dbifasta, a fasta DB file whose header looks like this,
>
> (>DBID 00001, species followed by description) ?
>
> And, is it possible to "retrieve" a sequence from this file, given a "DBID number"?
The syntax must match something dbifasta understands. See the dbifasta
documentation for more information.
DBID should be some 'standard' fasta identifier. EMBOSS is happy with
any of the ID styles in test/data/testids.fasta or test/data/testids.ncbi.
For example:
>dbname:id
or
>id
In both cases, filename:id will extract the sequence with that ID.
You can also read the accession number, if it appears as the next text
on the line:
>DBID A00001 species followed by description
In that case you can use filename:a00001, but this only works if
(a) the accession number is a valid EMBL/SwissProt accession number and
(b) it has white space on either side.
This format of fasta file is (or was) used by ACEDB at the Sanger Centre.
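As a rough illustration of the rules above (not dbifasta's actual code), here
is a minimal Python sketch that pulls the ID and optional accession number out
of a header line. The function name parse_header and the accession pattern are
assumptions made for this example; the pattern is a simplified approximation of
EMBL/SwissProt accession formats, and EMBOSS's own check may differ.

```python
import re

# Assumed, simplified accession pattern (not taken from EMBOSS):
#   1 letter + 5 digits, 2 letters + 6 digits, or SwissProt-style O/P/Q form.
ACCESSION_RE = re.compile(r"[A-Z]\d{5}|[A-Z]{2}\d{6}|[OPQ]\d[A-Z0-9]{3}\d")


def parse_header(line):
    """Return (dbname, seq_id, accession) for a fasta header line.

    Hypothetical helper: dbname and accession are None when absent.
    """
    if not line.startswith(">"):
        raise ValueError("not a fasta header line")
    tokens = line[1:].split()  # split on white space only
    if not tokens:
        raise ValueError("empty fasta header")

    # First token is either "id" or "dbname:id".
    dbname, _, seq_id = tokens[0].rpartition(":")

    # The accession is only accepted as the next white-space-delimited
    # token, and only if the whole token looks like an accession.
    accession = None
    if len(tokens) > 1 and ACCESSION_RE.fullmatch(tokens[1].upper()):
        accession = tokens[1].upper()

    return (dbname or None, seq_id, accession)


if __name__ == "__main__":
    print(parse_header(">DBID A00001 species followed by description"))
    print(parse_header(">DBID 00001, species followed by description"))
```

Under these assumptions, the first header yields the accession A00001, while
the header from the question yields none: its second token "00001," has no
letter prefix and carries a trailing comma rather than white space.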
Hope this helps,
Peter Rice