[EMBOSS] sequence retrieval

Peter Rice pmr at ebi.ac.uk
Tue Jun 10 16:11:50 UTC 2008


Jay wrote:
> I have a large file with sequences in fasta format. They have IDs.
> 
> Is there any EMBOSS way to retrieve sequences by inputting a text file with
> a short list of IDs?

With EMBOSS you can refer to an individual sequence in a file as:

filename:id

You can also put a list of these, one per line, into a file, and use that
file in place of a sequence with
@listfilename
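
The list file described above can be generated from a plain file of IDs. The following is only a minimal sketch, not part of EMBOSS: the function name write_list_file and the file names in the usage note are made up for illustration, and it assumes one ID per line in the input file.

```python
from pathlib import Path


def write_list_file(fasta_path, ids_path, list_path):
    """Write one 'filename:id' reference per line to list_path.

    Hypothetical helper: assumes ids_path holds one sequence ID per line.
    Blank lines are skipped. Returns the references that were written.
    """
    ids = [line.strip() for line in Path(ids_path).read_text().splitlines()]
    refs = [f"{fasta_path}:{seq_id}" for seq_id in ids if seq_id]
    Path(list_path).write_text("".join(ref + "\n" for ref in refs))
    return refs
```

After calling something like write_list_file("big.fasta", "ids.txt", "wanted.list"), the result could be passed to an EMBOSS program, for example: seqret @wanted.list -outseq wanted.fasta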

But this can be slow, because it will read through the file again for
each ID. Alternatively, you can index the file with dbxfasta (or dbifasta)
as a private database, then define a database in your .embossrc file and
use the dbname:id syntax. Again you can use a list file, but retrieval
will be much faster.
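
To illustrate why an index helps, here is a rough sketch of the general idea: scan the file once, remember where each record starts, then seek directly to the records you want. This is not how dbxfasta or dbifasta store their indexes. The function names are hypothetical, and the sketch assumes the ID is the first whitespace-delimited word after ">" on the header line.

```python
def build_offset_index(fasta_path):
    """Scan the file once and map each sequence ID to its header's byte offset."""
    index = {}
    with open(fasta_path, "rb") as handle:
        while True:
            offset = handle.tell()
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                fields = line[1:].split()
                if fields:
                    # Assumption: the ID is the first word of the header line.
                    index[fields[0].decode()] = offset
    return index


def fetch_record(fasta_path, index, seq_id):
    """Seek straight to one record and return its header and sequence lines."""
    with open(fasta_path, "rb") as handle:
        handle.seek(index[seq_id])
        lines = [handle.readline()]
        for line in handle:
            if line.startswith(b">"):
                break  # reached the next record
            lines.append(line)
    return b"".join(lines).decode()
```

With the index built once, each lookup costs a seek plus reading one record, instead of a pass through the whole file per ID.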

Hope this helps. If you need more help setting up please ask again!

regards,

Peter


