[EMBOSS] Using seqret to fetch from .nal index databases

Peter Rice pmr at ebi.ac.uk
Mon Aug 1 15:49:26 UTC 2005

Audra Johnson wrote:

> Apologies for the length, but I want to be thorough.  I'm doing blast
> searches and then trying to fetch the sequences from our genembl
> database using seqret.  For example:
> blastall -p tblastn /gcgdata_10.3/gcgblast/genembl -i dp00061_disordered_115_168.fasta

> I've tried running seqret with just the database name I'm giving
> blastall, and also naming the genembl.nal file specifically:
> $ seqret
> Reads and writes (returns) sequences
> Input sequence(s): /gcgdata_10.3/gcgblast/genembl.nal:HUMRPA70KD
> Error: Unable to read sequence '/gcgdata_10.3/gcgblast/genembl.nal:HUMRPA70KD'

EMBOSS cannot read blast database files directly.

EMBOSS can read the old format index files produced by formatdb, but not the 
latest ones - we have not yet decoded the index format.

If you still have the FASTA format file /gcgdata_10.3/gcgblast/genembl then 
you can index this file with dbifasta (or dbxfasta, which uses the new beta 
version indexing) and define it as a database for seqret. Alternatively, you 
can use the file directly in seqret the way you are trying to use the .nal 
file, but this will be slow because seqret has to read the file until it 
finds the entry.
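
To illustrate why the unindexed route is slow, here is a small Python sketch. 
It is not EMBOSS code and does not reproduce the dbifasta index format. The 
function names and the toy sequence data are made up for illustration, and the 
ID is assumed to be the first word after ">" on the header line. It only 
contrasts reading a FASTA file from the top with seeking to a byte offset 
recorded in advance.

```python
import io


def header_id(line):
    """Return the first word after '>' on a FASTA header line, or None."""
    words = line[1:].split()
    return words[0] if words else None


def fetch_by_scan(handle, entry_id):
    """No index: read from the start of the file until the entry turns up."""
    handle.seek(0)
    record = []
    for line in iter(handle.readline, b""):
        if line.startswith(b">"):
            if record:
                break  # next header reached, the wanted entry is complete
            if header_id(line) != entry_id:
                continue  # some other entry, keep reading
            record.append(line)
        elif record:
            record.append(line)
    return b"".join(record) if record else None


def build_index(handle):
    """One pass over the file, recording the byte offset of each header."""
    handle.seek(0)
    offsets = {}
    while True:
        pos = handle.tell()
        line = handle.readline()
        if not line:
            break
        if line.startswith(b">"):
            ident = header_id(line)
            if ident is not None:
                offsets.setdefault(ident, pos)  # keep the first occurrence
    return offsets


def fetch_by_index(handle, offsets, entry_id):
    """With an index: seek straight to the entry and read only that entry."""
    if entry_id not in offsets:
        return None
    handle.seek(offsets[entry_id])
    record = [handle.readline()]
    for line in iter(handle.readline, b""):
        if line.startswith(b">"):
            break
        record.append(line)
    return b"".join(record)


if __name__ == "__main__":
    # Toy data standing in for a large FASTA file (sequences are invented).
    demo = io.BytesIO(b">AAA first\nACGT\n>HUMRPA70KD demo\nGGCC\nTTAA\n>CCC\nAC\n")
    index = build_index(demo)
    same = fetch_by_scan(demo, b"HUMRPA70KD") == fetch_by_index(demo, index, b"HUMRPA70KD")
    print("both lookups agree:", same)
```

The scan has to pass over every entry that precedes the one requested on each 
lookup, while the indexed lookup pays for a single pass up front and then 
seeks directly. That is the general trade-off behind indexing the file first.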

If you have the same database in GCG format you can index it with dbigcg (or 
dbxgcg). If you use dbigcg or dbxgcg you may be able to use the GB_PR "gcg" 
database names in EMBOSS, by reusing the dbigcg indices with different files 
selected or excluded.

Let me know if you would like more hints on how to do any of these.


Peter Rice
