trouble indexing fasta database

Gene Cutler cutler at tularik.com
Wed Oct 17 21:01:26 UTC 2001


I am trying to index a fasta database and then retrieve sequences with seqret.
I used dbifasta on half of the human genome golden path from ensembl:

dbifasta -idformat simple -dbname ensembl_gp1 -directory . -filenames 'ensembl_golden_path.1' -indexdirectory /scratch1/databases/emboss/

and put this in my emboss.default:

DB ensembl_gp1 [
         type: N
         method: emblcd
         format: fasta
         dir: /scratch1/databases/raw/
         file: ensembl_golden_path.1
         indexdir: /scratch1/databases/emboss/
]


If I try to fetch a sequence located near the beginning of the fasta 
file, it works fine.  But sequences further into the file are not 
retrieved:

>  seqret ensembl_gp1:AC007798.1.15164.16328
Reads and writes (returns) sequences
    An error has been found: EMBLCD Entry failed
    An error has been found: Database 'ensembl_gp1' : access method 'emblcd' failed
    An error has been found: option -sequence: Unable to read sequence 'ensembl_gp1:AC007798.1.15164.16328'
    There is a serious problem: seqret terminated: Bad value for option and no prompt

However, when I tried retrieving all the sequences:
>  seqret ensembl_gp1:*
I got them all back.

I can also retrieve the sequences if I read them directly from the 
fasta file rather than going through the index.  That suggests the 
problem is in the index, but recreating the index files hasn't helped.
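
To narrow down where lookups start failing, it may help to know how far 
into the fasta file each entry begins.  Below is a minimal Python sketch 
for that.  It is not part of EMBOSS; the function names fasta_offsets and 
find_entry are made up for illustration, and it assumes the ID is the 
first whitespace-delimited token after '>' on each header line.

```python
import sys


def fasta_offsets(path):
    """Yield (entry_id, byte_offset) for every '>' header line in a FASTA file.

    Assumption: the ID is the first whitespace-delimited token after '>'.
    """
    offset = 0
    # Read in binary mode so offsets are true byte positions in the file.
    with open(path, "rb") as handle:
        for line in handle:
            if line.startswith(b">"):
                fields = line[1:].split()
                entry_id = fields[0].decode("ascii", "replace") if fields else ""
                yield entry_id, offset
            offset += len(line)


def find_entry(path, wanted):
    """Return the byte offset of the header for `wanted`, or None if absent."""
    for entry_id, offset in fasta_offsets(path):
        if entry_id == wanted:
            return offset
    return None


# Hypothetical usage: python fasta_offsets.py ensembl_golden_path.1 AC007798.1.15164.16328
if __name__ == "__main__" and len(sys.argv) == 3:
    position = find_entry(sys.argv[1], sys.argv[2])
    if position is None:
        print("entry not found in file")
    else:
        print("header starts at byte", position)
```

Comparing the offsets of entries that seqret can fetch with those it 
cannot should show whether the failures begin at a particular position 
in the file.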





