trouble indexing fasta database
Gene Cutler
cutler at tularik.com
Wed Oct 17 21:01:26 UTC 2001
I am trying to index a fasta database and then retrieve sequences with seqret.
I used dbifasta on half of the human genome golden path from Ensembl:
dbifasta -idformat simple -dbname ensembl_gp1 -directory . \
  -filenames 'ensembl_golden_path.1' -indexdirectory /scratch1/databases/emboss/
and put this in my emboss.default:
DB ensembl_gp1 [
type: N
method: emblcd
format: fasta
dir: /scratch1/databases/raw/
file: ensembl_golden_path.1
indexdir: /scratch1/databases/emboss/
]
If I try to fetch a sequence located in the beginning of the fasta
file, it works fine. But sequences further into the file aren't
retrieved:
> seqret ensembl_gp1:AC007798.1.15164.16328
Reads and writes (returns) sequences
An error has been found: EMBLCD Entry failed
An error has been found: Database 'ensembl_gp1' : access method 'emblcd' failed
An error has been found: option -sequence: Unable to read sequence 'ensembl_gp1:AC007798.1.15164.16328'
There is a serious problem: seqret terminated: Bad value for option and no prompt
However, when I tried retrieving all the sequences:
> seqret ensembl_gp1:*
I got them all back.
I can also retrieve the sequences if I read them directly from the
fasta file rather than going through the index. That suggests the
problem lies in the index, but recreating the indexes hasn't helped.
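
To narrow down how far into the file retrieval starts to fail, a minimal
sketch like the one below could report the byte offset at which a given
entry's header begins. It is only an illustration, not part of EMBOSS: the
function names and the script name fasta_offsets.py are hypothetical, and it
assumes the entry ID is the first whitespace-delimited token after '>'.
Comparing the offsets of entries that work against those that fail might
show whether there is a cutoff point, though nothing in this message
confirms that offset is the cause.

```python
import io
import sys


def fasta_offsets(handle):
    """Yield (entry_id, byte_offset) for every '>' header in a binary FASTA stream."""
    offset = 0
    for line in handle:
        if line.startswith(b">"):
            fields = line[1:].split()
            # Assumption: the ID is the first whitespace-delimited token of the header.
            entry_id = fields[0].decode("ascii", "replace") if fields else ""
            yield entry_id, offset
        offset += len(line)


def find_offset(handle, wanted):
    """Return the byte offset of the header for `wanted`, or None if absent."""
    for entry_id, offset in fasta_offsets(handle):
        if entry_id == wanted:
            return offset
    return None


if __name__ == "__main__":
    if len(sys.argv) == 3:
        # Hypothetical usage:
        #   python fasta_offsets.py ensembl_golden_path.1 AC007798.1.15164.16328
        with open(sys.argv[1], "rb") as fh:
            print(find_offset(fh, sys.argv[2]))
    else:
        # Tiny in-memory sample so the sketch runs without any real data.
        sample = io.BytesIO(b">first demo\nACGT\n>second\nGG\n")
        for entry_id, offset in fasta_offsets(sample):
            print(entry_id, offset)
```

The file is opened in binary mode so that the reported offsets are true byte
positions, which is what an on-disk index would have to store.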