[EMBOSS] index RefSeq with dbxflat

Olivier Friard olivier.friard at unito.it
Wed Apr 26 10:29:51 UTC 2006


Hello,

Thank you for your kind help with indexing RefSeq.


I am trying to index the RefSeq DNA database using the dbxflat program 
with the following arguments:

dbxflat
Database b+tree indexing for flat file databases
Basename for index files: rs_dna
Resource name: rs_dna
       EMBL : EMBL
      SWISS : Swiss-Prot, SpTrEMBL, TrEMBLnew
         GB : Genbank, DDBJ
     REFSEQ : Refseq
Entry format [SWISS]: REFSEQ
Wildcard database filename [*.dat]: *.gbff
Database directory [.]: /home/users/friard/data/refseq_genomic
         id : ID
        acc : Accession number
         sv : Sequence Version and GI
        des : Description
        key : Keywords
        org : Taxonomy
Index fields [id,acc]:

I included these records in my .embossrc file:

DB rs_dna [
     type: "N"
     method: "emboss"
     dbalias: "rs_dna"
     format: "genbank"
     directory: "/home/users/friard/data/refseq_genomic/"
     file: "*.gbff"
     comment: "RefSeq DNA (dbxflat)"
]

RES rs_dna [
    type: Index
    idlen:  15
    acclen: 15
    svlen:  15
    keylen: 15
    deslen: 15
    orglen: 15
]

However, when I try to retrieve a single sequence by its accession number 
(seqret rs_dna:NC_001191), the program fails with this error message:

seqret rs_dna:NC_001191
Reads and writes (returns) sequences
Error: Unable to read sequence 'rs_dna:NC_001191'
Died: seqret terminated: Bad value for '-sequence' and no prompt

In contrast, retrieving all sequences with "seqret rs_dna:* -out 
fasta::refseq.fasta" works well.

I also tried dbxfasta with the *.fna files (changing the format value in 
the .embossrc file to "fasta"), but I got the same error.

Any idea what the problem might be?

Thank you in advance

Olivier Friard





