[EMBOSS] index RefSeq with dbxflat
Olivier Friard
olivier.friard at unito.it
Wed Apr 26 10:29:51 UTC 2006

Hello,
Thank you for your kind help with indexing RefSeq.
I am trying to index the RefSeq DNA database using the dbxflat program
with the following arguments:
dbxflat
Database b+tree indexing for flat file databases
Basename for index files: rs_dna
Resource name: rs_dna
       EMBL : EMBL
      SWISS : Swiss-Prot, SpTrEMBL, TrEMBLnew
         GB : Genbank, DDBJ
     REFSEQ : Refseq
Entry format [SWISS]: REFSEQ
Wildcard database filename [*.dat]: *.gbff
Database directory [.]: /home/users/friard/data/refseq_genomic
         id : ID
        acc : Accession number
         sv : Sequence Version and GI
        des : Description
        key : Keywords
        org : Taxonomy
Index fields [id,acc]:
I included these records in my .embossrc file:
DB rs_dna [
     type: "N"
     method: "emboss"
     dbalias: "rs_dna"
     format: "genbank"
     directory: "/home/users/friard/data/refseq_genomic/"
     file: "*.gbff"
     comment: "RefSeq DNA (dbxflat)"
]
RES rs_dna [
    type: Index
    idlen:  15
    acclen: 15
    svlen:  15
    keylen: 15
    deslen: 15
    orglen: 15
]
But when I try to retrieve a single sequence by its accession number
(seqret rs_dna:NC_001191), the program fails with this error message:
seqret rs_dna:NC_001191
Reads and writes (returns) sequences
Error: Unable to read sequence 'rs_dna:NC_001191'
Died: seqret terminated: Bad value for '-sequence' and no prompt
However, when I retrieve all sequences with "seqret rs_dna:* -out
fasta::refseq.fasta", everything works well.
I also tried dbxfasta with the *.fna files (changing the format in the
.embossrc file to "fasta"), but I got the same error.
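One check that may help narrow this down is to confirm that the accession is really present in the flat files, independently of the index. The following is only a sketch: find_accession is a hypothetical helper (not part of EMBOSS), and it assumes the *.gbff files use the standard GenBank "ACCESSION" line.

```shell
# Hypothetical helper, not an EMBOSS tool: print the names of the
# .gbff files in a directory whose ACCESSION line contains the given
# accession as a whole word.
find_accession() {
    acc="$1"   # accession to look for, e.g. NC_001191
    dir="$2"   # directory holding the *.gbff flat files
    grep -l -E "^ACCESSION[[:space:]]+${acc}([[:space:]]|\$)" "$dir"/*.gbff
}
```

For example, "find_accession NC_001191 /home/users/friard/data/refseq_genomic" would list the files that contain that entry. If it prints nothing, the entry is missing from the data itself rather than from the index.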
Any idea about the problem?
Thank you in advance
Olivier Friard
    
    