[EMBOSS] index RefSeq for EMBOSS
Olivier Friard
olivier.friard at unito.it
Fri Apr 21 15:00:20 UTC 2006
Hi,
I tried to index the RefSeq database:
1) I downloaded all
ftp://ftp.ncbi.nih.gov/refseq/release/complete/complete*.genomic.gbff.gz
files (GB format)
2) gunzipped them
3) Added the rs_dna entry to my .embossrc file
DB rs_dna [
type: "N"
method: "emblcd"
format: "GB"
dir: "/home/users/friard/data/refseq_genomic/"
file: "*.gbff"
release: ""
comment: "RefSeq Genomic (upd)"
indexdir: "/home/users/friard/data/refseq_genomic/"
]
4) used dbiflat with the following arguments (from the directory where the
files are stored)
dbiflat
Index a flat file database
Database name: rs_dna
EMBL : EMBL
SWISS : Swiss-Prot, SpTrEMBL, TrEMBLnew
GB : Genbank, DDBJ
REFSEQ : Refseq
Entry format [SWISS]: REFSEQ
Database directory [.]:
Wildcard database filename [*.dat]: *.gbff
Release number [0.0]:
Index date [00/00/00]:
The indexes were created, but when I try to retrieve a sequence (e.g.
seqret rs_rna:NC_000004), the result is not the correct sequence but a
different one carrying the NC_000004 ID!
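One way to check whether more than one entry in the downloaded files carries the same ID is to scan the LOCUS and ACCESSION lines directly, independently of the EMBOSS index. The following is only a minimal Python sketch, assuming the usual GenBank flat-file layout (a LOCUS line followed by an ACCESSION line in each entry). The function name find_entries is hypothetical and not part of EMBOSS.

```python
import glob


def find_entries(paths, wanted_id):
    """Return (path, byte_offset, locus, accessions) for each matching entry.

    byte_offset is the position of the entry's LOCUS line in the file.
    Only the first ACCESSION line is read; continuation lines are ignored.
    """
    wanted = wanted_id.upper()
    hits = []
    for path in paths:
        with open(path, "rb") as handle:
            offset = 0          # byte offset of the line being read
            entry_start = None  # byte offset of the current LOCUS line
            locus = ""
            for raw in handle:
                line = raw.decode("ascii", errors="replace")
                if line.startswith("LOCUS"):
                    entry_start = offset
                    fields = line.split()
                    locus = fields[1] if len(fields) > 1 else ""
                elif line.startswith("ACCESSION") and entry_start is not None:
                    accessions = line.split()[1:]
                    ids = {locus.upper()} | {a.upper() for a in accessions}
                    if wanted in ids:
                        hits.append((path, entry_start, locus, accessions))
                offset += len(raw)
    return hits


if __name__ == "__main__":
    # Run from the directory holding the uncompressed *.gbff files.
    files = sorted(glob.glob("*.gbff"))
    for path, start, locus, accessions in find_entries(files, "NC_000004"):
        print(path, start, locus, " ".join(accessions))
```

If this prints more than one line for NC_000004, the same ID occurs in several entries, which could explain why a lookup by that ID returns an unexpected sequence.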
I also downloaded the files in FASTA format and tried to index them with
the dbifasta command (format: ncbi), without success:
seqret rs_dna:nc_000004
Reads and writes (returns) sequences
Error: Unable to read sequence 'rs_dna:nc_000004'
Died: seqret terminated: Bad value for '-sequence' and no prompt
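When a lookup fails like this, it can help to see which identifiers the FASTA header lines actually contain before indexing. The sketch below is only an illustration in Python. It assumes NCBI-style pipe-delimited headers of the form gi|<number>|ref|<accession.version>|, which may not match the downloaded files exactly. The function name header_ids is hypothetical and not part of EMBOSS.

```python
def header_ids(fasta_lines):
    """Yield (first_word, ref_accession) for each FASTA header line.

    ref_accession is None when the header has no 'ref' field.
    """
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        words = line[1:].split()
        first_word = words[0] if words else ""
        tokens = first_word.split("|")
        accession = None
        # Assumed layout: gi|<number>|ref|<accession.version>|
        if "ref" in tokens:
            position = tokens.index("ref")
            if position + 1 < len(tokens):
                accession = tokens[position + 1]
        yield first_word, accession
```

It can be applied to an open file, for example `for word, acc in header_ids(open(name)): print(word, acc)`, where name is one of the downloaded FASTA files. Comparing the printed accessions with the ID given to seqret shows whether the two have the same form, for example with or without a version suffix.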
Has anyone indexed RefSeq successfully?
Thank you in advance
--
Olivier Friard
Laboratorio di Biologia Computazionale
Facoltà di Scienze MFN
Università di Torino
via Accademia Albertina 13, 10124 TORINO (Italy)
tel. +39 011 6704689