problem with dbiflat (?)
axel klenk
axel.klenk at morphochem.ch
Wed Jan 29 16:34:04 UTC 2003
Hi all,
I have a problem with dbiflat (I suppose) and the index created for
SWISS-PROT 40.41 yesterday. Some sequence IDs (9,401 to be
precise) cannot be retrieved by any EMBOSS program as single
sequences, but they are found when searching with wildcards (see
examples below). This happens only with SWISS-PROT; there are
no problems with TrEMBL, EMBL, or any of our other databases.
It happens with dbiflat indexes from both EMBOSS 2.4.1 and 2.6.0.
The package has been built using gcc 2.95.3 on Solaris 8.
Is this a known problem and are there any solutions for it? I have
attached some funny examples and a debug file that might help.
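For anyone who wants to reproduce the count of unretrievable IDs, here is a minimal sketch of one way to do it. It reads the entry names from the ID lines of the flat file and tries each one as a single-entry query. The function names are hypothetical, and it assumes that infoseq returns a non-zero exit status when it dies with "Unable to read sequence"; that should be checked on the local installation before trusting the result.

```python
import subprocess


def entry_names(flatfile_path):
    """Yield the entry name from every ID line of a Swiss-Prot flat file."""
    with open(flatfile_path, encoding="ascii", errors="replace") as handle:
        for line in handle:
            # Entries start with a line such as
            # "ID   DYR_ECOLI      STANDARD;      PRT;   159 AA."
            if line.startswith("ID "):
                yield line.split()[1]


def infoseq_finds(usa):
    """Return True if infoseq can read the given USA (e.g. 'sw:dyr_ecoli').

    Assumption: infoseq exits with a non-zero status when the entry
    cannot be read. Verify this locally.
    """
    result = subprocess.run(
        ["infoseq", usa],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def failing_ids(names, db="sw", lookup=infoseq_finds):
    """Return the entry names that cannot be retrieved as single sequences."""
    return [name for name in names if not lookup(f"{db}:{name.lower()}")]


# Example call (paths taken from the dbiflat session in the details):
# bad = failing_ids(entry_names("/data/bioinfo/db/swissprot/latest/sprot.dat"))
# print(len(bad))
```

The `lookup` argument is there so that the retrieval step can be swapped for another EMBOSS program, or for a stub when trying the script without a database.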
Thanks in advance,
- axel klenk
----------------------------------------
axel klenk
morphochem AG
wro-1055
schwarzwaldallee 215
4058 basel
tel. ++41-61-6952104
fax ++41-61-6952122
axel.klenk at morphochem.ch
http://www.morphochem.ch
Details: dbiflat builds the index without any complaint:
mbsun01:/data/bioinfo/emboss/swissprot/40.41> dbiflat
Index a flat file database
EMBL : EMBL
SWISS : Swiss-Prot, SpTrEMBL, TrEMBLnew
GB : Genbank, DDBJ
REFSEQ : Refseq
Entry format [SWISS]:
Database directory [.]: /data/bioinfo/db/swissprot/latest
Wildcard database filename [*.dat]: sprot.dat
Database name: sw
Release number [0.0]: 40.41
Index date [00/00/00]: 01/29/03
mbsun01:/data/bioinfo/emboss/swissprot/40.41> ll
total 10018
-rw-r--r-- 1 bioinfo bioinfo 591864 Jan 29 16:36 acnum.hit
-rw-r--r-- 1 bioinfo bioinfo 2068590 Jan 29 16:36 acnum.trg
-rw-r--r-- 1 bioinfo bioinfo 346 Jan 29 16:36 division.lkp
-rw-r--r-- 1 bioinfo bioinfo 2430600 Jan 29 16:36 entrynam.idx
It finds dyr*_ecoli and dyrf_ecoli, but not dyr_ecoli, dyra_ecoli, or
dyra*, and only some of the dyrb* entries:
mbsun01:/export/home/aklenk/tmp> infoseq sw:dyr\*_ecoli
Displays some simple information about sequences
# USA Name Accession Type Length Description
sw-id:DYR1_ECOLI DYR1_ECOLI P00382 P 157 Dihydrofolate
reductase type I (EC 1.5.1.3) (Trimethoprim resistance protein).
sw-id:DYR5_ECOLI DYR5_ECOLI P11731 P 157 Dihydrofolate
reductase type V (EC 1.5.1.3).
sw-id:DYR7_ECOLI DYR7_ECOLI P27422 P 157 Dihydrofolate
reductase type VII (EC 1.5.1.3).
sw-id:DYR8_ECOLI DYR8_ECOLI Q57452 P 169 Dihydrofolate
reductase type VIII (EC 1.5.1.3) (DHFR type IIIC).
sw-id:DYR9_ECOLI DYR9_ECOLI Q59397 P 177 Dihydrofolate
reductase type IX (EC 1.5.1.3).
sw-id:DYRA_ECOLI DYRA_ECOLI Q04515 P 187 Dihydrofolate
reductase type X (EC 1.5.1.3).
sw-id:DYRC_ECOLI DYRC_ECOLI Q59408 P 165 Dihydrofolate
reductase type XIII (EC 1.5.1.3).
sw-id:DYRF_ECOLI DYRF_ECOLI P78218 P 157 Dihydrofolate
reductase type XV (EC 1.5.1.3).
sw-id:DYR_ECOLI DYR_ECOLI P00379 P 159 Dihydrofolate
reductase (EC 1.5.1.3).
mbsun01:/export/home/aklenk/tmp> infoseq sw:dyrf_ecoli
Displays some simple information about sequences
# USA Name Accession Type Length Description
sw-id:DYRF_ECOLI DYRF_ECOLI P78218 P 157 Dihydrofolate
reductase type XV (EC 1.5.1.3).
mbsun01:/export/home/aklenk/tmp> infoseq sw:dyra_ecoli
Displays some simple information about sequences
Error: Database Entry 'dyra_ecoli' not found
Error: Unable to read sequence 'sw:dyra_ecoli'
Died: infoseq terminated: Bad value for option [sequence] and no prompt
mbsun01:/export/home/aklenk/tmp> infoseq sw:dyr_ecoli
Displays some simple information about sequences
Error: Database Entry 'dyr_ecoli' not found
Error: Unable to read sequence 'sw:dyr_ecoli'
Died: infoseq terminated: Bad value for option [sequence] and no prompt
mbsun01:/export/home/aklenk/tmp> infoseq sw:dyra\*
Displays some simple information about sequences
Error: Database Query 'dyra*' not found
Error: Unable to read sequence 'sw:dyra*'
Died: infoseq terminated: Bad value for option [sequence] and no prompt
mbsun01:/export/home/aklenk/tmp> infoseq sw:dyrb\*
Displays some simple information about sequences
# USA Name Accession Type Length Description
sw-id:DYRB_MOUSE DYRB_MOUSE Q9Z188 P 589 Dual-specificity
tyrosine-phosphorylation regulated kinase 1B (EC 2.7.1.-).
sw-id:DYRB_STAAM DYRB_STAAM P10167 P 158 Dihydrofolate
reductase type I (EC 1.5.1.3).
mbsun01:/export/home/aklenk/tmp> infoseq sw:dyr\* | grep -i dyrb
Displays some simple information about sequences
sw-id:DYRB_HUMAN DYRB_HUMAN Q9Y463 P 629 Dual-specificity
tyrosine-phosphorylation regulated kinase 1B (EC 2.7.1.-) (Mirk protein
kinase).
sw-id:DYRB_MOUSE DYRB_MOUSE Q9Z188 P 589 Dual-specificity
tyrosine-phosphorylation regulated kinase 1B (EC 2.7.1.-).
sw-id:DYRB_STAAM DYRB_STAAM P10167 P 158 Dihydrofolate
reductase type I (EC 1.5.1.3).
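The last two queries can be compared mechanically. The following is a small sketch (the function name is hypothetical) that collects the entry names from two infoseq listings and reports which ones the narrower wildcard missed. The sample text is copied from the output above.

```python
def found_names(infoseq_output, prefix="sw-id:"):
    """Collect entry names from infoseq output lines that start with the USA."""
    names = set()
    for line in infoseq_output.splitlines():
        fields = line.split()
        # Wrapped description lines do not start with the prefix and are skipped.
        if fields and fields[0].startswith(prefix):
            names.add(fields[0][len(prefix):])
    return names


# Output of: infoseq sw:dyrb\*
direct = """\
sw-id:DYRB_MOUSE DYRB_MOUSE Q9Z188 P 589 Dual-specificity
tyrosine-phosphorylation regulated kinase 1B (EC 2.7.1.-).
sw-id:DYRB_STAAM DYRB_STAAM P10167 P 158 Dihydrofolate
reductase type I (EC 1.5.1.3).
"""

# Output of: infoseq sw:dyr\* | grep -i dyrb
filtered = """\
sw-id:DYRB_HUMAN DYRB_HUMAN Q9Y463 P 629 Dual-specificity
tyrosine-phosphorylation regulated kinase 1B (EC 2.7.1.-) (Mirk protein
kinase).
sw-id:DYRB_MOUSE DYRB_MOUSE Q9Z188 P 589 Dual-specificity
tyrosine-phosphorylation regulated kinase 1B (EC 2.7.1.-).
sw-id:DYRB_STAAM DYRB_STAAM P10167 P 158 Dihydrofolate
reductase type I (EC 1.5.1.3).
"""

missing = found_names(filtered) - found_names(direct)
print(sorted(missing))  # ['DYRB_HUMAN']
```

So DYRB_HUMAN is in the index (the broader dyr\* query returns it) but is skipped by the dyrb\* query.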
-------------- next part --------------
A non-text attachment was scrubbed...
Name: infoseq.dbg
Type: application/octet-stream
Size: 17537 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/emboss/attachments/20030129/2038efd0/attachment-0001.obj>