[EMBOSS] Sorting errors with DBIFLAT

David Martin d.m.a.martin at dundee.ac.uk
Mon Sep 13 16:36:08 UTC 2004

I have been trying to index Swiss-Prot with DBIFLAT. For most sequences
it works fine, but for my pet sequence the lookup fails: the entry
cannot be found.

The reason is that DBIFLAT sorts the list of IDs incorrectly before
writing them to entrynam.idx.

I wrote a short script to dump the entrynam.idx file as ASCII and found
the relevant entries.

Here is the excerpt:

ENTRY 1775562 :TFH3_HUMAN:475273634:0:1
ENTRY 1775563 :TFH3_MOUSE:475279037:0:1
ENTRY 1775564 :TFH4_HUMAN:475283197:0:1
ENTRY 1775565 :TFH4_MOUSE:475288725:0:1
ENTRY 1775566 :TFH4_PANTR:475293031:0:1
ENTRY 1775567 :TF_HUMAN:475515366:0:1
ENTRY 1775568 :TFL1_ARATH:475296099:0:1
ENTRY 1775569 :TF_MOUSE:475525712:0:1
ENTRY 1775570 :TFOX_HAEIN:475300602:0:1
ENTRY 1775571 :TFP2_HUMAN:475304815:0:1
ENTRY 1775572 :TFP2_MOUSE:475312007:0:1

This is most definitely not alphabetical order: TF_HUMAN and TF_MOUSE
are interleaved with TFL1_ARATH and TFOX_HAEIN.
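
To pin down where the order breaks, here is a minimal Python sketch (it
is not the dump script mentioned above; the ID list is simply copied
from the excerpt, and the helper names are made up for illustration). It
compares neighbouring IDs by plain character code, which is what a
strcmp-style lookup would expect, and then repeats the check with the
underscores removed. The second check is only a guess at what a
locale-aware sort might be doing; nothing here confirms how DBIFLAT
actually sorts.

```python
# IDs in the order they appear in the entrynam.idx excerpt.
ids = [
    "TFH3_HUMAN", "TFH3_MOUSE", "TFH4_HUMAN", "TFH4_MOUSE", "TFH4_PANTR",
    "TF_HUMAN", "TFL1_ARATH", "TF_MOUSE", "TFOX_HAEIN",
    "TFP2_HUMAN", "TFP2_MOUSE",
]


def out_of_order(keys):
    """Return adjacent pairs (a, b) where a sorts after b by character code."""
    return [(a, b) for a, b in zip(keys, keys[1:]) if a > b]


def without_underscores(key):
    """Hypothetical stand-in for a collation that ignores punctuation."""
    return key.replace("_", "")


# Check 1: plain character-code order ('_' sorts after 'A'-'Z' in ASCII).
byte_order_breaks = out_of_order(ids)

# Check 2: the same comparison with underscores ignored (assumption only).
ignoring_underscore_breaks = out_of_order([without_underscores(k) for k in ids])

print("breaks in character-code order:", byte_order_breaks)
print("breaks when '_' is ignored:    ", ignoring_underscore_breaks)
print("character-code sorted order:   ", sorted(ids))
```

On this excerpt the first check flags the TF_HUMAN/TFL1_ARATH and
TF_MOUSE/TFOX_HAEIN pairs, while the second flags nothing. That would be
consistent with the IDs having been sorted under a locale that ignores
punctuation rather than in plain ASCII order. If that is the cause,
running the indexing with the C locale (for example LC_ALL=C) might be
worth trying, but this is an assumption on my part.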

Any ideas on the why and wherefore, and on possible fixes?


