[Biojava-l] Added Genbank writeSequence

Keith James kdj@sanger.ac.uk
11 Mar 2001 22:16:44 +0000


I've added a GenbankFileFormer class, analogous to EmblFileFormer, to
enable Genbank writing. Both EMBL and Genbank writing are as complete
as the parsing (i.e. limited header information), but see the bug
below. With this fixed, we should be able to convert EMBL/Genbank
without loss.

Fuzzy locations with an unbounded min are not being written correctly
(those with an unbounded max are fine). This seems to be due to a bug
in CompoundLocation, which I can't
locate. The symptom is that I can't cast (Fuzzy) Locations obtained
from a CompoundLocation to FuzzyLocation where hasBoundedMin is false,
while it is fine where hasBoundedMax is false. The odd thing is, I had
no problem with the cast immediately before the object was added to a
CompoundLocation.

The code for generating an EMBL/Genbank location string from a
Feature's location is in the static method formatLocationBlock in
org.biojava.bio.seq.io.SeqFormatTools.
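
As a rough illustration of the kind of string involved (this is not the
SeqFormatTools code), here is a minimal self-contained sketch. SketchSpan
and LocationStringSketch are hypothetical stand-ins, not BioJava classes.
It assumes the usual feature table notation, where "<" marks an unbounded
min, ">" an unbounded max, and join(...) wraps a compound location.

```java
import java.util.List;

// Hypothetical stand-in for one (possibly fuzzy) span; not a BioJava type.
final class SketchSpan {
    final int min;
    final int max;
    final boolean boundedMin; // false: the true start lies before min
    final boolean boundedMax; // false: the true end lies after max

    SketchSpan(int min, int max, boolean boundedMin, boolean boundedMax) {
        this.min = min;
        this.max = max;
        this.boundedMin = boundedMin;
        this.boundedMax = boundedMax;
    }
}

final class LocationStringSketch {
    // One span: "10", "10..20", "<10..20" or "10..>20".
    static String formatSpan(SketchSpan s) {
        if (s.min == s.max && s.boundedMin && s.boundedMax) {
            return Integer.toString(s.min);
        }
        StringBuilder sb = new StringBuilder();
        if (!s.boundedMin) sb.append('<');
        sb.append(s.min).append("..");
        if (!s.boundedMax) sb.append('>');
        sb.append(s.max);
        return sb.toString();
    }

    // Several spans are wrapped in join(...), as a compound location would be.
    static String formatLocation(List<SketchSpan> spans) {
        if (spans.size() == 1) return formatSpan(spans.get(0));
        StringBuilder sb = new StringBuilder("join(");
        for (int i = 0; i < spans.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(formatSpan(spans.get(i)));
        }
        return sb.append(')').toString();
    }
}
```

Given a span 10..20 with unbounded min and a span 30..40 with unbounded
max, formatLocation returns join(<10..20,30..>40). The bug described
above concerns the first of these two cases.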

I've taken the liberty of adding a static final Map FORMATS to the
SequenceFormat interface where SequenceFormat implementations can
register the format(s) accepted by their writeSequence method. This
became necessary for EmblLikeFormat which needs to write EMBL and
Swissprot (and possibly others later). The interface now has
getFormats() and getDefaultFormat() methods.
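
To show the registration idea, here is a small sketch using hypothetical
types (SketchSequenceFormat, SketchEmblLikeFormat). The key and value
types of the map, the format names and the return types of the two
methods are all assumptions made for the example, and the generics are
only there for readability; the real interface may differ.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the SequenceFormat interface.
interface SketchSequenceFormat {
    // Interface fields are implicitly public static final, so every
    // implementation shares this one registry.
    // Assumed layout: implementing class name -> format names it can write.
    Map<String, Set<String>> FORMATS = new HashMap<String, Set<String>>();

    Set<String> getFormats();

    String getDefaultFormat();
}

// Hypothetical stand-in for a format class that writes more than one format.
final class SketchEmblLikeFormat implements SketchSequenceFormat {
    private static final String KEY = SketchEmblLikeFormat.class.getName();
    private static final String DEFAULT = "EMBL";

    // Register the accepted formats once, when the class is initialised.
    static {
        Set<String> names = new LinkedHashSet<String>();
        names.add("EMBL");
        names.add("SWISSPROT");
        FORMATS.put(KEY, Collections.unmodifiableSet(names));
    }

    public Set<String> getFormats() {
        return FORMATS.get(KEY);
    }

    public String getDefaultFormat() {
        return DEFAULT;
    }
}
```

A caller can then ask a format object what it is able to write before
calling writeSequence, rather than hard-coding that knowledge elsewhere.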

Finally, features are not retrievable from a FeatureHolder in a
defined order. Ordering by location is required for Genbank/EMBL
writing, so I have added a Comparator to the Feature interface which
compares by the int returned by getMin() of their top-level
Location. As the Location interface has an analogous Comparator nested
in it, this seemed a consistent place to put it.
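
A minimal sketch of the ordering, using hypothetical stand-in types
rather than the real Feature and Location interfaces. The comparator
name BY_LOCATION_MIN is made up for the example, and getMin() on the
feature stands in for asking its top-level Location for getMin().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for Feature.
interface SketchFeature {
    String getType();

    int getMin(); // stands in for getLocation().getMin()

    // Comparator nested in the interface; orders features by ascending min.
    Comparator<SketchFeature> BY_LOCATION_MIN = new Comparator<SketchFeature>() {
        public int compare(SketchFeature a, SketchFeature b) {
            return Integer.compare(a.getMin(), b.getMin());
        }
    };
}

// Trivial implementation, for the example only.
final class SimpleSketchFeature implements SketchFeature {
    private final String type;
    private final int min;

    SimpleSketchFeature(String type, int min) {
        this.type = type;
        this.min = min;
    }

    public String getType() {
        return type;
    }

    public int getMin() {
        return min;
    }
}

final class FeatureOrderSketch {
    // Returns a sorted copy, leaving the caller's collection untouched.
    static List<SketchFeature> sortedByMin(List<? extends SketchFeature> features) {
        List<SketchFeature> copy = new ArrayList<SketchFeature>(features);
        Collections.sort(copy, SketchFeature.BY_LOCATION_MIN);
        return copy;
    }
}
```

Features with equal min compare as equal here, and Collections.sort is
stable, so they keep whatever relative order the holder returned them in.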

The build seems okay, but if the changes have caused any problems, let
me know.

cheers,

-- 

-= Keith James - kdj@sanger.ac.uk - http://www.sanger.ac.uk/Users/kdj =-
The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA