[Biojava-l] AGAVE IO in BioJava

Thomas Down td2@sanger.ac.uk
Wed, 12 Sep 2001 21:45:14 +0100


On Wed, Sep 12, 2001 at 12:32:36PM -0700, Brian King wrote:
> 
> We'd like to add IO support for the AGAVE XML format
> (http://www.agavexml.org) into the BioJava library.  Creating the output
> code in a SequenceFormat seems straightforward, but I'm confused about
> input.  AGAVE and BioJava have hierarchies of features and hierarchies of
> sequences to represent assembly, but the SeqIOListener doesn't have
> interfaces for sub-features and sub-sequences.  Do I ignore this for now or
> is there a way to do it?  

Yes, we definitely do support feature hierarchies.  The
SeqIOListener interface works quite a lot like the SAX
ContentHandler.  You do something like:

   startFeature(container);
     startFeature(child1);
     endFeature();
     startFeature(child2);
     endFeature();
   endFeature();

Generally, listeners are expected to maintain a simple stack
to keep track of position in the hierarchy.
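
As a rough sketch of that stack idea (this is not the BioJava API: the
real SeqIOListener is passed feature templates rather than plain strings,
and FeatureNode and StackingListener are made-up names used only for
illustration), a listener might rebuild the hierarchy like this:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a parsed feature; not a BioJava class.
class FeatureNode {
    final String name;
    final List<FeatureNode> children = new ArrayList<>();

    FeatureNode(String name) {
        this.name = name;
    }
}

// Hypothetical listener that only mirrors the startFeature/endFeature
// call pattern; it does not implement the real SeqIOListener interface.
class StackingListener {
    // Features that have been started but not yet ended, innermost on top.
    private final Deque<FeatureNode> stack = new ArrayDeque<>();
    // Top-level features, i.e. those started while the stack was empty.
    private final List<FeatureNode> roots = new ArrayList<>();

    void startFeature(String name) {
        FeatureNode node = new FeatureNode(name);
        FeatureNode parent = stack.peek();
        if (parent == null) {
            roots.add(node);            // no open feature: this is top-level
        } else {
            parent.children.add(node);  // nest under the innermost open feature
        }
        stack.push(node);
    }

    void endFeature() {
        if (stack.isEmpty()) {
            throw new IllegalStateException("endFeature without startFeature");
        }
        stack.pop();
    }

    List<FeatureNode> getRoots() {
        return roots;
    }

    // Number of features currently open.
    int depth() {
        return stack.size();
    }
}
```

Feeding it the call sequence above should leave one top-level feature
with two children, and an empty stack once the last endFeature() arrives.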

For an example of how we parse an XML format which allows
hierarchical features, look in the package:

     org.biojava.bio.program.xff.

Hierarchical sequences...  well, we have a mechanism for representing
them in BioJava, using ComponentFeatures.  However, I don't think
the issue of sequence files containing complete sequence hierarchies
has come up yet -- it may not quite fit in with the current
SeqIOListener.

The current template for ComponentFeature expects the sub-sequence
to already exist before the ComponentFeature is created.  I don't
know how practical this will be in this case.

Hmmm...  this one might take a bit of thought...

    Thomas.