[Biojava-l] AGAVE IO in BioJava
Thomas Down
td2@sanger.ac.uk
Wed, 12 Sep 2001 21:45:14 +0100
On Wed, Sep 12, 2001 at 12:32:36PM -0700, Brian King wrote:
>
> We'd like to add IO support for the AGAVE XML format
> (http://www.agavexml.org) into the BioJava library. Creating the output
> code in a SequenceFormat seems straightforward, but I'm confused about
> input. AGAVE and BioJava have hierarchies of features and hierarchies of
> sequences to represent assembly, but the SeqIOListener doesn't have
> interfaces for sub-features and sub-sequences. Do I ignore this for now or
> is there a way to do it?
Yes, we definitely do support feature hierarchies. The
SeqIOListener interface works much like the SAX
ContentHandler: nested features are reported as nested
startFeature/endFeature calls, something like:
  startFeature(container);
    startFeature(child1);
    endFeature();
    startFeature(child2);
    endFeature();
  endFeature();
Generally, listeners are expected to maintain a simple stack
to keep track of position in the hierarchy.
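As a rough illustration of the stack idea, here is a minimal
sketch. It is deliberately NOT written against the real BioJava
API: Node and TreeBuildingListener are hypothetical names, and
plain strings stand in for whatever the real startFeature call
receives. It only shows how a listener can turn nested
startFeature/endFeature events back into a tree.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class FeatureStackSketch {
    /** Hypothetical tree node; not a BioJava class. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /**
     * Hypothetical listener that mimics only the startFeature/endFeature
     * pairing described above. The top of the stack is always the feature
     * (or the sequence itself) that is currently open.
     */
    static final class TreeBuildingListener {
        private final Node root = new Node("sequence");
        private final Deque<Node> stack = new ArrayDeque<>();

        TreeBuildingListener() {
            stack.push(root);
        }

        void startFeature(String name) {
            Node node = new Node(name);
            stack.peek().children.add(node); // attach to the open parent
            stack.push(node);                // the new feature is now open
        }

        void endFeature() {
            if (stack.size() <= 1) {
                throw new IllegalStateException(
                        "endFeature without matching startFeature");
            }
            stack.pop(); // close the innermost open feature
        }

        Node getRoot() {
            return root;
        }
    }

    /** Replays the event sequence from the example above. */
    static Node demo() {
        TreeBuildingListener listener = new TreeBuildingListener();
        listener.startFeature("container");
        listener.startFeature("child1");
        listener.endFeature();
        listener.startFeature("child2");
        listener.endFeature();
        listener.endFeature();
        return listener.getRoot();
    }

    public static void main(String[] args) {
        Node container = demo().children.get(0);
        System.out.println(container.name + " has "
                + container.children.size() + " children");
    }
}
```

A real listener would build features on the sequence under
construction instead of Node objects, but the push-on-start,
pop-on-end bookkeeping would be the same.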
For an example of how we parse an XML format which allows
hierarchical features, look in the package:
org.biojava.bio.program.xff.
Hierarchical sequences... well, we have a mechanism for representing
them in BioJava, using ComponentFeatures. However, I don't think
the issue of sequence files containing complete sequence hierarchies
has come up yet -- it may not quite fit in with the current
SeqIOListener.
The current template for ComponentFeature expects the sub-sequence
to already exist before the ComponentFeature is created. I don't
know how practical this will be in this case.
Hmmm... this one might take a bit of thought...
Thomas.