[Biojava-l] Stax, Delegation manager and other mysteries

Thomas Down td2@sanger.ac.uk
Thu, 24 May 2001 18:50:41 +0100


On Thu, May 24, 2001 at 04:38:52PM +0100, Dr S.M. Huen wrote:
> I've been looking at StAX and the XFF parser sources to try to figure out
> how the StAX framework can be adapted to my purposes (parse GAME files).

Yes, it definitely can be.  And I think quite a few of us
would love to see a GAME parser :).

> I've come across the DelegationManager interface but I can't seem to find
> a class that implements it in biojava.  Who supplies the DelegationManager
> functionality and how does it work?

DelegationManager implementations are provided by the StAX
event source (probably SAX2StAXAdaptor).  They tend to be
really simple classes, no real complexity.

The basic premise of StAX is that each startElement event
comes with a DelegationManager.  The content handler
receiving the event can use the DelegationManager.delegate
method to pass on the startElement event (and the corresponding
sub-tree of events) to another content handler.

You should never see DelegationManagers outside of that
context, and you shouldn't need to work directly with
DelegationManager implementations unless you write your
own StAX data source.
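
To make the delegation idea concrete, here is a rough sketch of
how an event source might track which handler owns the current
sub-tree.  Everything in it is an assumption made for
illustration: Delegator, Handler, Source and Recorder are
hypothetical, stripped-down stand-ins (no namespaces or
attributes), not the real BioJava DelegationManager,
StAXContentHandler or SAX2StAXAdaptor signatures.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class DelegationSketch {
    /** Hypothetical stand-in for DelegationManager. */
    interface Delegator { void delegate(Handler next); }

    /** Hypothetical stand-in for StAXContentHandler. */
    interface Handler {
        void startElement(String name, Delegator dm);
        void endElement(String name);
    }

    /** Toy event source: a stack records which handler owns each sub-tree. */
    static final class Source {
        private static final class Frame {
            final Handler handler;
            int open; // elements accepted by this handler and not yet closed
            Frame(Handler handler) { this.handler = handler; }
        }
        private final Deque<Frame> stack = new ArrayDeque<>();

        Source(Handler root) { stack.push(new Frame(root)); }

        void startElement(String name) {
            while (true) {
                Frame top = stack.peek();
                Handler[] chosen = new Handler[1];
                // Each startElement comes with a fresh delegation callback.
                top.handler.startElement(name, next -> chosen[0] = next);
                if (chosen[0] == null) { top.open++; return; }
                // Delegated: the new handler is offered the same startElement.
                stack.push(new Frame(chosen[0]));
            }
        }

        void endElement(String name) {
            Frame top = stack.peek();
            top.handler.endElement(name);
            top.open--;
            // A delegate is dropped once its sub-tree is closed.
            while (stack.size() > 1 && stack.peek().open == 0) stack.pop();
        }
    }

    /** Logs every event it sees, prefixed with its own label. */
    static class Recorder implements Handler {
        final String label;
        final List<String> log;
        Recorder(String label, List<String> log) { this.label = label; this.log = log; }
        public void startElement(String name, Delegator dm) { log.add(label + ":<" + name + ">"); }
        public void endElement(String name) { log.add(label + ":</" + name + ">"); }
    }

    /** Feeds <game><seq><name/></seq><note/></game> through the source. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Handler root = new Recorder("root", log) {
            @Override public void startElement(String name, Delegator dm) {
                if (name.equals("seq")) dm.delegate(new Recorder("seq", log));
                else super.startElement(name, dm);
            }
        };
        Source src = new Source(root);
        src.startElement("game");
        src.startElement("seq");
        src.startElement("name");
        src.endElement("name");
        src.endElement("seq");
        src.startElement("note");
        src.endElement("note");
        src.endElement("game");
        return log;
    }

    public static void main(String[] args) {
        for (String line : run()) System.out.println(line);
    }
}
```

In this sketch the root handler never sees anything inside
<seq>: the whole sub-tree, including its start and end events,
goes to the delegate, and control returns to the root handler
afterwards.  Whether the parent should also be told that the
delegated element has ended is a design choice that this sketch
leaves out.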

> The other question I am confused about is how the XFF parser returns a
> possibly compound feature of the biojava type?  I see that a class
> variable in FeatureHandler than holds the template and other derived
> classes that handle particular data types that themselves have templates
> but where/how in the sources do these features get attached to the
> root feature?  What am I missing here?

The XFF parser sends events to a BioJava SeqIOListener.  Feature
hierarchies are represented using something like:

startSequence()
  startFeature(my_gene)
    startFeature(my_exon_1)
    endFeature(my_exon_1)
    startFeature(my_exon_2)
    endFeature(my_exon_2)
  endFeature(my_gene)
endSequence()

In other words, really rather similar to the way XML elements
are represented in the SAX/StAX APIs.  So it turns out that
the XFF parser handles feature hierarchies almost for free.
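
As an illustration of why nested start/end events are enough to
recover the hierarchy, here is a small sketch of a listener that
rebuilds a feature tree with a stack.  It is an assumption-laden
toy: TreeBuilder and Node are hypothetical names, and features
are plain strings here rather than the templates a real
SeqIOListener would receive.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class FeatureTreeSketch {
    /** Hypothetical feature node: a name plus child features. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }

        /** Renders the tree as name(child,child,...). */
        String render() {
            if (children.isEmpty()) return name;
            StringBuilder sb = new StringBuilder(name).append('(');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(',');
                sb.append(children.get(i).render());
            }
            return sb.append(')').toString();
        }
    }

    /** Hypothetical listener: the stack top is the innermost open feature. */
    static final class TreeBuilder {
        private final Node root = new Node("sequence");
        private final Deque<Node> open = new ArrayDeque<>();

        TreeBuilder() { open.push(root); }

        void startFeature(String name) {
            Node node = new Node(name);
            open.peek().children.add(node); // attach to the enclosing feature
            open.push(node);
        }

        void endFeature() {
            if (open.size() == 1) {
                throw new IllegalStateException("endFeature without startFeature");
            }
            open.pop();
        }

        Node result() { return root; }
    }

    /** Replays the gene/exon event sequence from the example above. */
    static String demo() {
        TreeBuilder builder = new TreeBuilder();
        builder.startFeature("my_gene");
        builder.startFeature("my_exon_1");
        builder.endFeature();
        builder.startFeature("my_exon_2");
        builder.endFeature();
        builder.endFeature();
        return builder.result().render();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The child features are attached to whichever feature is open
when they start, so no code has to attach them to the root
feature explicitly.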

> Should I be using StAX yet or is it better at this stage to stay with SAX
> itself?

The StAX APIs haven't been used very widely yet, but I've
written several applications with them now, and Matthew has done
some too.  There has been a little bit of fine tuning, but I
think the interfaces should now be pretty stable.  I'd like to
publish a little paper on one of the XML websites explaining
StAX properly, but I haven't yet had the time to write it.

For the benefit of those who haven't looked at it yet, StAX is
a small addition to the SAX API.  The StAXContentHandler interface
looks almost identical to the normal SAX ContentHandler, but
with one small addition: any element in the XML document
(and its sub-tree of elements) can be delegated to another
StAXContentHandler.  This small change makes a big difference:
you can easily write modular (and even extensible) parsers.
I've found that even for quite simple applications, it makes
life a lot easier than using vanilla SAX.


David: yes, I would recommend StAX, if you like the idea
of writing modular parsers.  Feel free to let me know if
you've got any questions.  And keep pestering me to write
the paper ;).


    Thomas.