[Biojava-l] Re: unexpected behavior in StAX, StringElementHandlerBase

Thomas Down td2@sanger.ac.uk
Tue, 18 Sep 2001 20:52:51 +0100


On Tue, Sep 18, 2001 at 03:12:20PM -0400, Michael L. Heuer wrote:
> 
> On Tue, 18 Sep 2001, Christopher Pickslay wrote:
> 
> > All of the StAXContentHandlerBase implementations appear to share this
> > problem. They each contain a StringBuffer field that is initialized along
> > with the class, rather than in startElement(). So not only will they cause
> > this behavior with sequential elements, but it means a
> > StAXContentHandlerBase instance cannot be re-used. I'd recommend
> > initializing the StringBuffer in startElement(), as that would be the
> > expected behavior.
> 
> is it better to re-initialize the StringBuffer, or call
> sb.delete(0,sb.length()) after calling setXValue(String)?
> 
> I don't have a good profiling tool, so I'm just guessing here...


I'd presume that sb.setLength(0) is fastest, since it just resets the
length and keeps the existing buffer.  But I doubt there's much
difference really...
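
A minimal sketch of the reset-in-startElement() approach, written
against plain SAX (DefaultHandler) rather than the BioJava StAX
classes. The class name ResettingTextHandler, the parse() helper and
the <names>/<name> document are made up for illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of each <name> element.
class ResettingTextHandler extends DefaultHandler {
    // Allocated once with the handler and reused for every element.
    private final StringBuffer sb = new StringBuffer();
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        // Clear leftover text from the previous element; keeps the allocated capacity.
        sb.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        sb.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("name")) {
            values.add(sb.toString());
        }
    }

    // Illustrative helper: parse a string and return the collected values.
    public static List<String> parse(String xml) {
        ResettingTextHandler handler = new ResettingTextHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return handler.values;
    }

    public static void main(String[] args) {
        System.out.println(parse("<names><name>alpha</name><name>beta</name></names>"));
    }
}
```

Without the setLength(0) call the second value would come out as
"alphabeta", which is the sequential-element behaviour reported above.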

> > As to how to treat child elements, I'd recommend having
> > StringElementHandlerBase throw an exception (as it currently does). But it
> > would be a good idea to add a MixedElementHandlerBase class which allows
> > for the retrieval of character data and allows the implementor to handle
> > delegation for child elements.
> 
> I like this idea.
> 
> If you're dealing with an xml document that has been validated against a
> DTD or a schema, you shouldn't need the mixed implementation, but it
> seems like it'd be nice to have.

Both DTDs and Schemas do have a mixed content model.  This is
how <span class="foo">blah<span class="bar">blah</span>blah</span>
is legal (and validatable!) [X]HTML.
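
To make that concrete, here is a rough sketch that validates the
snippet against a tiny mixed-content DTD using the JDK's SAX parser.
The DTD is one I made up for the example (it is not the XHTML DTD),
and MixedContentCheck / isValid() are illustrative names:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class MixedContentCheck {
    // Made-up internal DTD subset: <span> may mix text with nested <span>s.
    static final String DOCTYPE =
          "<!DOCTYPE span [\n"
        + "  <!ELEMENT span (#PCDATA | span)*>\n"
        + "  <!ATTLIST span class CDATA #IMPLIED>\n"
        + "]>\n";

    // Returns true if the body parses and validates against DOCTYPE.
    public static boolean isValid(String body) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        try {
            factory.newSAXParser().parse(
                new InputSource(new StringReader(DOCTYPE + body)),
                new DefaultHandler() {
                    @Override
                    public void error(SAXParseException e) throws SAXException {
                        throw e; // validity errors are only reported, so rethrow them
                    }
                });
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String mixed = "<span class=\"foo\">blah<span class=\"bar\">blah</span>blah</span>";
        System.out.println(isValid(mixed));
    }
}
```

The (#PCDATA | span)* declaration is the mixed content model: text and
child elements may interleave in any order.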

I tend to feel mixed models are usually a bad idea in data-oriented
XML (that's why I never thought about them when I was writing
StringElementHandlerBase), but they are allowed.
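
For what it's worth, a handler that tolerates mixed content might look
something like the sketch below. This is plain SAX, not the proposed
MixedElementHandlerBase (which doesn't exist yet), and
MixedTextCollector / collect() are hypothetical names. It keeps one
buffer per open element, so each element only sees its own direct
character data:

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical collector: records the direct text of every element.
class MixedTextCollector extends DefaultHandler {
    // One buffer per currently open element, innermost on top.
    private final Deque<StringBuilder> open = new ArrayDeque<>();
    // Direct text of each element, in the order the elements close.
    private final List<String> texts = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        open.push(new StringBuilder()); // child elements get their own buffer
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        open.peek().append(ch, start, length); // text belongs to the innermost element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        texts.add(open.pop().toString());
    }

    // Illustrative helper: parse a string and return the collected text.
    public static List<String> collect(String xml) {
        MixedTextCollector handler = new MixedTextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return handler.texts;
    }

    public static void main(String[] args) {
        System.out.println(collect(
            "<span class=\"foo\">blah<span class=\"bar\">blah</span>blah</span>"));
    }
}
```

For the span example this yields "blah" for the inner element and
"blahblah" for the outer one.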

   Thomas.