[Biojava-l] BlastXMLParserFacade

Thomas Down td2 at sanger.ac.uk
Fri Mar 12 07:27:02 EST 2004


On 12 Mar 2004, at 12:03, David Huen wrote:

> On Friday 12 Mar 2004 11:17 am, Matthew Pocock wrote:
>> Hi,
>>
>> The parser shouldn't be throwing an NPE. However, the NCBI blast xml
>> output used not to be well-formed XML, so it could not be parsed by any
>> XML parser. I don't know if this has since been fixed by the NCBI.
>>
> In this case, that's not the problem.  The problem started when we switched
> from the Xerces parsers to Sun's own, and no attempt on my part was
> successful in getting it to use the NCBI's DTD.  Part of the problem is
> that the DTD from NCBI refers to other DTDs and those references are not
> complete, so my code used to resolve them explicitly.  With Sun's parser,
> I could never convince it to use the provided DTDs.
>
> I don't have time to return to that problem for a while, so perhaps someone
> else more familiar with Crimson could figure it out.  Alternatively, we
> could hack it to not use DTDs at all (totally non-validating).
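
For reference, the "no DTDs at all" route can be sketched with plain SAX
calls.  This is only a rough illustration, not the BioJava code: the class
NoDtdParse and its countElements method are made-up names, and the empty
entity resolver is one assumed way of stopping the parser from fetching the
DTD.  Note that it also blanks out any other external entity the document
might reference.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of BioJava.
class NoDtdParse {

    /** Parses xml without reading any external DTD; returns the number of elements seen. */
    static int countElements(String xml) {
        final int[] count = {0};
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false); // non-validating parse
            XMLReader reader = factory.newSAXParser().getXMLReader();

            // Answer every external entity request (including the DTD named
            // in the DOCTYPE) with empty content, so nothing is fetched.
            reader.setEntityResolver(new EntityResolver() {
                public InputSource resolveEntity(String publicId, String systemId) {
                    return new InputSource(new StringReader(""));
                }
            });

            // Count start tags, just to show that the document was parsed.
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    count[0]++;
                }
            });

            reader.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("XML parse failed", e);
        }
        return count[0];
    }
}
```

The same resolver hook is where locally stored copies of the NCBI DTDs could
be returned instead of an empty stream, if validation is wanted later.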

We're not really using Sun's parser specifically -- just the default 
JAXP parser that's installed on the system.  It ought to be possible to 
make Xerces the default JAXP parser, although I think that this means 
putting it on the bootstrap classpath.  I'll try this out.
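
A quick way to see which implementation JAXP is actually handing out is to
print the factory class name.  The sketch below is only a diagnostic, and
WhichJaxpParser is a made-up name.  The comment about the system property
assumes a Xerces jar that the JVM can load; whether that also requires the
bootstrap classpath is exactly the thing still to be checked.

```java
import javax.xml.parsers.SAXParserFactory;

// Hypothetical diagnostic class; not part of BioJava.
class WhichJaxpParser {

    /** Returns the class name of the SAXParserFactory that JAXP selected. */
    static String saxFactoryClassName() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // JAXP consults the javax.xml.parsers.SAXParserFactory system
        // property first, so a run such as
        //   java -Djavax.xml.parsers.SAXParserFactory=org.apache.xerces.jaxp.SAXParserFactoryImpl WhichJaxpParser
        // should report Xerces, provided that class can be loaded.
        System.out.println(saxFactoryClassName());
    }
}
```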

A while ago I was under the distinct impression that Sun were going to 
junk Crimson and make Xerces the standard parser.  Looks like this 
hasn't happened though -- I don't know why.

      Thomas.


