[Biojava-l] A problem parsing Blast XML output (blastN vs. blastP)

Benoit VARVENNE varvenne at genoway.com
Fri Nov 24 17:53:49 UTC 2006

On 24/11/06 18:08, "David Huen" <smh1008 at cam.ac.uk> wrote:

> On Nov 24 2006, Benoit VARVENNE wrote:
>> Hello,
>> I'm parsing blast results using biojava1.5 and a BlastXMLParserFacade with
>> the code put at the end of this mail.
>> I've tried this with a blastN query and had no trouble there.
>> However, I've tried to do exactly the same thing with a BlastP query and
>> I got the exception cited at the end of this mail.
>> I've checked, and the two input files (blastn/blastp) seem to have the same
>> structure (except that one is for proteins, so the data are different). (Please
>> find them attached, if you are familiar with this.)
>> Can someone help me? I don't understand why it works in one case and not in
>> the other...
> I am uncertain whether BlastXMLFacade will actually support a protein
> sequence parse. It was originally developed to handle blastn. Anyone else
> tried it with blastp?
> I'm offline till Sunday so I can't reply till then.
> Regards,
> David Huen


As BlastLikeSAXParser seemed to support older versions of NCBI blastP (see
), I would be surprised if BlastXMLParserFacade did not.

However, I would be very interested if anyone has more information.

Could it be a DTD problem? If so, can we update the DTD sources?
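
One way to check the DTD question is to compare what the two files actually declare. Below is a minimal sketch that uses only the JDK's SAX classes, not BioJava, to print the DOCTYPE identifiers and the content of the BlastOutput_program element for each file. The class name BlastXmlProbe and its Result holder are hypothetical names made up for this illustration. The sketch assumes the external DTD should not be fetched, so it resolves it to an empty document. It does not show how BlastXMLParserFacade itself handles DTDs.

```java
import java.io.FileReader;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper (not part of BioJava): reports which DTD and which
// BLAST program a BLAST XML file declares.
final class BlastXmlProbe {

    /** What the probe found: DOCTYPE identifiers and the program name. */
    static final class Result {
        String publicId;
        String systemId;
        String program;
    }

    static Result probe(Reader xml) throws Exception {
        final Result result = new Result();
        DefaultHandler2 handler = new DefaultHandler2() {
            private StringBuilder text; // non-null only inside BlastOutput_program

            @Override
            public void startDTD(String name, String publicId, String systemId) {
                // Called when the DOCTYPE declaration is read
                result.publicId = publicId;
                result.systemId = systemId;
            }

            @Override
            public InputSource resolveEntity(String name, String publicId,
                                             String baseURI, String systemId) {
                // Assumption: skip the external DTD by resolving it to an empty document
                return new InputSource(new StringReader(""));
            }

            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes atts) {
                text = "BlastOutput_program".equals(qName) ? new StringBuilder() : null;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (text != null && "BlastOutput_program".equals(qName)) {
                    result.program = text.toString().trim();
                }
                text = null;
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // Register the same handler for DOCTYPE events
        parser.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        parser.parse(new InputSource(xml), handler);
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java BlastXmlProbe blastn.xml blastp.xml
        for (String path : args) {
            try (Reader in = new FileReader(path)) {
                Result r = probe(in);
                System.out.println(path + ": program=" + r.program
                        + ", publicId=" + r.publicId + ", systemId=" + r.systemId);
            }
        }
    }
}
```

If the two files report different DOCTYPE identifiers, that would point towards the DTD. If they are identical, the difference is more likely in how the blastp elements themselves are handled.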

