[Biojava-l] A problem parsing Blast XML output (blastN vs. blastP)

Benoit VARVENNE varvenne at genoway.com
Fri Nov 24 17:53:49 UTC 2006


On 24/11/06 18:08, "David Huen" <smh1008 at cam.ac.uk> wrote:

> On Nov 24 2006, Benoit VARVENNE wrote:
> 
>> Hello,
>> 
>> I'm parsing BLAST results using BioJava 1.5 and a BlastXMLParserFacade with
>> the code included at the end of this mail.
>> 
>> I tried this with a blastN query and had no trouble.
>> However, when I did exactly the same thing with a blastP query,
>> I got the exception cited at the end of this mail.
>> 
>> I have checked, and the two input files (blastn/blastp) seem to have the same
>> structure (except that one is for proteins, so the data differ). (Please
>> find them attached, if you are familiar with this format.)
>> 
>> Can someone help me? I don't understand why it works in one case and not in
>> the other...
>> 
> I am uncertain whether BlastXMLParserFacade will actually support a protein
> sequence parse. It was originally developed to handle blastn. Anyone else
> tried it with blastp?
> 
> I'm offline till Sunday so I can't reply till then.
> 
> Regards,
> David Huen
> 

David,

As BlastLikeSAXParser seems to support older versions of NCBI blastP (see
http://www.biojava.org/wiki/BioJava:Tutorial:Blast-like_Parsing_Cook_Book#Step_A_-_Create_an_application_that_sets_up_the_parser_and_does_the_parsing),
I would be surprised if BlastXMLParserFacade did not.

However, I would be very interested if anyone has any information.

Could it be a DTD problem? If so, can we update the DTD sources?
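
One way to test the DTD hypothesis might be to compare what the two files
actually declare. The following is only a minimal sketch using the JDK's own
XML API, not BioJava: it prints the DOCTYPE identifiers and the value of the
BlastOutput_program element for each file given on the command line. The
class name BlastXmlInspector is hypothetical, and the sketch assumes the
files are NCBI BLAST XML output that carries a BlastOutput_program element.
The DTD itself is deliberately not loaded, so this says nothing about
whether either file validates.

```java
import java.io.InputStream;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of BioJava.
public class BlastXmlInspector {

    /**
     * Returns { DOCTYPE public ID, DOCTYPE system ID, BlastOutput_program }.
     * Entries that are absent from the file are returned as null.
     */
    public static String[] inspect(InputStream in) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Resolve every external entity (including the DTD) to an empty
        // document, so the check needs neither network access nor a local DTD.
        builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
        Document doc = builder.parse(in);

        DocumentType doctype = doc.getDoctype();
        // Assumption: the program name lives in <BlastOutput_program>.
        NodeList program = doc.getElementsByTagName("BlastOutput_program");
        return new String[] {
            doctype == null ? null : doctype.getPublicId(),
            doctype == null ? null : doctype.getSystemId(),
            program.getLength() == 0
                    ? null
                    : program.item(0).getTextContent().trim()
        };
    }

    public static void main(String[] args) throws Exception {
        for (String path : args) {
            try (InputStream in = Files.newInputStream(Paths.get(path))) {
                String[] info = inspect(in);
                System.out.println(path
                        + ": program=" + info[2]
                        + ", publicId=" + info[0]
                        + ", systemId=" + info[1]);
            }
        }
    }
}
```

If both files report the same DOCTYPE identifiers and differ only in the
program value, the DTD declaration is probably not what distinguishes them,
and the difference is more likely in how the blastp content is handled.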

Cheers,
Benoît.





