[Biojava-l] Parsing a BLAST file

Susan Glass SGlass@genetics.com
Fri, 16 Nov 2001 09:55:16 -0500


     I am having a problem getting e-values back from hits and subhits of SequenceDBSearchResults after parsing a BLAST file and converting it to biojava objects using a BlastLikeSearchBuilder.  Scores come back okay, but e-values are always NaN, even after updating the builder to account for the "e-121" format.  Also, there doesn't seem to be a way to get the sequence length back from a hit (NOT the length of the matched regions, but the length of the whole sequence).  Any thoughts?  I am still using the FTP-ed library, so if any changes were made in CVS I don't have them.
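
For reference, the "e-121" workaround amounts to roughly the sketch below. EValues and parseEValue are hypothetical names used only for illustration, not the actual BlastLikeSearchBuilder code, and reading a missing mantissa as 1 is an assumption about what BLAST means by "e-121".

```java
// Hypothetical helper, not part of biojava: a minimal sketch of
// normalizing abbreviated BLAST E-values before parsing them.
final class EValues {
    private EValues() {}

    /**
     * Parses an E-value string, tolerating the abbreviated form "e-121"
     * (no mantissa) by reading it as "1e-121" (assumed meaning).
     */
    static double parseEValue(String text) {
        String s = text.trim();
        // A leading 'e' or 'E' means the mantissa was omitted; assume 1.
        if (s.startsWith("e") || s.startsWith("E")) {
            s = "1" + s;
        }
        // Throws NumberFormatException if the text is still not a number.
        return Double.parseDouble(s);
    }

    public static void main(String[] args) {
        System.out.println(parseEValue("e-121"));
        System.out.println(parseEValue("2.5e-10"));
    }
}
```

With this in place the strings parse to finite doubles, so the NaN values I am seeing must be introduced somewhere else between the builder and the hit objects.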

Thanks a lot,
Susan

>>> Keith James <kdj@sanger.ac.uk> 11/15 4:34 AM >>>
>>>>> "Susan" == Susan Glass <SGlass@genetics.com> writes:

    Susan>      Thanks to David and Keith for help with the BLAST file
    Susan> parsing.  I have now managed to download the ssbind package
    Susan> and run David's demo program.  It's a big help.  Thanks a
    Susan> lot.  I did have to alter the BlastLikeSearchBuilder class
    Susan> slightly because of a problem with the makeSubHit() method
    Susan> throwing an exception when it encountered the float "e-121"
    Susan> (instead of 1e-121) in David's sample BLAST output file.
    Susan> Easily altered, but I wonder if anyone else had this
    Susan> problem.

Ah, yes. Thanks for reporting that. I had fixed it in at least one of
the other builders. I need to add a Blast report which contains this
type of E-value to the tests (e-121... yuck... why can't it be
consistent?!)

I'll patch and test as soon as I can.

-- 

-= Keith James - kdj@sanger.ac.uk - http://www.sanger.ac.uk/Users/kdj =-
Pathogen Sequencing Unit, Wellcome Trust Sanger Institute, Cambridge, UK