Bioperl: Parsing BLAST output file problem

Brian K. Pollock bpollock@resgen.com
Mon, 28 Dec 1998 12:33:14 -0600 (CST)


Hi everyone,

I am having difficulty parsing an existing BLAST report and was hoping
someone could shed some light on this.

Below is the sub that I am having difficulty with (a slightly modified
version out of Steve's blast_config.pl).

Section 1 : The Sub
*******************

sub CreateBlast  {
    $file = shift;
    my $blast_obj;

    eval {
        $countBlast++;
        $blast_obj = new Bio::Tools::Blast ( -file   => $file,
                                             -parse => 1,
                                             -signif => '1e-30'
                                           );
    };

    if($@) {
        croak "\n*** Trouble creating Blast object:\n$@\n\n";
    }

    return $blast_obj;
}


Section 2: Here is the exception that is being thrown.
******************************************************

-------------------- EXCEPTION --------------------
MSG: Failed to create any Sbjct objects for 172 potential hit(s).
NOTE: 

TRAPPED EXCEPTION(S):
SHOWING FIRST EXCEPTION ONLY:

-------------------- EXCEPTION --------------------
MSG: Can't parse HSP data for gi|2154132|gb|AA442254|AA442254.
NOTE: Missing HSP data or unrecognized format.
CONTEXT: Error in object Bio::Tools::Blast "(339"
STACK: 
Bio::Tools::Blast::_parse_hsp_data(2656)
Bio::Tools::Blast::_set_hits(2487)
Bio::Tools::Blast::_parse(1504)
Bio::Tools::Blast::parse(1450)
Bio::Tools::SeqAnal::_initialize(291)
Bio::Tools::Blast::_initialize(1027)
Bio::Root::Object::new(352)
main::CreateBlast(59)
main::./hsv_parser.pl(42)
---------------------------------------------------

END TRAPPED EXCEPTION(S)

CONTEXT: Error in object Bio::Tools::Blast "(339"
STACK: 
Bio::Tools::Blast::_set_hits(2544)
Bio::Tools::Blast::_parse(1504)
Bio::Tools::Blast::parse(1450)
Bio::Tools::SeqAnal::_initialize(291)
Bio::Tools::Blast::_initialize(1027)
Bio::Root::Object::new(352)
main::CreateBlast(59)
main::./hsv_parser.pl(42)


Section 3 : Header from a BLAST report.
****************************************

BLASTN 2.0.2 [Sep-3-1997]

Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped
BLAST and PSI-BLAST: a new generation of protein database search
programs",
Nucleic Acids Res. 25:3389-3402.

Query= 
         (339 letters)

Database: dbEST
           1,294,320 sequences; 477,413,560 total letters

Searching..................................................done
                                                                    High
E
Sequences producing significant alignments:                        Score
Value

gi|2154132|gb|AA442254|AA442254 zv61g11.s1 Soares testis NHT Ho...   391
e-107
gi|564863|emb|Z40247|HSC1XA082 H. sapiens partial cDNA sequence...   391
e-107


 -------------------------- SNIP -------------------------------------

Section 4 : Sample HSP summary snippet:
***************************************

> gi|2154132|gb|AA442254|AA442254 zv61g11.s1 Soares testis NHT Homo
          sapiens cDNA clone 758180 3'
          Length = 444
          
Score =  391 bits (197), Expect = e-107
Identities = 218/227 (96%), Positives = 218/227 (96%)

                                                                      
Query 113 aacctnggcaccgttnctttttattatagctaaacncaanatgcccaaanatatacaaaa 172
          ||||| ||||||||| |||||||||||||||| || ||| ||||||||| ||||||||||
Sbjct 4   aaccttggcaccgtttctttttattatagctagacacaagatgcccaaagatatacaaaa 63

                                                                      
Query 173 caaacaatacaaattttaaacacttttacaaaggtgacaatataaanaaatgaaaactat 232
          |||||||||||||||||||||||||||||||||||||||||||||| ||||||| |||||
Sbjct 64  caaacaatacaaattttaaacacttttacaaaggtgacaatataaagaaatgaagactat 123

                                                                      
Query 233 ggatttatgtttgcaaacattaatttcctttgaatactgtctggtgttaaaacctaacag 292
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 124 ggatttatgtttgcaaacattaatttcctttgaatactgtctggtgttaaaacctaacag 183

                                                         
Query 293 tttaccaaaatttgtatttgacacaactatttttaaaattatagtgg 339
          ||||||| |||||||||||||||||||||||||||||||||||||||
Sbjct 184 tttaccagaatttgtatttgacacaactatttttaaaattatagtgg 230



Thanks for any help,
Brian



------------------------------------------------------
Brian K. Pollock
BioInformatics Group
Research Genetics Inc.
(800)533-4363 X2266
bpollock@resgen.com
http://www.resgen.com
-------------------------------------------------------

=========== Bioperl Project Mailing List Message Footer =======
Project URL: http://bio.perl.org/
For info about how to (un)subscribe, where messages are archived, etc:
http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
====================================================================