[Bioperl-l] Blast parsing question

Chris Strassel CStrassel@genomecorp.com
Thu, 25 Apr 2002 08:40:27 -0400


I'm not sure whether this is a known problem (I've been out of touch for a while).

I'm trying to parse the following blast output:

...                                                                    Sum
                                                              High  Probability
Sequences producing High-scoring Segment Pairs:              Score  P(N)      N

[GI:1131572] [LN:G14809] [AC:G14809] [OR:Homo sapiens] [D...   996  7.4e-40   1
[GI:9392631] [LN:AF257078] [AC:AF257078] [OR:Homo sapiens...   462  2.2e-15   1
[GI:2687802] [LN:HS179N16T] [AC:AL020972] [OR:Homo sapien...   427  4.4e-14   1
[GI:605020] [LN:HUMUT6640] [AC:L30574] [OR:Homo sapiens] ...   396  3.4e-13   1
[GI:3168695] [LN:G38121] [AC:G38121] [OR:Homo sapiens] [D...   388  8.7e-13   1
[GI:605053] [LN:HUMUT7241] [AC:L30409] [OR:Homo sapiens] ...   379  2.4e-12   1
[GI:2734402] [LN:G36735] [AC:G36735] [OR:Homo sapiens] [D...   377  3.4e-12   1
[GI:1113732] [LN:HUMSWS3328] [AC:G13119] [OR:Homo sapiens...   366  1.4e-11   1
[GI:12025508] [LN:G67450] [AC:G67450] [OR:Homo sapiens] [...   362  1.9e-11   1
[GI:2996755] [LN:G37104] [AC:G37104] [OR:Homo sapiens] [D...   352  6.0e-11   1
[GI:1052377] [LN:HSB311WC5] [AC:Z67594] [OR:Homo sapiens]...   354  6.5e-11   1
[GI:6124528] [LN:G59359] [AC:G59359] [OR:Homo sapiens] [D...   341  1.3e-10   1
[GI:7161555] [LN:HSC60H06] [AC:AL158439] [OR:Homo sapiens...   338  2.6e-10   1
[GI:308693] [LN:HUMUT887] [AC:L18457] [OR:Homo sapiens] [...   321  1.0e-09   1
[GI:1526788] [LN:G28895] [AC:G28895] [OR:Homo sapiens] [D...   324  1.3e-09   1
[GI:9794942] [LN:G66526] [AC:G66526] [OR:Homo sapiens] [D...   313  3.4e-09   1
[GI:1396225] [LN:G27506] [AC:G27506] [OR:Homo sapiens] [D...   315  3.7e-09   1
[GI:1347049] [LN:G24817] [AC:G24817] [OR:Homo sapiens] [D...   315  5.4e-09   1
[GI:6124561] [LN:G59392] [AC:G59392] [OR:Homo sapiens] [D...   300  1.5e-08   1
[GI:938428] [LN:G07878] [AC:G07878] [OR:Homo sapiens] [DE...   300  1.6e-08   1


WARNING:  Descriptions of 16 database sequences were not reported due to the
          limiting value of parameter V = 20.



>[GI:1131572] [LN:G14809] [AC:G14809] [OR:Homo sapiens] [DE:SHGC-13583 Human
            Homo sapiens STS genomic, sequence tagged site] [KW:STS]
            [PT:Unpublished, Olivier, M., Cox, D.R. (2000)] [JO:Unpublished]
            [DB:genabnk-sts1]
        Length = 250

  Minus Strand HSPs:

 Score = 996 (155.5 bits), Expect = 7.4e-40, P = 7.4e-40
 Identities = 200/202 (99%), Positives = 200/202 (99%), Strand = Minus / Plus

Query:   202 TTGCATATGGACATACAATTGTTCTAGAATCATTTGTTGAAAAGGTTGTCCATTCTCCAC 143
             |||||||||||||||||||| |||||||||||||| ||||||||||||||||||||||||
Sbjct:     1 TTGCATATGGACATACAATTNTTCTAGAATCATTTNTTGAAAAGGTTGTCCATTCTCCAC 60

...

Using the following code:
    my $blast = Bio::SearchIO->new('-format' => 'blast',
                                   '-file'   => $file);

    # Now get the best hit from the blast search and check
    # to see whether its score meets the criteria for keeping.
    my $result = $blast->next_result;

    my $hit = $result->next_hit;
    my $name = $hit->name;

    print "For file $file, the best hit is $name.\n";

And I get this error:
------------- EXCEPTION  -------------
MSG: no data for midline WARNING:  HSPs involving 16 database sequences were not reported due to the
STACK Bio::SearchIO::blast::next_result /usr/home/cstrasse/src/bioperl-1.0/Bio/SearchIO/blast.pm:486
STACK main::check_seq ../add_blast.pl:88
STACK toplevel ../add_blast.pl:40

--------------------------------------

It looks like the parser chokes on the WARNING line, although the blast report itself is fine for my purposes.  Any suggestions?
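
One possible stopgap, until the parser itself handles these lines, might be to strip the WARNING blocks from the report text before parsing it. The following is only a rough sketch: strip_blast_warnings is a hypothetical helper (not part of BioPerl), and it assumes that each warning starts with "WARNING:" at the beginning of a line and continues only on indented, non-blank lines directly below it.

```perl
use strict;
use warnings;

# Hypothetical helper, not a BioPerl function.
# Removes WARNING blocks from the text of a BLAST report.
# Assumption: a block is one line starting with "WARNING:" followed by
# zero or more indented, non-blank continuation lines.
sub strip_blast_warnings {
    my ($report) = @_;
    my @kept;
    my $in_warning = 0;

    # Splitting on /^/ keeps the line terminators on each line.
    for my $line (split /^/m, $report) {
        if ($line =~ /^WARNING:/) {
            $in_warning = 1;            # start of a warning block
            next;
        }
        if ($in_warning) {
            next if $line =~ /^\s+\S/;  # indented continuation line
            $in_warning = 0;            # blank or unindented line ends the block
        }
        push @kept, $line;
    }
    return join '', @kept;
}
```

The cleaned text could then be written to a temporary file and handed to Bio::SearchIO->new through '-file' as before. Whether the rest of the report then parses cleanly is untested here.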

Thanks,
Chris