[Bioperl-l] Blast parsing question

Jason Stajich jason@cgt.mc.duke.edu
Thu, 25 Apr 2002 08:58:18 -0400 (EDT)


This is fixed in the live CVS code but not in the release. Sorry, I wasn't
initiated into the full breadth of WU-BLAST format fun.
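
If updating from CVS is not an option, one possible stopgap is to strip the
WARNING paragraphs from the report before the parser sees it. The following
is only a minimal sketch and has not been tested against bioperl-1.0.
strip_wublast_warnings is a hypothetical helper, not part of BioPerl, and it
assumes that each warning starts with "WARNING:" in column one and that its
continuation lines are indented and end at the next blank or unindented line,
as in the report quoted below.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return a copy of a WU-BLAST
# report with every "WARNING:" paragraph removed.
sub strip_wublast_warnings {
    my ($report) = @_;
    my @kept;
    my $in_warning = 0;

    # split /^/ breaks the text into lines and keeps each newline.
    for my $line (split /^/m, $report) {
        if ($line =~ /^WARNING:/) {
            $in_warning = 1;    # first line of a warning paragraph
            next;
        }
        if ($in_warning) {
            # Assumption: indented, non-blank lines continue the warning.
            next if $line =~ /^[ \t]+\S/;
            $in_warning = 0;    # a blank or unindented line ends it
        }
        push @kept, $line;
    }
    return join '', @kept;
}
```

The cleaned text could then be written to a temporary file and passed to
Bio::SearchIO->new through '-file', exactly as in the code quoted below.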

-jason

On Thu, 25 Apr 2002, Chris Strassel wrote:

> Not sure if this is a known problem or not (I've been out of touch for a while).
>
> I'm trying to parse the following blast output:
>
> ...                                                                    Sum
>                                                               High  Probability
> Sequences producing High-scoring Segment Pairs:              Score  P(N)      N
>
> [GI:1131572] [LN:G14809] [AC:G14809] [OR:Homo sapiens] [D...   996  7.4e-40   1
> [GI:9392631] [LN:AF257078] [AC:AF257078] [OR:Homo sapiens...   462  2.2e-15   1
> [GI:2687802] [LN:HS179N16T] [AC:AL020972] [OR:Homo sapien...   427  4.4e-14   1
> [GI:605020] [LN:HUMUT6640] [AC:L30574] [OR:Homo sapiens] ...   396  3.4e-13   1
> [GI:3168695] [LN:G38121] [AC:G38121] [OR:Homo sapiens] [D...   388  8.7e-13   1
> [GI:605053] [LN:HUMUT7241] [AC:L30409] [OR:Homo sapiens] ...   379  2.4e-12   1
> [GI:2734402] [LN:G36735] [AC:G36735] [OR:Homo sapiens] [D...   377  3.4e-12   1
> [GI:1113732] [LN:HUMSWS3328] [AC:G13119] [OR:Homo sapiens...   366  1.4e-11   1
> [GI:12025508] [LN:G67450] [AC:G67450] [OR:Homo sapiens] [...   362  1.9e-11   1
> [GI:2996755] [LN:G37104] [AC:G37104] [OR:Homo sapiens] [D...   352  6.0e-11   1
> [GI:1052377] [LN:HSB311WC5] [AC:Z67594] [OR:Homo sapiens]...   354  6.5e-11   1
> [GI:6124528] [LN:G59359] [AC:G59359] [OR:Homo sapiens] [D...   341  1.3e-10   1
> [GI:7161555] [LN:HSC60H06] [AC:AL158439] [OR:Homo sapiens...   338  2.6e-10   1
> [GI:308693] [LN:HUMUT887] [AC:L18457] [OR:Homo sapiens] [...   321  1.0e-09   1
> [GI:1526788] [LN:G28895] [AC:G28895] [OR:Homo sapiens] [D...   324  1.3e-09   1
> [GI:9794942] [LN:G66526] [AC:G66526] [OR:Homo sapiens] [D...   313  3.4e-09   1
> [GI:1396225] [LN:G27506] [AC:G27506] [OR:Homo sapiens] [D...   315  3.7e-09   1
> [GI:1347049] [LN:G24817] [AC:G24817] [OR:Homo sapiens] [D...   315  5.4e-09   1
> [GI:6124561] [LN:G59392] [AC:G59392] [OR:Homo sapiens] [D...   300  1.5e-08   1
> [GI:938428] [LN:G07878] [AC:G07878] [OR:Homo sapiens] [DE...   300  1.6e-08   1
>
>
> WARNING:  Descriptions of 16 database sequences were not reported due to the
>           limiting value of parameter V = 20.
>
>
>
> >[GI:1131572] [LN:G14809] [AC:G14809] [OR:Homo sapiens] [DE:SHGC-13583 Human
>             Homo sapiens STS genomic, sequence tagged site] [KW:STS]
>             [PT:Unpublished, Olivier, M., Cox, D.R. (2000)] [JO:Unpublished]
>             [DB:genabnk-sts1]
>         Length = 250
>
>   Minus Strand HSPs:
>
>  Score = 996 (155.5 bits), Expect = 7.4e-40, P = 7.4e-40
>  Identities = 200/202 (99%), Positives = 200/202 (99%), Strand = Minus / Plus
>
> Query:   202 TTGCATATGGACATACAATTGTTCTAGAATCATTTGTTGAAAAGGTTGTCCATTCTCCAC 143
>              |||||||||||||||||||| |||||||||||||| ||||||||||||||||||||||||
> Sbjct:     1 TTGCATATGGACATACAATTNTTCTAGAATCATTTNTTGAAAAGGTTGTCCATTCTCCAC 60
>
> ...
>
> Using the following code:
>     my $blast = Bio::SearchIO->new('-format' => 'blast',
>                                    '-file'   => $file);
>
>     # Now get the best hit from the blast search and check
>     # to see whether its score meets the criteria for keeping.
>     my $result = $blast->next_result;
>
>     my $hit = $result->next_hit;
>     my $name = $hit->name;
>
>     print "For file $file, the best hit is $name.\n";
>
> And I get this error:
> ------------- EXCEPTION  -------------
> MSG: no data for midline WARNING:  HSPs involving 16 database sequences were not reported due to the
> STACK Bio::SearchIO::blast::next_result /usr/home/cstrasse/src/bioperl-1.0/Bio/SearchIO/blast.pm:486
> STACK main::check_seq ../add_blast.pl:88
> STACK toplevel ../add_blast.pl:40
>
> --------------------------------------
>
> Looks like the parser doesn't like the Warning line, but for my purposes the blast report is fine.  Any suggestions?
>
> Thanks,
> Chris

-- 
Jason Stajich
Duke University
jason@cgt.mc.duke.edu