[Bioperl-l] Blast parsing question
Mick Watson
michaelwatson@paradigm-therapeutics.co.uk
Thu, 25 Apr 2002 14:32:22 +0100
I know it's not a long-term solution, but did you take the warning line out of the blast output and check whether the
parser then works?
That should at least tell us whether the problem is the warning line or something else.
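If editing the report by hand is a nuisance, something along these lines might do for the test. It is only a rough sketch: strip_blast_warnings is a made-up helper, not part of BioPerl, and I am assuming each warning runs from the "WARNING:" line up to the next blank line.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): drop "WARNING:" paragraphs
# from a BLAST report held in a string. Assumption: a warning starts at a
# line beginning with "WARNING:" and continues up to the next blank line.
sub strip_blast_warnings {
    my ($report) = @_;
    my @kept;
    my $in_warning = 0;
    for my $line (split /^/m, $report) {      # split into lines, keeping newlines
        if ($line =~ /^\s*WARNING:/) {
            $in_warning = 1;                  # skip the WARNING line itself
            next;
        }
        if ($in_warning) {
            next unless $line =~ /^\s*$/;     # skip continuation lines
            $in_warning = 0;                  # blank line ends the warning; keep it
        }
        push @kept, $line;
    }
    return join '', @kept;
}

# Usage sketch: perl strip_warnings.pl report.blast > report.clean.blast
if (@ARGV) {
    local $/;                                 # slurp the whole file at once
    my $report = <>;
    print strip_blast_warnings($report);
}
```

The cleaned file can then be handed to Bio::SearchIO->new exactly as in the code below. If the parser still throws, the warning line is not the culprit.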
Mick
Chris Strassel wrote:
> Not sure if this is a known problem or not (I've been out of touch for a while).
>
> I'm trying to parse the following blast output:
>
> ... Sum
> High Probability
> Sequences producing High-scoring Segment Pairs: Score P(N) N
>
> [GI:1131572] [LN:G14809] [AC:G14809] [OR:Homo sapiens] [D... 996 7.4e-40 1
> [GI:9392631] [LN:AF257078] [AC:AF257078] [OR:Homo sapiens... 462 2.2e-15 1
> [GI:2687802] [LN:HS179N16T] [AC:AL020972] [OR:Homo sapien... 427 4.4e-14 1
> [GI:605020] [LN:HUMUT6640] [AC:L30574] [OR:Homo sapiens] ... 396 3.4e-13 1
> [GI:3168695] [LN:G38121] [AC:G38121] [OR:Homo sapiens] [D... 388 8.7e-13 1
> [GI:605053] [LN:HUMUT7241] [AC:L30409] [OR:Homo sapiens] ... 379 2.4e-12 1
> [GI:2734402] [LN:G36735] [AC:G36735] [OR:Homo sapiens] [D... 377 3.4e-12 1
> [GI:1113732] [LN:HUMSWS3328] [AC:G13119] [OR:Homo sapiens... 366 1.4e-11 1
> [GI:12025508] [LN:G67450] [AC:G67450] [OR:Homo sapiens] [... 362 1.9e-11 1
> [GI:2996755] [LN:G37104] [AC:G37104] [OR:Homo sapiens] [D... 352 6.0e-11 1
> [GI:1052377] [LN:HSB311WC5] [AC:Z67594] [OR:Homo sapiens]... 354 6.5e-11 1
> [GI:6124528] [LN:G59359] [AC:G59359] [OR:Homo sapiens] [D... 341 1.3e-10 1
> [GI:7161555] [LN:HSC60H06] [AC:AL158439] [OR:Homo sapiens... 338 2.6e-10 1
> [GI:308693] [LN:HUMUT887] [AC:L18457] [OR:Homo sapiens] [... 321 1.0e-09 1
> [GI:1526788] [LN:G28895] [AC:G28895] [OR:Homo sapiens] [D... 324 1.3e-09 1
> [GI:9794942] [LN:G66526] [AC:G66526] [OR:Homo sapiens] [D... 313 3.4e-09 1
> [GI:1396225] [LN:G27506] [AC:G27506] [OR:Homo sapiens] [D... 315 3.7e-09 1
> [GI:1347049] [LN:G24817] [AC:G24817] [OR:Homo sapiens] [D... 315 5.4e-09 1
> [GI:6124561] [LN:G59392] [AC:G59392] [OR:Homo sapiens] [D... 300 1.5e-08 1
> [GI:938428] [LN:G07878] [AC:G07878] [OR:Homo sapiens] [DE... 300 1.6e-08 1
>
> WARNING: Descriptions of 16 database sequences were not reported due to the
> limiting value of parameter V = 20.
>
> >[GI:1131572] [LN:G14809] [AC:G14809] [OR:Homo sapiens] [DE:SHGC-13583 Human
> Homo sapiens STS genomic, sequence tagged site] [KW:STS]
> [PT:Unpublished, Olivier, M., Cox, D.R. (2000)] [JO:Unpublished]
> [DB:genabnk-sts1]
> Length = 250
>
> Minus Strand HSPs:
>
> Score = 996 (155.5 bits), Expect = 7.4e-40, P = 7.4e-40
> Identities = 200/202 (99%), Positives = 200/202 (99%), Strand = Minus / Plus
>
> Query: 202 TTGCATATGGACATACAATTGTTCTAGAATCATTTGTTGAAAAGGTTGTCCATTCTCCAC 143
> |||||||||||||||||||| |||||||||||||| ||||||||||||||||||||||||
> Sbjct: 1 TTGCATATGGACATACAATTNTTCTAGAATCATTTNTTGAAAAGGTTGTCCATTCTCCAC 60
>
> ...
>
> Using the following code:
> my $blast = Bio::SearchIO->new('-format' => 'blast',
> '-file' => $file);
>
> # Now get the best hit from the blast search and check
> # to see whether its score meets the criteria for keeping.
> my $result = $blast->next_result;
>
> my $hit = $result->next_hit;
> my $name = $hit->name;
>
> print "For file $file, the best hit is $name.\n";
>
> And I get this error:
> ------------- EXCEPTION -------------
> MSG: no data for midline WARNING: HSPs involving 16 database sequences were not reported due to the
> STACK Bio::SearchIO::blast::next_result /usr/home/cstrasse/src/bioperl-1.0/Bio/SearchIO/blast.pm:486
> STACK main::check_seq ../add_blast.pl:88
> STACK toplevel ../add_blast.pl:40
>
> --------------------------------------
>
> Looks like the parser doesn't like the Warning line, but for my purposes the blast report is fine. Any suggestions?
>
> Thanks,
> Chris