[Bioperl-l] Blast parsing question

Chris Strassel CStrassel@genomecorp.com
Thu, 25 Apr 2002 13:04:56 -0400


Got a message that this is fixed in CVS, and indeed, when I checked out the latest version the problem went away (Thanks Jason).

Chris

-----Original Message-----
From: Mick Watson [mailto:michaelwatson@paradigm-therapeutics.co.uk] 
Sent: Thursday, April 25, 2002 9:32 AM
To: Chris Strassel
Cc: bioperl-l@bioperl.org
Subject: Re: [Bioperl-l] Blast parsing question


I know it's not a long-term solution, but did you take the warning line out of the blast output and check whether the parser then works? That should at least tell us whether the warning line is the cause or whether it is something else.
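
As a rough illustration of that test, the sketch below drops every line beginning with "WARNING:" together with its indented continuation lines, and leaves the rest of the report untouched. The helper name strip_blast_warnings is hypothetical (it is not part of BioPerl), and the sketch assumes each warning block ends at a blank or unindented line, as in the report quoted below.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: return the report text with
# each "WARNING:" line and its indented continuation lines removed.
# Assumption: a warning block ends at the first blank or unindented line.
sub strip_blast_warnings {
    my ($report) = @_;
    my @kept;
    my $in_warning = 0;

    # split /^/m keeps the trailing newline on every line
    for my $line (split /^/m, $report) {
        if ($line =~ /^WARNING:/) {
            $in_warning = 1;        # start of a warning block
            next;
        }
        if ($in_warning && $line =~ /^[ \t]+\S/) {
            next;                   # indented continuation of the warning
        }
        $in_warning = 0;            # blank or unindented line ends the block
        push @kept, $line;
    }
    return join '', @kept;
}
```

The cleaned text can be written to a temporary file and handed to Bio::SearchIO->new through '-file', exactly as in the code quoted below. If the exception disappears, the warning line is the culprit.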

Mick

Chris Strassel wrote:

> Not sure if this is a known problem or not (I've been out of touch for 
> a while).
>
> I'm trying to parse the following blast output:
>
> ...                                                                    Sum
>                                                               High  Probability
> Sequences producing High-scoring Segment Pairs:              Score  P(N)      N
>
> [GI:1131572] [LN:G14809] [AC:G14809] [OR:Homo sapiens] [D...   996  7.4e-40   1
> [GI:9392631] [LN:AF257078] [AC:AF257078] [OR:Homo sapiens...   462  2.2e-15   1
> [GI:2687802] [LN:HS179N16T] [AC:AL020972] [OR:Homo sapien...   427  4.4e-14   1
> [GI:605020] [LN:HUMUT6640] [AC:L30574] [OR:Homo sapiens] ...   396  3.4e-13   1
> [GI:3168695] [LN:G38121] [AC:G38121] [OR:Homo sapiens] [D...   388  8.7e-13   1
> [GI:605053] [LN:HUMUT7241] [AC:L30409] [OR:Homo sapiens] ...   379  2.4e-12   1
> [GI:2734402] [LN:G36735] [AC:G36735] [OR:Homo sapiens] [D...   377  3.4e-12   1
> [GI:1113732] [LN:HUMSWS3328] [AC:G13119] [OR:Homo sapiens...   366  1.4e-11   1
> [GI:12025508] [LN:G67450] [AC:G67450] [OR:Homo sapiens] [...   362  1.9e-11   1
> [GI:2996755] [LN:G37104] [AC:G37104] [OR:Homo sapiens] [D...   352  6.0e-11   1
> [GI:1052377] [LN:HSB311WC5] [AC:Z67594] [OR:Homo sapiens]...   354  6.5e-11   1
> [GI:6124528] [LN:G59359] [AC:G59359] [OR:Homo sapiens] [D...   341  1.3e-10   1
> [GI:7161555] [LN:HSC60H06] [AC:AL158439] [OR:Homo sapiens...   338  2.6e-10   1
> [GI:308693] [LN:HUMUT887] [AC:L18457] [OR:Homo sapiens] [...   321  1.0e-09   1
> [GI:1526788] [LN:G28895] [AC:G28895] [OR:Homo sapiens] [D...   324  1.3e-09   1
> [GI:9794942] [LN:G66526] [AC:G66526] [OR:Homo sapiens] [D...   313  3.4e-09   1
> [GI:1396225] [LN:G27506] [AC:G27506] [OR:Homo sapiens] [D...   315  3.7e-09   1
> [GI:1347049] [LN:G24817] [AC:G24817] [OR:Homo sapiens] [D...   315  5.4e-09   1
> [GI:6124561] [LN:G59392] [AC:G59392] [OR:Homo sapiens] [D...   300  1.5e-08   1
> [GI:938428] [LN:G07878] [AC:G07878] [OR:Homo sapiens] [DE...   300  1.6e-08   1
>
> WARNING:  Descriptions of 16 database sequences were not reported due to the
>           limiting value of parameter V = 20.
>
> >[GI:1131572] [LN:G14809] [AC:G14809] [OR:Homo sapiens] [DE:SHGC-13583 Human
>             Homo sapiens STS genomic, sequence tagged site] [KW:STS]
>             [PT:Unpublished, Olivier, M., Cox, D.R. (2000)] [JO:Unpublished]
>             [DB:genabnk-sts1]
>         Length = 250
>
>   Minus Strand HSPs:
>
>  Score = 996 (155.5 bits), Expect = 7.4e-40, P = 7.4e-40
>  Identities = 200/202 (99%), Positives = 200/202 (99%), Strand = Minus / Plus
>
> Query:   202 TTGCATATGGACATACAATTGTTCTAGAATCATTTGTTGAAAAGGTTGTCCATTCTCCAC 143
>              |||||||||||||||||||| |||||||||||||| ||||||||||||||||||||||||
> Sbjct:     1 TTGCATATGGACATACAATTNTTCTAGAATCATTTNTTGAAAAGGTTGTCCATTCTCCAC 60
>
> ...
>
> Using the following code:
>     my $blast = Bio::SearchIO->new('-format' => 'blast',
>                                    '-file'   => $file);
>
>     # Now get the best hit from the blast search and check
>     # to see whether its score meets the criteria for keeping.
>     my $result = $blast->next_result;
>
>     my $hit = $result->next_hit;
>     my $name = $hit->name;
>
>     print "For file $file, the best hit is $name.\n";
>
> And I get this error:
> ------------- EXCEPTION  -------------
> MSG: no data for midline WARNING:  HSPs involving 16 database sequences were not reported due to the
> STACK Bio::SearchIO::blast::next_result /usr/home/cstrasse/src/bioperl-1.0/Bio/SearchIO/blast.pm:486
> STACK main::check_seq ../add_blast.pl:88
> STACK toplevel ../add_blast.pl:40
>
> --------------------------------------
>
> Looks like the parser doesn't like the Warning line, but for my 
> purposes the blast report is fine.  Any suggestions?
>
> Thanks,
> Chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org http://bioperl.org/mailman/listinfo/bioperl-l
