Bioperl: Yet another question about parsing blast results.
Heil, Jeremy
Jeremy.Heil@celera.com
Fri, 17 Dec 1999 15:25:25 -0500
I have experienced the same problems under the same conditions you state. I
*think* the following kludge fixes it; let me know if it works for you...
Around line 1900 of Blast.pm, in _parse_header:
...
$data =~ /WARNING: (.+?)$Newline$Newline/so and $self->warn("$1") if $self->strict;
$data =~ /FATAL: (.+?)$Newline$Newline/so and $self->throw("FATAL BLAST ERROR = $1");
# No longer throwing exception when no hits were found. Still reporting it.
$data =~ /No hits? found/i and $self->warn("No hits were found.") if $self->strict;
#****FIX : Problem : exception thrown if last match is a 'no hitter'
if ( $data =~ /No hits found/ && not ( $data =~ /Sequences producing significant/ ) ) {
    return 0;
}
#END FIX
# If this is the first Blast, the program, version, and database info
# pertain to it. Otherwise, they are for the previous report and have
# already been parsed out.
# Data is stored in the static Blast object. Data for subsequent reports
..
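To try the guard outside Blast.pm, here is a minimal standalone sketch. The
helper name is_no_hit_report and the sample report strings are made up for
illustration and are not part of Bioperl; only the two patterns are taken
from the fix above. It assumes each report chunk arrives as a single string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Blast.pm): returns 1 when a report chunk
# says "No hits found" and has no "Sequences producing significant" section,
# which is the case where the fix above bails out with `return 0`.
sub is_no_hit_report {
    my ($data) = @_;
    return ( $data =~ /No hits found/
          && $data !~ /Sequences producing significant/ ) ? 1 : 0;
}

# Made-up report fragments, for illustration only.
my $hit    = "Query= seq1\n\nSequences producing significant alignments:\n";
my $no_hit = "Query= seq2\n\n ***** No hits found ******\n";

# Decide per chunk whether the header parser should skip it.
for my $chunk ( $hit, $no_hit ) {
    print is_no_hit_report($chunk) ? "skip\n" : "parse\n";
}
```

A chunk that contains both strings is still treated as a normal report, so
only a pure 'no hit' report is skipped.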
Take Care,
Jeremy Heil
SNP Discovery
Celera Genomics
>Oh, and there's one more thing.
>The program has no trouble skipping over 'no hit' reports at the beginning or
>middle. The problem is if the last report(s) in the file are 'no hit' reports.
>That's when it croaks and throws an exception, and also ends up missing the
>previous 'hit' that was successful.
>
>
>carl
=========== Bioperl Project Mailing List Message Footer =======
Project URL: http://bio.perl.org/
For info about how to (un)subscribe, where messages are archived, etc:
http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
====================================================================