[Bioperl-l] parsing only the summary part of a blast report

Zhao, David [PRI] DZhao1@prius.jnj.com
Thu, 12 Oct 2000 12:58:37 -0400


Does your version of BPlite parse HTML-formatted blast reports?

> -----Original Message-----
> From:	Ian Korf [SMTP:ikorf@sapiens.wustl.edu]
> Sent:	Wednesday, October 11, 2000 8:40 AM
> To:	Bioperl
> Subject:	Re: [Bioperl-l] parsing only the summary part of a blast report
> 
> The latest version of BPlite also reads concatenated blast reports. So you
> can do the following:
> 
> # FILEHANDLE must already be open on the concatenated reports.
> my $multi = BPlite::Multi->new(\*FILEHANDLE);
> while (my $blast = $multi->nextBlast) {
>     while (my $sbjct = $blast->nextSbjct) {
>         print "$sbjct\n";
>     }
> }
> 
> Unfortunately, BPlite has two development branches, the bioperl one and my
> own (sorry about that). But the new code should be trivial to migrate.
> 
> -Ian
> 
> On Wed, 11 Oct 2000, Jason Stajich wrote:
> 
> > See the perldoc for Bio::Tools::BPlite for the complete API.
> > 
> > This script will parse and print all of the hits in your report.
> > 
> > #!/usr/local/bin/perl -w
> > 
> > use strict;
> > use Bio::Tools::BPlite;
> > 
> > # Read a BLAST report from standard input.
> > my $report = Bio::Tools::BPlite->new(-fh => \*STDIN);
> > print $report->query(), ",", $report->database(), "\n";
> > # Iterate over the hits (subjects) and print each name.
> > while ( my $sbjct = $report->nextSbjct ) {
> >     print "name is ", $sbjct->name(), "\n";
> > }
> > 
> > Alternatively you can use Bio::Tools::Blast, but you will find it slower
> > and more memory intensive.  It does support more features, so it depends
> > on what your needs are.
> > 
> > On Wed, 11 Oct 2000, Hilmar Lapp wrote:
> > 
> > > > "Zhao, David [PRI]" wrote:
> > > > 
> > > > Hi there,
> > > > How can I parse the summary part of a blast report using bioperl
> > > > modules? such as:
> > > > 
> > > 
> > > I'm almost sure you can't using Blast.pm. Maybe you can with BPlite
> > > (development trunk only). I know that Blast.pm takes a significant time
> > > to parse long reports (i.e., with hundreds of alignments), but we
> > > haven't checked yet whether BPlite is significantly faster in such
> > > cases. I guess you're asking because you are concerned about the time
> > > lost in parsing the alignments when you need only the summary data,
> > > which are already present in the hit list.
> > > 
> > > BTW you can pass a significance threshold to Blast.pm, and although I'm
> > > not sure, I think it won't parse those alignments beyond the
> > > significance threshold.
> > 
> > [Regarding Bio::Tools::Blast]
> > 
> > I'm pretty sure the significance threshold only applies when you are
> > running Blast, not when parsing a report.  In fact you should not put a
> > signif=>$value in your parameter hash if you are just parsing a report
> > file.
> > 
> > > 
> > > 	Hilmar
> > > -- 
> > > -----------------------------------------------------------------
> > > Hilmar Lapp                                email: hlapp@gmx.net
> > > NFI Vienna, IFD/Bioinformatics             phone: +43 1 86634 631
> > > A-1235 Vienna                                fax: +43 1 86634 727
> > > -----------------------------------------------------------------
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l@bioperl.org
> > > http://bioperl.org/mailman/listinfo/bioperl-l
> > > 
> > 
> > Jason Stajich
> > jason@chg.mc.duke.edu
> > http://galton.mc.duke.edu/~jason/
> > (919)684-1806 (office) 
> > (919)684-2275 (fax) 
> > Center for Human Genetics - Duke University Medical Center
> > http://wwwchg.mc.duke.edu/ 
> > 
