[Bioperl-l] parsing only the summary part of a blast report
Zhao, David [PRI]
DZhao1@prius.jnj.com
Thu, 12 Oct 2000 12:58:37 -0400
Does your version of BPlite parse HTML-formatted BLAST reports?
> -----Original Message-----
> From: Ian Korf [SMTP:ikorf@sapiens.wustl.edu]
> Sent: Wednesday, October 11, 2000 8:40 AM
> To: Bioperl
> Subject: Re: [Bioperl-l] parsing only the summary part of a blast
> report
>
> The latest version of BPlite also reads concatenated blast reports. So you
> can do the following:
>
> my $multi = new BPlite::Multi(\*FILEHANDLE);
> while (my $blast = $multi->nextBlast) {
>     while (my $sbjct = $blast->nextSbjct) {
>         print "$sbjct\n";
>     }
> }
>
> Unfortunately, BPlite has two development branches, the bioperl one and my
> own (sorry about that). But the new code should be trivial to migrate.
>
> -Ian
>
> On Wed, 11 Oct 2000, Jason Stajich wrote:
>
> > See the perldoc for Bio::Tools::BPlite for the complete API.
> >
> > This script will parse and print all of the hits in your report.
> >
> > #!/usr/local/bin/perl -w
> >
> > use Bio::Tools::BPlite;
> >
> > my $report = new Bio::Tools::BPlite(-fh=>\*STDIN);
> > print $report->query(), ",", $report->database(), "\n";
> > while ( my $sbjct = $report->nextSbjct ) {
> >     print "name is ", $sbjct->name(), "\n";
> > }
> >
> > Alternatively you can use Bio::Tools::Blast, but you will find it slower
> > and more memory intensive. It does support more features, so it depends
> > on what your needs are.
> >
> > On Wed, 11 Oct 2000, Hilmar Lapp wrote:
> >
> > > > "Zhao, David [PRI]" wrote:
> > > >
> > > > Hi there,
> > > > How can I parse the summary part of a blast report using bioperl
> > > > modules? such as:
> > > >
> > >
> > > I'm almost sure you can't using Blast.pm. Maybe you can with BPlite
> > > (development trunk only). I know that Blast.pm takes a significant time
> > > to parse long reports (i.e., with hundreds of alignments), but we haven't
> > > checked yet whether BPlite is significantly faster in such cases. I guess
> > > you're asking because you are bothered by the time lost in parsing the
> > > alignments when you need only the summary data, which are already
> > > present in the hit list.
> > >
> > > BTW you can pass a significance threshold to Blast.pm, and although I'm
> > > not sure, I think it won't parse those alignments beyond the significance
> > > threshold.
> >
> > [Regarding Bio::Tools::Blast]
> >
> > I'm pretty sure the significance threshold only applies when you are
> > running Blast, not when parsing a report. In fact you should not put a
> > signif=>$value in your parameter hash if you are just parsing a report
> > file.
> >
> > >
> > > Hilmar
> > > --
> > > -----------------------------------------------------------------
> > > Hilmar Lapp email: hlapp@gmx.net
> > > NFI Vienna, IFD/Bioinformatics phone: +43 1 86634 631
> > > A-1235 Vienna fax: +43 1 86634 727
> > > -----------------------------------------------------------------
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l@bioperl.org
> > > http://bioperl.org/mailman/listinfo/bioperl-l
> > >
> >
> > Jason Stajich
> > jason@chg.mc.duke.edu
> > http://galton.mc.duke.edu/~jason/
> > (919)684-1806 (office)
> > (919)684-2275 (fax)
> > Center for Human Genetics - Duke University Medical Center
> > http://wwwchg.mc.duke.edu/
> >
>
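
For the original question quoted above (reading only the summary hit list and skipping the alignments), a minimal plain-Perl sketch that does not depend on BPlite or Blast.pm might look like the following. It assumes NCBI-style plain-text output, where a header line starting with "Sequences producing significant alignments" is followed by one line per hit ending in a bit score and an E-value; WU-BLAST or HTML reports would need different patterns. The sub name parse_summary and the sample report are made up for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect the one-line hit summaries from an NCBI-style BLAST text report.
# Reading stops at the first alignment (a line starting with ">"), so the
# alignment section is never parsed.
sub parse_summary {
    my ($fh) = @_;
    my @hits;
    my $in_summary = 0;
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^Sequences producing significant alignments/) {
            $in_summary = 1;
            next;
        }
        next unless $in_summary;
        last if $line =~ /^>/;       # first alignment: the summary is over
        next if $line =~ /^\s*$/;    # skip blank lines
        # hit name, free-text description, bit score, E-value at end of line
        if ($line =~ /^(\S+)\s+(.*?)\s+(\d+(?:\.\d+)?)\s+(\S+)\s*$/) {
            my ($name, $desc, $score, $evalue) = ($1, $2, $3, $4);
            # older BLAST prints "e-100" rather than "1e-100"
            $evalue = "1$evalue" if $evalue =~ /^e/i;
            push @hits, { name => $name, desc => $desc,
                          score => $score, evalue => $evalue };
        }
    }
    return @hits;
}

# Made-up sample report, only to exercise the sub.
my $report = <<'END';
BLASTP 2.0.14

Sequences producing significant alignments:                      (bits)  Value

sp|P00001|EXAMPLE_ONE  hypothetical protein one                    250   e-100
sp|P00002|EXAMPLE_TWO  hypothetical protein two                     42   0.003

>sp|P00001|EXAMPLE_ONE hypothetical protein one
          Length = 120
END

# Read the sample through an in-memory filehandle.
open my $fh, '<', \$report or die "cannot open in-memory report: $!";
for my $hit (parse_summary($fh)) {
    print join("\t", @{$hit}{qw(name score evalue)}), "\n";
}
```

To run it on a real report, pass \*STDIN (or any opened filehandle) to parse_summary instead of the in-memory one.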