[Bioperl-l] Bug in Bio::Tools::BPlite::HSP.pm ?

Jason Eric Stajich jason@cgt.mc.duke.edu
Wed, 5 Sep 2001 10:39:17 -0400 (EDT)


This is because your report has the following in the scoreline:
Expect(2) = e-32
rather than
Expect = e-32

I have patched this on bioperl-live (CVS head).
The patch is also below.  Thanks for reporting it.  If you cannot patch the
code yourself, you can either
a) do a cvs checkout of the bioperl code and put that dir in your PERL5LIB, or
b) run sed 's/Expect(2)/Expect/' REPORTNAME | perl blastscript.pl
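
Note that the sed command in b) only rewrites "Expect(2)"; a report may
also carry other counts, such as "Expect(3)". A minimal sketch of a
filter that strips any count is shown here. The script name
strip_expect.pl and the helper strip_expect_count are made up for
illustration and are not part of bioperl.

```perl
#!/usr/bin/perl
# strip_expect.pl (hypothetical name, not part of bioperl)
use strict;
use warnings;

# Rewrite "Expect(N)" to plain "Expect" so that an unpatched parser,
# which only looks for "Expect =", can still find the value.
sub strip_expect_count {
    my ($line) = @_;
    $line =~ s/Expect\(\d+\)/Expect/;
    return $line;
}

# Filter the report files named on the command line, if any.
if (@ARGV) {
    while (my $line = <>) {
        print strip_expect_count($line);
    }
}
```

Assuming the names above, it would be used in place of the sed call:
perl strip_expect.pl REPORTNAME | perl blastscript.pl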

HTH
-jason

--begin
Index: Bio/Tools/BPlite/Sbjct.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/Tools/BPlite/Sbjct.pm,v
retrieving revision 1.17
retrieving revision 1.18
diff -r1.17 -r1.18
1c1
< # $Id: Sbjct.pm,v 1.17 2001/08/25 12:11:11 lapp Exp $
---
> # $Id: Sbjct.pm,v 1.18 2001/09/05 14:43:12 jason Exp $
136,137c136
<   if (not defined $p) {($p) = $scoreline =~ /Expect =\s+(\S+)/}
<
---
>   if (not defined $p) {(undef, $p) = $scoreline =~ /Expect(\(\d+\))? =\s+(\S+)/}
--end
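
The effect of the one-line change can be checked outside of BPlite with
a small standalone script. This is only a sketch: the two regexes are
copied from the diff, but the helper names expect_old and expect_new are
made up, and the sample scorelines are illustrative (in particular the
raw scores in parentheses are invented).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Regex from revision 1.17: requires "Expect =" with nothing in between.
sub expect_old {
    my ($scoreline) = @_;
    my ($p) = $scoreline =~ /Expect =\s+(\S+)/;
    return $p;    # undef when the pattern does not match
}

# Regex from revision 1.18: allows an optional "(N)" after "Expect".
# The first capture is that optional count, so it is discarded.
sub expect_new {
    my ($scoreline) = @_;
    my (undef, $p) = $scoreline =~ /Expect(\(\d+\))? =\s+(\S+)/;
    return $p;
}

# Illustrative scorelines, not taken from a real report.
my @samples = (
    ' Score = 290 bits (745), Expect(2) = e-32',
    ' Score = 58.2 bits (138), Expect = 7e-10',
);

for my $line (@samples) {
    my $old = expect_old($line);
    my $new = expect_new($line);
    printf "%-45s old=%s new=%s\n", $line,
        defined $old ? $old : 'undef',
        defined $new ? $new : 'undef';
}
```

For the "Expect(2)" line the old pattern finds nothing while the new one
returns e-32; for the plain "Expect" line both return 7e-10.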

On Wed, 5 Sep 2001, Leonardo Marino-Ramirez wrote:

> Hello,
>
> I am trying to get e-values from blast reports using the
> Bio::Tools::BPlite module. So far I am able to get pretty much all the
> attributes from the HSP object.
>
> I am using a standard script to get e-values:
>
> #!/usr/bin/perl
>
> use Bio::Tools::BPlite;
>
> my $report = new Bio::Tools::BPlite(-fh=>\*STDIN);
>
> my $query = $report->query;
> @tmp = split ' ', $query; $qn = $tmp[0];
> my $database = $report->database;
>
>
> while(my $sbjct = $report->nextSbjct) {
>     my $blast_hit = $sbjct->name;
>     @tmp = split /\|/, $blast_hit; $gi = $tmp[1];
>     print "\ngi is $gi\nquery is $qn\n";
>     while(my $hsp = $sbjct->nextHSP) {
>         $sc = $hsp->bits; print "score is $sc\n";
>         $ev = $hsp->P; print "e-value is $ev\n";
>    }
> }
>
> The problem is that when I am reading a report that has 3 different
> e-values, the only one that is parsed correctly is the third one.
>
> My output of the script above looks like this:
>
> gi is 1787829
> query is EBO9901A09.Seq
> score is 290
> e-value is 1
> score is 29.6
> e-value is 1
>
> gi is 1787636
> query is EBO9901A09.Seq
> score is 123
> e-value is 1
> score is 29.6
> e-value is 1
>
> gi is 1787403
> query is EBO9901A09.Seq
> score is 58.2
> e-value is 7e-10
>
>
> What is the problem?
>
> uname -a
> Linux tofu.tamu.edu 2.2.12-20smp #1 SMP Mon Sep 27 10:34:45 EDT 1999 i686 unknown
>
>
> Thanks, Leonardo
>

-- 
Jason Stajich
Duke University
jason.stajich@duke.edu