[Bioperl-l] Parse BLAST report using BPlite

Jason Stajich jason@cgt.mc.duke.edu
Thu, 24 Jan 2002 15:06:48 -0500 (EST)


There is documentation for all the methods in BPlite. Have you taken a look
at that before looking at the tutorial?  It is provided as POD, i.e.
% perldoc Bio::Tools::BPlite
% perldoc Bio::Tools::BPlite::Sbjct
% perldoc Bio::Tools::BPlite::HSP


The Pdoc version of it is at http://docs.bioperl.org/bioperl-live, which
is prettier than plain POD IMHO (note that this documentation is for the
live version of the code and not the 0.7.2 release, so some modules' APIs
may differ from what you have installed).

I agree the example code is spotty - we have put out a call for other
users to help us bring our documentation and tutorials up to speed.  The
main developers only have so much time, so it is important for other
people to step up and help out.  We'd be happy to provide you with an
account if you end up keeping a running list of problems and solutions
and would like to contribute them to the project.

As for your question: scores are associated with the HSPs, not the
subjects, in the BPlite model.  So you might do the following, if I
understand your request.  Note that the Subjects are going to be in the
order of the report, and those are typically in order of the BEST overall
hit, taking into account the HSPs for a hit, so I'm not sure why you think
the 2 best hits are not the first 2.

my @hits;
while( my $hit = $report->nextSbjct() ) {

  my @hsps;
  my $best;

  while( my $hsp = $hit->nextHSP() ) {
   push @hsps, $hsp;
   # you could sub bit score for evalue if you were interested in that
   # (then keep the highest value and sort in descending order instead)
   $best = $hsp->evalue if !defined $best || $hsp->evalue < $best;
  }
 push @hits, [ $best, $hit];
}

foreach my $pair ( sort { $a->[0] <=> $b->[0] } @hits ) {
 my ($best, $hit) = @$pair;
 # process hits in best order (lowest evalue first)
}
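
If you only want the two best, here is a rough sketch of how the sorted
list could be trimmed.  It runs on its own without BioPerl: the data is a
made-up stand-in for the @hits array of [ evalue, hit ] pairs, with a plain
name string in place of the hit object, and best_n is a helper name I
invented for illustration, not something BPlite provides.

```perl
use strict;
use warnings;

# Hypothetical stand-in data: [ best HSP evalue, hit name ] pairs.
# The names and values are invented for illustration only.
my @hits = (
    [ 1e-20, 'hitA' ],
    [ 3e-45, 'hitB' ],
    [ 0.002, 'hitC' ],
    [ 1e-30, 'hitD' ],
);

# Return the $n pairs with the smallest evalue (smaller is better).
sub best_n {
    my ($n, @pairs) = @_;
    my @sorted = sort { $a->[0] <=> $b->[0] } @pairs;
    $n = @sorted if $n > @sorted;    # do not run past the end of the list
    return @sorted[ 0 .. $n - 1 ];
}

# Print the two best hits, best first.
for my $pair ( best_n(2, @hits) ) {
    my ($evalue, $name) = @$pair;
    print "$name\t$evalue\n";
}
```

With real data you would keep the hit object in the second slot and call
its methods instead of printing a name.  If you rank on bit score rather
than evalue, flip the comparison to $b->[0] <=> $a->[0] so the highest
score comes first.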

Hopefully I understood your question well enough here?

HTH
jason

On Thu, 24 Jan 2002, Simon Chan wrote:

> Hi All,
>
> Yeah, it's me again :)
>
> Right now, I'm looking at exercise 2.6 of the Pasteur Tutorial.
> Using the BPlite parser, I am trying to parse the BLAST report I
> generated with blastall() to return the 2 best.  By default, BLAST
> lists the highest scoring hits first, however, the 2 best hits may not
> be the first 2 returned.  How do I do this (return the best 2 hits)?
> Here's what I've got:
>
>
> while (my $hit = $report->nextSbjct()) {
>    print "Name: ", $hit->name, "\n";
>
>    while (my $hsp = $hit->nextHSP()) {
>
>    print "Score ". $hsp->bits . "  Start " . $hsp->subject->start . "  End " .  $hsp->subject->end .   "\n";
>
>
>    }
> }
>
> Finally, I appreciate all the help everyone out there has given me,
> but are there other bio-perl docs out there that step-by-step explain
> the lines of code?  I'm not new to Perl, however I am new to
> object-oriented programming.  But I know the basics of oo, and
> shouldn't that be enough to make sense of the code in the Pasteur
> tutorial?!  Going a little crazy here. ;-(
>
>
> Once again, many thanks, All.

-- 
Jason Stajich
Duke University
jason@cgt.mc.duke.edu