[Bioperl-l] blast parsing for multiple blast output in one file

Jason Stajich jason@cgt.mc.duke.edu
Fri, 22 Feb 2002 13:23:35 -0500 (EST)


Use Bio::Tools::BPlite or Bio::SearchIO; both can read multiple reports from a single file.

To use Bio::Tools::BPlite with multiple reports, do:
        use Bio::Tools::BPlite;
        my $report = Bio::Tools::BPlite->new(-file=>$filename);

         {
           $report->query;
           $report->database;
           while(my $sbjct = $report->nextSbjct) {
               $sbjct->name;
               while (my $hsp = $sbjct->nextHSP) {
                   $hsp->score;
                   $hsp->bits;
                   $hsp->percent;
                   $hsp->P;
                   $hsp->match;
                   $hsp->positive;
                   $hsp->length;
                   $hsp->querySeq;
                   $hsp->sbjctSeq;
                   $hsp->homologySeq;
                   $hsp->query->start;
                   $hsp->query->end;
                   $hsp->hit->start;
                   $hsp->hit->end;
                   $hsp->hit->seqname;
                   $hsp->hit->overlaps($exon);
               }
           }

           # the following line takes you to the next report in the
           # stream/file
           # it will return 0 if that report is empty,
           # but that is valid for an empty blast report.
           # Returns -1 for EOF.

           last if ($report->_parseHeader == -1);
           redo;
         }

To use Bio::SearchIO with multiple reports, do:
use Bio::SearchIO;

my $stream = Bio::SearchIO->new(-format => 'blast',
                                -file   => $filename);
 # iterate through all the reports in a single file
while( my $result = $stream->next_result ) {
 # iterate through all the hits in a result
 while( my $hit = $result->next_hit ) {
  # see Bio::Search::Hit::HitI for available methods
  # iterate through all the hsps for a hit
  while( my $hsp = $hit->next_hsp ) {
   # see Bio::Search::HSP::HSPI for available methods
  }
 }
}
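
As a rough illustration of what the loop bodies above might contain, here
is a minimal sketch that prints one tab-delimited line per HSP. It is not
part of the original recipe: taking the report path from the command line
is an assumption, and the accessor names (query_name, name, evalue, bits,
start, end) are those of the Result/Hit/HSP interfaces in current BioPerl
releases, so check the HitI and HSPI documentation for your installed
version before relying on them.

```perl
use strict;
use warnings;
use Bio::SearchIO;

# Assumption: the multi-report BLAST file is passed as the first argument.
my $filename = shift @ARGV or die "usage: $0 blast_report\n";

my $stream = Bio::SearchIO->new(-format => 'blast',
                                -file   => $filename);

# One pass over every report, hit and HSP in the file.
while ( my $result = $stream->next_result ) {
    while ( my $hit = $result->next_hit ) {
        while ( my $hsp = $hit->next_hsp ) {
            # query name, hit name, e-value, bit score,
            # then query and hit coordinates of the alignment
            print join("\t",
                $result->query_name,
                $hit->name,
                $hsp->evalue,
                $hsp->bits,
                $hsp->start('query'), $hsp->end('query'),
                $hsp->start('hit'),   $hsp->end('hit'),
            ), "\n";
        }
    }
}
```

Reports with no hits simply produce no output lines, since the inner
loops never run for them.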

-jason

On Fri, 22 Feb 2002, Yuandan Zhang wrote:

> Hi,
>
> I have multiple BLAST outputs stored in one file. This file was generated by an NCBI BLAST batch run. I tried to parse it using Bio::Tools::Blast. However, this module parses one BLAST result from one file or multiple BLAST results from multiple files. I am reluctant to split the results into separate files, one BLAST output per file, because this would generate a few thousand files. Any advice on parsing multiple BLAST outputs stored in one file?
>
> Thanks,
>
> Yuandan
>
> --
> Yuandan Zhang, Ph.D.
> Animal Science, Iowa State University
> 2255 Kildee Hall, Ames IA 50011-3150 USA
> E-mail: ydzhang@iastate.edu
> Phone:  (515)  294 6114 (office)
> Fax:    (515)  294 2401
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   System Support for:
>   ANGENMAP Maillist         angenmap@db.genome.iastate.edu
>   U.S. Pig Genome Project   http://www.genome.iastate.edu/
>   Pig EST Project           http://pigest.genome.iastate.edu
>
>          .***.   .***.           .***.   .***.           .***.
>        * | | | * | | | *       * | | | * | | | *       * | | |
>        * | | | *   * | | | *   * | | | *   * | | | *   * | | | *
>      * | | | *       * | | | * | | | *       * | | | * | | | *
>        '***'           '***'   '***'           '***'   '***'
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-- 
Jason Stajich
Duke University
jason@cgt.mc.duke.edu