[Bioperl-l] parsing tblastn using BPlite

Christoffels Alan <calan@mcbsgs1.imcb.nus.edu.sg>
Wed, 24 Oct 2001 12:19:10 +0800 (PST)


Hi,

I am using BPlite to parse a tblastn output file containing multiple
tblastn reports (~30K query sequences).

When I count the reports actually read by the script, I only get
about 16K. Can anyone tell me what I am doing wrong in this script?

use Bio::Tools::BPlite;
use strict;
my $file = $ARGV[0];
my $cutoff = $ARGV[1];
my $total_rec = 0;
open(BLAST, $file) or die "Cannot open $file: $!";
{
my $report = new Bio::Tools::BPlite(-fh=>\*BLAST);
$total_rec++;
my $QUERY = $report->query;
my $rec = 0;
GG:
while (my $sbjct = $report->nextSbjct) {
	$rec++;
	my $SUBJ = $sbjct->name;
	while (my $hsp = $sbjct->nextHSP) {
		my $HSP = $hsp->P;
		if ($HSP < $cutoff) {
			print "$QUERY\t$SUBJ\t$HSP\n";
			last;
		}
	}
	if ($rec ==1) {
		last GG;
	}
}
last if ($report->_parseHeader == -1);
redo;
}
close(BLAST);
print STDERR "total records .$total_rec.\n";

Alan Christoffels
Institute of Molecular and Cell Biology
30 Medical Drive
Singapore
117609
Tel: 65-8741489