[Bioperl-l] BPlite: how to parse a file containing multiple blast results?

Jason Eric Stajich jason@cgt.mc.duke.edu
Thu, 13 Sep 2001 21:52:20 -0400 (EDT)


Kind of a hack, but this should work on a datafile produced by blasting a
fasta database file, which will therefore contain multiple concatenated
blast reports. It works with my local hacked copy, on which I am still
testing my changes before committing.

use Bio::Tools::BPlite;

open(FH, "<blastreport.bls") or die("cannot open blastreport.bls: $!");

REPORT: until( eof(FH) ) {
	# parse the next report from the same filehandle
	my $report = new Bio::Tools::BPlite(-fh => \*FH);
	my $query = $report->query;
	next REPORT if( ! $query ); # handle some weird cases

	# your SBJCT / HSP handling code here
}
close(FH);

You can also try my newly committed Bio::Index::Blast (it might be
renamed to avoid the impression that it reads blast index files...)
if you are working off CVS head. It allows access to a specific
blast report indexed on query name. You could build the list of query
names from the fasta file with something like:

open(SEQIDS, "grep '^>' fastadb.fa |");
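
The list of sequence IDs can also be collected in Perl itself rather than
through a grep pipe. The following is only a minimal sketch: fasta_ids is a
hypothetical helper name, and it assumes the ID is whatever sits between
the '>' and the first whitespace on each header line. Whether those IDs
match the query names used as index keys is not something this sketch
checks.

```perl
use strict;
use warnings;

# Hypothetical helper: return the IDs found on FASTA header lines.
# The ID is taken to be the text between '>' and the first whitespace.
sub fasta_ids {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "cannot open $path: $!";
    my @ids;
    while (my $line = <$fh>) {
        push @ids, $1 if $line =~ /^>(\S+)/;
    }
    close($fh);
    return @ids;
}
```

It would be called as my @ids = fasta_ids('fastadb.fa'); and avoids
depending on which regex dialect the local grep supports.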

----

As for running out of temp space, you can try destroying the
BPlite/StandAloneBlast object on every iteration (inefficient) or looking
under the hood and calling the cleanup functions directly. Explicitly
destroying BPlite reports as soon as they are no longer needed, rather
than waiting for them to go out of scope, might also help.  I'm not really
sure where the pileup is occurring as I've not run into this problem myself.
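
To illustrate what explicit destruction buys you, here is a minimal sketch
using a hypothetical TempHolder class that owns a temp file and removes it
in DESTROY. It is a stand-in only and is not the actual BPlite or
StandAloneBlast code; the point is just that undef on the last reference
runs the cleanup immediately instead of at end of scope.

```perl
use strict;
use warnings;
use File::Temp ();

# Hypothetical stand-in for an object that owns a temp file and removes it
# on destruction; this is not BPlite's or StandAloneBlast's actual code.
package TempHolder;

sub new {
    my ($class) = @_;
    my ($fh, $path) = File::Temp::tempfile();
    close($fh);
    return bless { path => $path }, $class;
}

sub path { $_[0]{path} }

sub DESTROY {
    my ($self) = @_;
    unlink $self->{path} if defined $self->{path};
}

package main;

for my $i (1 .. 3) {
    my $holder = TempHolder->new;
    # ... use the object here ...
    undef $holder;    # last reference dropped: DESTROY runs now, file is removed
    # ... any remaining work in this iteration runs with the file already gone ...
}
```

If something else still holds a reference to the object, undef on one
variable will not trigger DESTROY, so this only helps when the loop
variable is the last reference.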

No one has tried to implement a more efficient way to manage temp space
when creating tempfiles other than cleaning up on object destruction. If
someone wants to propose or implement something, they are certainly
welcome to.

I will at least add to my internal 1.0 wishlist that some work should be
done on this for the next release.

-jason
On Thu, 13 Sep 2001, Kun Zhang wrote:

> I did a blast search with a fasta file containing multiple sequence. In
> order to parse the blast result, I was trying the private function
> _parseHeader as suggested in the perldoc, but couldn't figure it out how to
> make it work. Any suggestion? Thanks!
>
> Kun Zhang

-- 
Jason Stajich
Duke University
jason@cgt.mc.duke.edu