[Bioperl-l] SearchIO speed up

Sendu Bala bix at sendu.me.uk
Mon Aug 14 14:41:54 UTC 2006


Chris Fields wrote:
> On Aug 14, 2006, at 8:04 AM, Sendu Bala wrote:
> 
>> What I do at the moment, then, is on detecting piped input, I'm forced
>> to read all the input data in one go and spit it out into seekable
>> memory or a temp file. After which normal behaviour resumes - you
>> don't read everything, just the bit you want.
> 
> The traditional route has been using a tempfile.  Bio::Root::IO has  
> several methods for creating tempdirs/tempfiles.
> 
> I would have the option available for a tempfile, at least, for the  
> guys who deal with large BLAST files.  I think the XML files can also  
> be quite long.

Yes, as I stated, you have the option of creating a tempfile (and I use
Bio::Root::IO to do it). My question was: can we avoid the need to do
any such thing for piped data whilst still retaining all the advantages
of a pull-parser (speed, low memory)?
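
To make the workaround under discussion concrete, here is a minimal sketch of
"detect piped input, then spool it somewhere seekable". It is written in
Python purely for illustration and is not the BioPerl code (which is Perl and
uses Bio::Root::IO for its tempfiles). The names make_seekable and
max_in_memory are hypothetical; the sketch assumes a binary input stream and
keeps small inputs in memory, rolling over to a temp file past a size limit.

```python
import shutil
import tempfile


def make_seekable(stream, max_in_memory=1024 * 1024):
    """Return a seekable binary file object holding the contents of `stream`.

    Hypothetical helper: if `stream` is already seekable (a regular file),
    it is returned untouched. Otherwise (e.g. a pipe) everything is copied
    into a spooled temp file, which stays in memory until it grows past
    `max_in_memory` bytes and is then moved to disk automatically.
    """
    is_seekable = getattr(stream, "seekable", None)
    if is_seekable is not None and is_seekable():
        return stream

    spool = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    shutil.copyfileobj(stream, spool)  # this is the "read it all in one go" step
    spool.seek(0)                      # rewind so the parser starts at the top
    return spool
```

A parser could then call make_seekable(sys.stdin.buffer) once and seek freely
afterwards. The cost is exactly the one being questioned here: the whole of
the piped input is read before the first result can be returned.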

I appreciate it's all very hard to imagine what on earth I'm trying to 
say; perhaps the discussion is better left until I make some code available.


> Speaking of XML, is the current idea to get this running on
> text-based BLAST initially?

I'm using it first for my hmmpfam parser, then I'll try it for text
blastn as a proof of concept and move on from there. blast.pm is a bit of
a nightmare to move over to a new system; that's another thing the
pull-parser will solve, by making the code more manageable.


