[Bioperl-l] SearchIO speed up

Sendu Bala bix at sendu.me.uk
Thu Aug 10 19:11:24 UTC 2006


Chris Fields wrote:
> 
> On Aug 10, 2006, at 11:04 AM, Sendu Bala wrote:
> 
>>> Just curious, but is there a possibility of making "lazy" instantiation
>>> of objects like HSP and HIT objects?  Things like parsing and output
>>> could be accomplished without these objects?
>>
>> That's what I've done actually, which is why performance varies between 
>> 5x and 1.5x (lower performance when the instantiation is forced).
>>
>> But, things like 'parsing and output' do need to force the instantiation 
>> unless, say, an output module knew about the hash structure of the thing 
>> stored inside a Result object. Which is too horrible a situation to 
>> comprehend. :O
>>
>> Or is it? What specifically did you have in mind?
> 
> The nice thing about SearchIO is the ability to attach a Handler to 
> return specific objects.  For instance, if you didn't want HSP's then 
> they could be 'junked' by using SearchIO::FastResultEventBuilder, which 
> just returns hits.  I don't know how the other SearchIO modules (hmmer, 
> etc) deal with this though, but it works for blast and (I think) blastxml.
> 
> You might use this same strategy to have the handler return simple hashes 
> instead of objects,

Yes, the main change I have made that provides the speed increase is to 
make the handler (SearchResultEventBuilder) return hashes instead of 
objects.

It's a transparent change when combined with the lazy instantiation: the
objects are only built from the hashes when something asks for them.
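
As a rough illustration of the idea only (a language-neutral sketch in
Python, not the actual BioPerl code; the Hit and Result classes, their
methods and the hash keys are all made up for the example), the result
keeps the plain hashes handed over by the handler and only builds objects
when asked for them:

```python
class Hit:
    """Hypothetical stand-in for a full hit object; building it is the costly step."""
    instances_built = 0  # counter so the example can show when objects get built

    def __init__(self, data):
        Hit.instances_built += 1
        self.name = data["name"]
        self.score = data["score"]


class Result:
    """Hypothetical result that stores raw hashes and builds Hit objects on demand."""

    def __init__(self, raw_hits):
        self._raw_hits = list(raw_hits)  # plain dicts from the handler, cheap to keep
        self._hits = None                # built lazily, then cached

    def num_hits(self):
        # Answerable from the raw data, so no objects are forced.
        return len(self._raw_hits)

    def hits(self):
        # The first call forces instantiation; later calls reuse the cache.
        if self._hits is None:
            self._hits = [Hit(d) for d in self._raw_hits]
        return self._hits


result = Result([{"name": "hitA", "score": 120}, {"name": "hitB", "score": 87}])
print(result.num_hits())       # counting hits does not build any Hit objects
print(Hit.instances_built)
print(result.hits()[0].name)   # asking for the hits forces instantiation
print(Hit.instances_built)
```

In a sketch like this, code that only counts hits never pays for object
construction, while anything that walks the hits pays it once. That is
the same trade-off as the forced-instantiation case mentioned above.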


> Alternatively, create a new SearchIO class (call it fastblast; okay, 
> terrible name) that doesn't use a handler and just returns hashes.  I 
> think Jason pointed out previously that the handler isn't required.

But I didn't see any particular harm in keeping the handler. Not having a 
handler might shave a percent or two off run times, but you need to 
balance speed with power and flexibility. I don't know where that 
balance lies, hence my question to the community.


