[Bioperl-l] SearchIO speed up
Sendu Bala
bix at sendu.me.uk
Thu Aug 10 19:11:24 UTC 2006
Chris Fields wrote:
>
> On Aug 10, 2006, at 11:04 AM, Sendu Bala wrote:
>
>>> Just curious, but is there a possibility of making "lazy"
>>> instantiation of objects like HSP and HIT objects? Things like
>>> parsing and output could be accomplished without these objects?
>>
>> That's what I've done actually, which is why performance varies between
>> 5x and 1.5x (lower performance when the instantiation is forced).
>>
>> But, things like 'parsing and output' do need to force the instantiation
>> unless, say, an output module knew about the hash structure of the thing
>> stored inside a Result object. Which is too horrible a situation to
>> comprehend. :O
>>
>> Or is it? What specifically did you have in mind?
>
> The nice thing about SearchIO is the ability to attach a Handler to
> return specific objects. For instance, if you didn't want HSPs then
> they could be 'junked' by using SearchIO::FastHitEventBuilder, which
> just returns hits. I don't know how the other SearchIO modules (hmmer,
> etc) deal with this though, but it works for blast and (I think) blastxml.
>
> You might use this same strategy to have the handler return simple
> hashes instead of objects,
Yes, the main change I have made that provides the speed increase is to
make the handler (SearchResultEventBuilder) return hashes instead of
objects.
It's a transparent change when combined with the lazy instantiation.
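
To make the idea concrete, here is a minimal sketch of lazy instantiation. It is written in Python purely for illustration and is not the BioPerl code: the names Hit, LazyResult, hit() and num_hits() are hypothetical stand-ins. The result keeps the plain hashes produced by the handler and only builds a real object the first time one is asked for.

```python
class Hit:
    """Hypothetical stand-in for a full hit object with a costly constructor."""

    instances_built = 0  # counts how many real objects were ever constructed

    def __init__(self, name, score):
        Hit.instances_built += 1
        self.name = name
        self.score = score


class LazyResult:
    """Hypothetical result that stores plain hashes and upgrades them on demand."""

    def __init__(self, hit_hashes):
        # Keep the cheap dicts exactly as the handler produced them.
        self._hits = list(hit_hashes)

    def num_hits(self):
        # Answerable without building any Hit object.
        return len(self._hits)

    def hit(self, index):
        item = self._hits[index]
        if isinstance(item, dict):
            # First access: build the real object and cache it in place.
            item = Hit(**item)
            self._hits[index] = item
        return item


result = LazyResult([
    {"name": "hitA", "score": 120.0},
    {"name": "hitB", "score": 75.5},
])
```

Callers that only count hits never pay for object construction, while callers that ask for a hit get an ordinary object back, which is why the change can stay transparent. Anything that walks every hit (a writer, for example) still forces every instantiation.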
> Alternatively, create a new SearchIO class (call it fastblast; okay,
> terrible name) that doesn't use a handler and just returns hashes. I
> think Jason pointed out previously that the handler isn't required.
But I didn't see any particular harm in keeping them. Not having a
handler might shave a percent or two off run times, but you need to
balance speed with power and flexibility. I don't know where that
balance lies, hence my question to the community.