[Bioperl-l] SearchIO speed up

Chris Fields cjfields at uiuc.edu
Fri Aug 18 13:58:21 UTC 2006



> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Friday, August 18, 2006 8:11 AM
> To: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] SearchIO speed up
> 
> Chris Fields wrote:
> > On Aug 18, 2006, at 2:00 AM, Sendu Bala wrote:
> >
> >>> So far, sorry to say, it's debatable whether a 1.5-fold increase
> >>> in speed along with even small API changes is worth all the
> >>> effort you are putting into it.
> >
> >> To be fair, no API change is required, and it only took a few
> >> minutes to implement and try the idea out :)
> >
> > Maybe I'm missing something here; didn't you say it failed tests
> > somewhere?  That's suggestive of API problems.
> 
> The alternate suggestion using
> my $self = $class->Bio::Root::Root::new(@args);
> doesn't cause any test failures because it doesn't involve any API
> change, only a harmless implementation change. Hilmar wasn't happy with
> that because of 'loss of standard constructor implementation and
> behavior'. To be honest, the current implementation of GenericHSP and
> SimilarityPair&ancestors is a bit of a messy kludge with lots of wasted
> work, which is why we get the speed up in the first place by going
> straight to Root.

Okay.  I understand his objection (chaining constructors keeps their
behavior consistent), but I also see your reasoning.  This may be one of
those cases where obfuscating the code isn't worth the small increase in
speed.  I'm Switzerland on this point (neutral).
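
For anyone following along, here is a minimal sketch of the two constructor
styles being compared.  The packages My::Root, My::Pair and My::HSP and the
new_fast method are hypothetical stand-ins I made up for illustration; they
are not the real Bio::Root::Root, SimilarityPair or GenericHSP code, and the
pair_ready flag only marks where an intermediate constructor would do its
setup work.

```perl
use strict;
use warnings;

# Hypothetical root class (stand-in for a common base constructor).
package My::Root;
sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}

# Hypothetical intermediate class whose constructor does extra setup.
package My::Pair;
our @ISA = ('My::Root');
sub new {
    my ($class, %args) = @_;
    my $self = $class->SUPER::new(%args);
    $self->{pair_ready} = 1;    # stands in for per-object setup work
    return $self;
}

# Hypothetical leaf class offering both construction styles.
package My::HSP;
our @ISA = ('My::Pair');
sub new {
    my ($class, %args) = @_;
    # Chained form: every ancestor constructor runs in turn.
    return $class->SUPER::new(%args);
}
sub new_fast {
    my ($class, %args) = @_;
    # Direct form: call the root constructor by its full name,
    # skipping My::Pair::new and whatever setup it performs.
    return $class->My::Root::new(%args);
}

package main;
my $chained = My::HSP->new(-score => 42);
my $direct  = My::HSP->new_fast(-score => 42);
print ref($chained), " pair_ready=", ($chained->{pair_ready} ? 1 : 0), "\n";
print ref($direct),  " pair_ready=", ($direct->{pair_ready}  ? 1 : 0), "\n";
```

Both objects are blessed into My::HSP, but only the chained one has gone
through the intermediate constructor.  That is the trade-off in a nutshell:
the direct call saves work, and anything that relied on the skipped
constructor has to be set up some other way.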

> My PullParser modules solve the problem in a much better way (read:
> Hilmar would have no objections), but I was hoping for something that
> would work for all existing SearchIO modules as well.

I don't think you'll get that.  I agree with Hilmar there (that drastic
changes in speed would be necessary to warrant API changes).  If you can
demonstrate 'drastic changes' in speed, even with API changes, then it may
be feasible to introduce them alongside the current implementation and let
the end user decide whether to use the older, slower modules or the faster
ones.

Remember, these are all committed to an experimental branch so changes don't
pollute the main trunk (and the original SearchIO is largely intact, with no
API changes).  Hence these are available to anyone who wishes to test them
out.  You can always add demo scripts in the SYNOPSIS if there are API
issues (such as if you return hashes, for instance).

> It doesn't matter in the end: it was easy to suggest it, and just as
> easy to not use it if people are unhappy with it.

I don't have problems with changes, even small API changes.  But we have to
deal with the long-term repercussions of making such changes: broken
scripts, bug reports, etc.  Small incremental speed increases may not be
worth the extra headache of having to deal with the onslaught of ticked-off
users.

cjf



