[Bioperl-l] SearchIO speed up
Hilmar Lapp
hlapp at gmx.net
Thu Aug 17 22:15:32 UTC 2006
I wouldn't do any of this. It is at best unexpected code with
expected behavior, and from there it only gets worse. I don't see why
the loss of standard constructor implementation and behavior is worth
a speed-up of less than several fold.
I can't imagine that any of the Bioperl-associated speed problems
will go away by speeding it up 1.4-fold.
I appreciate your work and I appreciate you caring about bioperl
being as useful as possible by meeting people's requirements of
speedy performance. I do think, though, that the areas with the
best cost-to-benefit ratio are going to be those places where
speed-ups of 5x or 10x or more can be achieved without drastic API or
inheritance changes.
I would not be surprised if there aren't a great number of such places
all over Bioperl; this is not the first time people have tried to
improve speed (and succeeded). Drastic approaches, I think, will not
work though if applied piecemeal; rather, drastic speed improvements are
likely to require drastic architecture changes, which I believe
cannot be done really well if they are always constrained by
backwards compatibility.
I.e., if you talk about drastic architecture changes you are no
longer talking about Bioperl as we know it ("1.x"). A few years ago
several bright young people wanted to get together to build what at
the time was dubbed Bioperl 2.0 ... Unfortunately, we all get older
and we move on in our lives ... As a result those people are now
scattered and I doubt will ever take this on. I.e., Bioperl 2.0 will
need a new crop to pick up the challenge.
-hilmar
On Aug 17, 2006, at 5:53 PM, Sendu Bala wrote:
> Chris Fields wrote:
>> I have to agree. If there was a way to get around this by having the
>> change behind the scenes in HSPI then I wouldn't see a problem.
>>
>> Hence my suggestion of implementing hit() and other
>> SeqFeature::SimilarityPair methods directly in Bio::Search::HSP::HSPI
>> (i.e. no SimilarityPair inheritance) to return
>> Bio::SeqFeature::Similarity objects directly.
>
> That is exactly what I did (on your suggestion). The problem that
> Hilmar points out is that HSPI should continue being a SimilarityPair
> in case anything checks that it is a SimilarityPair.
>
> Would there be any problem with leaving HSPI as a SimilarityPair and
> having GenericHSP::new as:
>
> sub new {
>     my($class, @args) = @_;
>     my $self = $class->Bio::Root::Root::new(@args);
>     #...
>     # one change I forgot to mention before: the Similarity objects
>     # created for query() and hit() can no longer have
>     # '-primary' => $self->primary_tag set unless we also override
>     # primary_tag, but I've no idea what primary_tag is supposed
>     # to do
> }
>
> # overridden methods, as before
>
> This gives a 1.43x speedup. (Simply overriding methods gives only a
> 1.14x speedup.)
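To make the trade-off concrete, here is a minimal sketch of what a fully qualified constructor call does. The My::* packages are hypothetical stand-ins, not the real Bioperl classes, and the pair_initialised field is invented purely to make the skipped step visible. The isa() check keeps passing, but whatever the intermediate constructor would have set up never happens.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for illustration only; these are NOT the
# real Bio::Root::Root, SimilarityPair or GenericHSP classes.
{
    package My::Root;
    sub new {
        my ($class, @args) = @_;
        return bless {}, $class;
    }
}

{
    package My::SimilarityPair;
    our @ISA = ('My::Root');
    sub new {
        my ($class, @args) = @_;
        my $self = $class->SUPER::new(@args);
        # Invented field standing in for the parent's setup work.
        $self->{pair_initialised} = 1;
        return $self;
    }
}

{
    package My::GenericHSP;
    our @ISA = ('My::SimilarityPair');
    sub new {
        my ($class, @args) = @_;
        # Fully qualified method call: lookup starts at My::Root, so
        # My::SimilarityPair::new is never run for this object.
        my $self = $class->My::Root::new(@args);
        return $self;
    }
}

my $hsp = My::GenericHSP->new();

# Inheritance is untouched, so type checks still succeed ...
print $hsp->isa('My::SimilarityPair') ? "isa ok\n" : "isa broken\n";

# ... but the intermediate constructor's initialisation was skipped.
print exists $hsp->{pair_initialised}
    ? "pair init ran\n"
    : "pair init skipped\n";
```

Under these assumptions the object is still a My::SimilarityPair by inheritance, yet it carries none of the state that class's constructor would normally set up, so every inherited method relying on that state has to be overridden.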
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================