[Bioperl-l] SearchIO speed up

Thu Aug 17 22:15:32 UTC 2006

I wouldn't do any of this. It is at best unexpected code with  
expected behavior, and from there it only gets worse. I don't see why  
the loss of standard constructor implementation and behavior is worth  
a speed-up of less than several fold.

I can't imagine that any of the Bioperl-associated speed problems  
will go away by speeding it up 1.4-fold.

I appreciate your work, and I appreciate you caring about Bioperl
being as useful as possible by meeting people's requirements for
speedy performance. I do think, though, that the areas with the
best cost-to-benefit ratio are going to be those places where
speed-ups of 5x or 10x or more can be achieved without drastic API or
inheritance changes.

I would not be surprised if there isn't a great number of such places
all over Bioperl; this is not the first time people have tried to
improve speed (and succeeded). Drastic approaches, I think, will not
work if applied piecemeal, though; rather, drastic speed improvements
are likely to require drastic architecture changes, which I believe
cannot be done really well if they are always constrained by
backwards compatibility.

I.e., if you talk about drastic architecture changes you are no  
longer talking about Bioperl as we know it ("1.x"). A few years ago  
several bright young people wanted to get together to build what at  
the time was dubbed Bioperl 2.0 ... Unfortunately, we all get older  
and we move on in our lives ... As a result those people are now  
scattered and I doubt will ever take this on. I.e., Bioperl 2.0 will  
need a new crop to pick up the challenge.

	-hilmar

On Aug 17, 2006, at 5:53 PM, Sendu Bala wrote:

> Chris Fields wrote:
>> I have to agree.  If there was a way to get around this by having the
>> change behind the scenes in HSPI then I wouldn't see a problem.
>>
>> Hence my suggestion of implementing hit() and other
>> SeqFeature::SimilarityPair methods directly in Bio::Search::HSP::HSPI
>> (i.e. no SimilarityPair inheritance) to return
>> Bio::SeqFeature::Similarity objects directly.
>
> That is exactly what I did (on your suggestion). The problem that Hilmar
> points out is that HSPI should continue being a SimilarityPair in case
> anything checks that it is a SimilarityPair.
>
> Would there be any problem with leaving HSPI as a SimilarityPair and
> having GenericHSP::new as:
>
> sub new {
>      my($class,@args) = @_;
>      my $self = $class->Bio::Root::Root::new(@args);
>      #...
>      # one change I forgot to mention before: the Similarity objects
>      # created for query() and hit() can no longer have
>      # '-primary'   => $self->primary_tag set unless we also override
>      # primary_tag, but I've no idea what primary_tag is supposed
>      # to do
> }
>
> # overridden methods, as before
>
> This gives a 1.43x speedup. (Simply overriding methods gives only a
> 1.14x speedup.)

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================