[Bioperl-l] new_fast methods

Heikki Lehvaslaiho heikki.lehvaslaiho at gmail.com
Fri Feb 27 07:34:26 UTC 2009


At some point I remember seeing optimizations where hashes are created
directly and blessed into appropriate classes at the very end.

Are these hacks nowadays discouraged?
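
For concreteness, here is roughly the pattern I have in mind. This is only
a sketch from memory, not code from Bioperl or Ensembl: the class
My::Feature, its keys (start, end, strand) and the argument handling in
"new" are all made up for illustration. The "new" below merely imitates the
kind of named-argument normalization that "_rearrange" does; "new_fast"
takes a ready-made hash and blesses it as the last step.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical class for illustration only; not part of Bioperl.
package My::Feature;

# Conventional constructor: accepts -named arguments in any order and
# normalizes them, which is roughly the kind of work _rearrange does.
sub new {
    my ($class, %args) = @_;
    my %norm;
    for my $key (keys %args) {
        (my $name = lc $key) =~ s/^-//;    # "-Start" becomes "start"
        $norm{$name} = $args{$key};
    }
    my $self = bless {}, $class;
    $self->{start}  = $norm{start};
    $self->{end}    = $norm{end};
    $self->{strand} = $norm{strand};
    return $self;
}

# Fast path: the caller builds the hash with the internal keys and we
# bless it at the very end. No argument parsing and no validation.
sub new_fast {
    my ($class, $hashref) = @_;
    return bless $hashref, $class;
}

# Plain read-only accessors shared by both construction paths.
sub start  { $_[0]->{start} }
sub end    { $_[0]->{end} }
sub strand { $_[0]->{strand} }

package main;

my $slow = My::Feature->new(-start => 10, -end => 20, -strand => 1);
my $fast = My::Feature->new_fast({ start => 10, end => 20, strand => 1 });

# Both objects answer the same accessors.
printf "%d..%d (%d)\n", $_->start, $_->end, $_->strand for $slow, $fast;
```

The obvious cost is that "new_fast" trusts the caller completely: a
misspelled key or a missing value goes unnoticed, so it only makes sense in
internal code paths where the arguments are already known to be good.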


   -Heikki


2009/2/26 Albert Vilella <avilella at gmail.com>:
> Yes, we've got some new_fasts sprinkled around in Ensembl. Never got
> to switch to arrays instead of hashes though.
>
> Should your slims be merged into the main branch at some point?
>
> On Thu, Feb 26, 2009 at 1:52 PM, Jason Stajich <jason at bioperl.org> wrote:
>> FYI - I wrote some lightweight feature objects - there is a branch for it
>> (lightweight_feature_branch) - these had a pretty significant speedup.
>>
>>  A lot of the overhead is in sequence/feature/location creation, since there
>> are so many objects being created, so optimizing these features by using
>> arrays instead of hashes for the data structure seemed to provide a pretty
>> significant speedup as well.  Ensembl uses a fast_new as well, right?
>> Bio::SeqFeature::Slim
>>
>> -jason
>> On Feb 26, 2009, at 4:28 AM, Albert Vilella wrote:
>>
>>> Hi,
>>>
>>> I would like to ask for comments to the list on the convenience of
>>> having "new_fast" methods in Bioperl.
>>> If one does some profiling on Bioperl scripts that parse large
>>> quantities of data, the "_rearrange" method stands out as a possible
>>> easy point of optimization. There are parts of the code that call the
>>> "new" method with explicit options; see the excerpt from
>>> Bio/Seq/SeqWithQuality.pm at the end of this message.
>>>
>>> We should be able to create a "new_fast" method for these cases that
>>> takes the ordering as given and doesn't call "_rearrange". This
>>> wouldn't disrupt existing code that still calls "new".
>>>
>>> Comments?
>>>
>>> Bio/Seq/SeqWithQuality.pm
>>>
>>>  if (!$seq) {
>>>     my $id;
>>>     unless ($self->{supress_warnings} == 1) {
>>>        $self->warn("You did not provide sequence information during the ".
>>>          "construction of a Bio::Seq::SeqWithQuality object. Sequence ".
>>>          "components for this object will be empty.");
>>>     }
>>>     if (!$alphabet) {
>>>        $self->throw("If you want me to create a PrimarySeq object for your ".
>>>          "empty sequence <boggle> you must specify a -alphabet to satisfy ".
>>>          "the constructor requirements for a Bio::PrimarySeq object with no ".
>>>          "sequence. Read the POD for it, luke.");
>>>     }
>>>     $self->{seq_ref} = Bio::PrimarySeq->new( -seq              =>  "",
>>>                                              -accession_number =>  $acc,
>>>                                              -primary_id       =>  $pid,
>>>                                              -desc             =>  $desc,
>>>                                              -display_id       =>  $id,
>>>                                              -alphabet         =>  $alphabet );
>>>  } elsif ($seq->isa('Bio::PrimarySeqI') || $seq->isa('Bio::SeqI')) {
>>>     $self->{seq_ref} = $seq;
>>>  } elsif (ref($seq)) {
>>>     $self->throw("You passed a seq argument into a SeqWithQUality object and".
>>>       " it was a reference ($seq) which did not inherit from Bio::SeqI or ".
>>>       "Bio::PrimarySeqI. I don't know what to do with this!");
>>>  } else {
>>>     my $seqobj = Bio::PrimarySeq->new( -seq              => $seq,
>>>                                        -accession_number => $acc,
>>>                                        -primary_id       => $pid,
>>>                                        -desc             => $desc,
>>>                                        -display_id       => $id   );
>>>     $self->{seq_ref} = $seqobj;
>>>  }
>>>  # Then import the quality scores
>>>  if (!defined($qual)) {
>>>     $self->{qual_ref} = Bio::Seq::PrimaryQual->new( -qual             => "",
>>>                                                     -accession_number => $acc,
>>>                                                     -primary_id       => $pid,
>>>                                                     -desc             => $desc,
>>>                                                     -display_id       => $id, );
>>>  } elsif (ref($qual) eq "Bio::Seq::PrimaryQual") {
>>>     $self->{qual_ref} = $qual;
>>>  } else {
>>>     my $qualobj = Bio::Seq::PrimaryQual->new( -qual             => $qual,
>>>                                               -accession_number => $acc,
>>>                                               -primary_id       => $pid,
>>>                                               -desc             => $desc,
>>>                                               -display_id       => $id,
>>>                                               -trace_indices    => $trace_indices );
>>>     $self->{qual_ref} = $qualobj;
>>>  }
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Jason Stajich
>> jason at bioperl.org



-- 
    -Heikki
Heikki Lehvaslaiho - heikki lehvaslaiho gmail com
Sent from: Cape Town Western Cape South Africa.



