[Bioperl-l] new_fast methods

Jason Stajich jason at bioperl.org
Thu Feb 26 13:52:21 UTC 2009


FYI - I wrote some lightweight feature objects - there is a branch for  
it (lightweight_feature_branch) - these had a pretty significant  
speedup.

  A lot of the overhead is in sequence/feature/location creation,
since so many objects are being created, so optimizing these features
by using arrays instead of hashes for the data structure seemed to
provide a pretty significant speedup as well.  Ensembl uses a fast_new
as well, right?
Bio::SeqFeature::Slim
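
As a rough illustration of the arrays-instead-of-hashes idea, here is a
minimal sketch of an array-backed feature object. The package name
My::SlimFeature, the slot constants, and the span method are made up for
this example; this is not the code on the branch.

```perl
use strict;
use warnings;

# Hypothetical class for illustration only; not the actual
# Bio::SeqFeature::Slim implementation.
package My::SlimFeature;

# Slot indices into the blessed array (names are assumptions).
use constant { F_START => 0, F_END => 1, F_STRAND => 2, F_TAG => 3 };

# Positional constructor: the object is a blessed array reference,
# so no per-object hash has to be built.
sub new {
    my ($class, $start, $end, $strand, $tag) = @_;
    return bless [ $start, $end, $strand, $tag ], $class;
}

# Read-only accessors index straight into the array.
sub start       { $_[0][F_START] }
sub end         { $_[0][F_END] }
sub strand      { $_[0][F_STRAND] }
sub primary_tag { $_[0][F_TAG] }

# Length of the feature in bases, inclusive of both ends.
sub span        { $_[0][F_END] - $_[0][F_START] + 1 }

package main;

my $feat = My::SlimFeature->new(100, 250, 1, 'exon');
printf "%s %d..%d (%d bp)\n",
    $feat->primary_tag, $feat->start, $feat->end, $feat->span;
```

The trade-off is that callers can no longer poke at named hash keys, so
everything has to go through the accessors.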

-jason
On Feb 26, 2009, at 4:28 AM, Albert Vilella wrote:

> Hi,
>
> I would like to ask for comments to the list on the convenience of
> having "new_fast" methods in Bioperl.
> If one does some profiling on Bioperl scripts that parse large
> quantities of data, the "_rearrange" method stands out as a possible
> easy point of optimization. There are parts of the code that call the
> new method with explicit options. See for example the excerpt from
> Bio/Seq/SeqWithQuality.pm at the end of this message.
>
> We should be able to create a "new_fast" method for these cases that
> takes the ordering as given and doesn't call "_rearrange". This
> wouldn't disrupt existing code that still calls "new".
>
> Comments?
>
> Bio/Seq/SeqWithQuality.pm
>
>   if (!$seq) {
>      my $id;
>      unless ($self->{supress_warnings} == 1) {
>         $self->warn("You did not provide sequence information during the ".
>           "construction of a Bio::Seq::SeqWithQuality object. Sequence ".
>           "components for this object will be empty.");
>      }
>      if (!$alphabet) {
>         $self->throw("If you want me to create a PrimarySeq object for your ".
>           "empty sequence <boggle> you must specify a -alphabet to satisfy ".
>           "the constructor requirements for a Bio::PrimarySeq object with no ".
>           "sequence. Read the POD for it, luke.");
>      }
>      $self->{seq_ref} = Bio::PrimarySeq->new( -seq              => "",
>                                               -accession_number => $acc,
>                                               -primary_id       => $pid,
>                                               -desc             => $desc,
>                                               -display_id       => $id,
>                                               -alphabet         => $alphabet );
>   } elsif ($seq->isa('Bio::PrimarySeqI') || $seq->isa('Bio::SeqI')) {
>      $self->{seq_ref} = $seq;
>   } elsif (ref($seq)) {
>      $self->throw("You passed a seq argument into a SeqWithQUality object and".
>        " it was a reference ($seq) which did not inherit from Bio::SeqI or ".
>        "Bio::PrimarySeqI. I don't know what to do with this!");
>   } else {
>      my $seqobj = Bio::PrimarySeq->new( -seq              => $seq,
>                                         -accession_number => $acc,
>                                         -primary_id       => $pid,
>                                         -desc             => $desc,
>                                         -display_id       => $id   );
>      $self->{seq_ref} = $seqobj;
>   }
>   # Then import the quality scores
>   if (!defined($qual)) {
>      $self->{qual_ref} = Bio::Seq::PrimaryQual->new( -qual             => "",
>                                                      -accession_number => $acc,
>                                                      -primary_id       => $pid,
>                                                      -desc             => $desc,
>                                                      -display_id       => $id, );
>   } elsif (ref($qual) eq "Bio::Seq::PrimaryQual") {
>      $self->{qual_ref} = $qual;
>   } else {
>      my $qualobj = Bio::Seq::PrimaryQual->new( -qual             => $qual,
>                                                -accession_number => $acc,
>                                                -primary_id       => $pid,
>                                                -desc             => $desc,
>                                                -display_id       => $id,
>                                                -trace_indices    => $trace_indices );
>      $self->{qual_ref} = $qualobj;
>   }
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
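
For reference, the new_fast proposal in the quoted message could look
roughly like the sketch below. Everything here is an assumption made for
illustration: My::Seq is a made-up class, and _rearrange_sketch is a
simplified stand-in for BioPerl's _rearrange, not the real thing.

```perl
use strict;
use warnings;

# Hypothetical class for illustration only.
package My::Seq;

# Simplified stand-in for _rearrange: maps -named arguments onto a
# fixed order, ignoring case and the leading dash.
sub _rearrange_sketch {
    my ($order, %args) = @_;
    my %norm;
    while (my ($k, $v) = each %args) {
        (my $key = uc $k) =~ s/^-//;   # "-display_id" becomes "DISPLAY_ID"
        $norm{$key} = $v;
    }
    return @norm{@$order};             # hash slice in the requested order
}

# Regular constructor: flexible named arguments, with the
# normalisation cost paid on every call.
sub new {
    my ($class, @args) = @_;
    my ($seq, $id, $alphabet) =
        _rearrange_sketch([qw(SEQ DISPLAY_ID ALPHABET)], @args);
    return bless { seq => $seq, display_id => $id, alphabet => $alphabet },
        $class;
}

# Hypothetical fast constructor: positional arguments in a fixed,
# documented order, with no argument rearranging at all.
sub new_fast {
    my ($class, $seq, $id, $alphabet) = @_;
    return bless { seq => $seq, display_id => $id, alphabet => $alphabet },
        $class;
}

package main;

my $slow = My::Seq->new(-seq => 'ACGT', -display_id => 'seq1',
                        -alphabet => 'dna');
my $fast = My::Seq->new_fast('ACGT', 'seq1', 'dna');

# Both constructors should yield objects with identical contents.
print "same contents\n"
    if join('|', map { $slow->{$_} } sort keys %$slow)
    eq join('|', map { $fast->{$_} } sort keys %$fast);
```

Existing callers of new are unaffected; only internal call sites that
already know the argument order would switch to new_fast. Whether the
saving is worth it could be checked with cmpthese from the core
Benchmark module.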

Jason Stajich
jason at bioperl.org