[Bioperl-l] question about Bio::Restriction::Analysis

Mon Jun 4 17:56:57 UTC 2007

Hi Apurva,

I'm cc:ing the list to let others know you have found performance
issues with Bio::Restriction::Analysis. Ideally, we should focus on
addressing those issues rather than fixing a module that is now
deprecated.

But taking a quick look at my Bio::Tools::RestrictionEnzyme module,
I'm not sure why HpaII would give slower performance relative to other
non-ambiguous cutters. This enzyme has a 4-base recognition sequence
CCGG, and if you're feeding it a large CG-rich input sequence, that
could be a factor. To test, you might try using some other 4-base
cutters that aren't CG-rich (TaqI, TasI) or try some other input
sequences. There is no special flag to indicate that the enzyme is
non-ambiguous. The module handles that automatically.

Good luck,
Steve

On 6/4/07, Apurva Narechania <apurva at cshl.edu> wrote:
> Hi Rob and Steve,
>
> I was hoping you could answer a quick performance question regarding
> the Bio::Restriction::Analysis module. I have found that though this
> module works well, it is considerably slower than the deprecated
> Bio::Tools::RestrictionEnzyme. I see that there are two algorithms
> available to your module, and since I am using HpaII, a non-ambiguous
> enzyme, I thought I might find similar performance to the older,
> deprecated module, but I do not. Is it possible that I am not setting
> the non-ambiguous flag correctly? Does it need to be set in the first
> place?
>
> As far as Bio::Tools::RestrictionEnzyme, though it is faster, I have
> found instances where it is inaccurate, especially in calculating
> fragments of extremely small size 1-5 base pairs, so I would like to
> use your module if possible. It just seems slow to me.
>
> Can you clarify?
>
> I have copied my code below since it is a short, simple script.
>
> Thanks!
> Apurva Narechania
> Ware Lab
> Cold Spring Harbor Labs
>
> ----------
>
> #!/usr/bin/perl
>
> # This program generates a fasta of restriction frags given an
> # input fasta and a restriction cut site
>
> use Getopt::Std;
> use Bio::Seq;
> use Bio::SeqIO;
> use strict;
>
> use Bio::Tools::RestrictionEnzyme;
>
> my %opts = ();
> getopts ('f:', \%opts);
> my $fasta  = $opts{'f'};
>
> # read fasta file
> my $seqin = Bio::SeqIO -> new (-format => 'Fasta', -file => "$fasta");
>
> my $x = 0;
> while (my $sequence_obj = $seqin -> next_seq()){
>      $x++;
>      my $id = $sequence_obj->id();
>
>      print STDERR "$x Working on $id\n";
>
>      # generate the rx object
>      my $ra = new Bio::Tools::RestrictionEnzyme(-NAME=>'HpaII');
>
>      my @frags = $ra->cut_seq($sequence_obj);
>
>      my $counter = 0;
>      foreach my $frag (@frags){
>          $counter++;
>          my $length = length ($frag);
>          print ">$id.$counter length=$length\n$frag\n";
>      }
>
> }
>
>