[Bioperl-l] question about Bio::Restriction::Analysis
Steve Chervitz
sac at bioperl.org
Mon Jun 4 17:56:57 UTC 2007
Hi Apurva,
I'm cc:ing the list to let others know you have found performance
issues with Bio::Restriction::Analysis. Ideally, we should focus on
addressing those issues rather than fixing a module that is now
deprecated.
But taking a quick look at my Bio::Tools::RestrictionEnzyme module,
I'm not sure why HpaII would give slower performance relative to other
non-ambiguous cutters. This enzyme has a 4-base recognition sequence
CCGG, and if you're feeding it a large CG-rich input sequence, that
could be a factor. To test, you might try using some other 4-base
cutters that aren't CG-rich (TaqI, TasI) or try some other input
sequences. There is no special flag to indicate that the enzyme is
non-ambiguous. The module handles that automatically.
Good luck,
Steve
On 6/4/07, Apurva Narechania <apurva at cshl.edu> wrote:
> Hi Rob and Steve,
>
> I was hoping you could answer a quick performance question regarding
> the Bio::Restriction::Analysis module. I have found that though this
> module works well, it is considerably slower than the deprecated
> Bio::Tools::RestrictionEnzyme. I see that there are two algorithms
> available to your module, and since I am using HpaII, a non-ambiguous
> enzyme, I thought I might find similar performance to the older,
> deprecated module, but I do not. Is it possible that I am not setting
> the non-ambiguous flag correctly? Does it need to be set in the first
> place?
>
> As far as Bio::Tools::RestrictionEnzyme, though it is faster, I have
> found instances where it is inaccurate, especially in calculating
> fragments of extremely small size 1-5 base pairs, so I would like to
> use your module if possible. It just seems slow to me.
>
> Can you clarify?
>
> I have copied my code below since it is a short, simple script.
>
> Thanks!
> Apurva Narechania
> Ware Lab
> Cold Spring Harbor Labs
>
> ----------
>
> #!/usr/bin/perl
>
> # This program generates a fasta of restriction frags given an
> # input fasta and a restriction cut site
>
> use Getopt::Std;
> use Bio::Seq;
> use Bio::SeqIO;
> use strict;
>
> use Bio::Tools::RestrictionEnzyme;
>
> my %opts = ();
> getopts ('f:', \%opts);
> my $fasta = $opts{'f'};
>
> # read fasta file
> my $seqin = Bio::SeqIO -> new (-format => 'Fasta', -file => "$fasta");
>
> my $x = 0;
> while (my $sequence_obj = $seqin -> next_seq()){
> $x++;
> my $id = $sequence_obj->id();
>
> print STDERR "$x Working on $id\n";
>
> # generate the rx object
> my $ra = new Bio::Tools::RestrictionEnzyme(-NAME=>'HpaII');
>
> my @frags = $ra->cut_seq($sequence_obj);
>
> my $counter = 0;
> foreach my $frag (@frags){
> $counter++;
> my $length = length ($frag);
> print ">$id.$counter length=$length\n$frag\n";
> }
>
> }
>
>
More information about the Bioperl-l
mailing list