[Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage

Nathan S. Haigh n.haigh at sheffield.ac.uk
Sun Oct 29 17:43:20 UTC 2006


Sorry for the repeat post but I haven't had a response. Just wondered if 
anyone had any idea about this?

Thanks
Nath

Nathan S. Haigh wrote:
> As you may be aware by now, I'm working with Bio::Restriction::Analysis
> and friends.
>
> I'm doing restriction analysis on large sequences - chromosomes. I need
> to identify an appropriate enzyme based on the total length of fragments
> that fall within a certain size range (e.g. 100 - 500 bp). However, the
> amount of memory used by Bio::Restriction::Analysis::fragments() is
> prohibitive. The code at the bottom downloads 2 thaliana chromosomes
> (mito and chloro - so pretty small), runs an analysis on each, and then
> loops through the fragments for all enzymes in the default collection.
>
> My memory usage just keeps on climbing and none seems to get freed up,
> even when a $ra goes out of scope (i.e. when I start dealing with the
> next sequence). Is this a memory leak of some sort, or is there a way to
> free up memory as I go? I'd appreciate any help/advice on how to reduce
> the amount of memory being consumed, as I'd like to use all the thaliana
> chromosomes (not just mito and chloro), which at the moment probably
> won't work.
>
> Cheers
> Nath
>
> use strict;
> use Bio::DB::GenBank;
> use Bio::Restriction::Analysis;
> use Bio::Restriction::EnzymeCollection;
>
> my @seq_objs;
> my @gis = ( 7525012,  26556996 );
>
> my $db = Bio::DB::GenBank->new(-format => "fasta");
> foreach my $gi (@gis) {
>   print "Getting GI: $gi\n";
>   push @seq_objs, $db->get_Seq_by_id($gi);
> }
>
> my $min_fragment_size = 100;
> my $max_fragment_size = 500;
> my $enz_Coll = Bio::Restriction::EnzymeCollection->new();
>
> foreach my $seq (@seq_objs) {
>   print "Processing ", $seq->primary_id, "\n";
>   my $ra = Bio::Restriction::Analysis->new(
>                                          -seq     => $seq,
>                                          -enzymes => $enz_Coll,
>   );
>
>   my @all_enzymes = $ra->cutters->each_enzyme;
>   print "  Calc total length of fragments in range: $min_fragment_size - $max_fragment_size\n";
>   foreach my $enzyme ( @all_enzymes ) {
>     # running total for this enzyme only (reset for each enzyme)
>     my $tot_size = 0;
>     # fragments() is a real memory hog
>     foreach my $frag ($ra->fragments($enzyme)) {
>       my $len = length $frag;
>       next if $min_fragment_size && ($len < $min_fragment_size);
>       next if $max_fragment_size && ($len > $max_fragment_size);
>       $tot_size += $len;
>     }
>     # do something based on value of $tot_size
>     #print "    ", $enzyme->name, " total = $tot_size\n";
>   }
>   print "DONE\n";
> }
>
