[Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
Nathan S. Haigh
n.haigh at sheffield.ac.uk
Sun Oct 29 17:43:20 UTC 2006
Sorry for the repeat post but I haven't had a response. Just wondered if
anyone had any idea about this?
Thanks
Nath
Nathan S. Haigh wrote:
> As you may be aware by now, I'm working with Bio::Restriction::Analysis
> and friends.
>
> I'm doing restriction analysis on large sequences - chromosomes. I need
> to identify an appropriate enzyme based on the total length of fragments
> that are of a certain size (e.g. 100 - 500 bp). However, the amount of
> memory used by Bio::Restriction::Analysis::fragments() is prohibitive. I
> have the following code (bottom) which downloads 2 thaliana chromosomes
> (mito and chloro - so pretty small) and runs an analysis and then loops
> through the fragments for all enzymes in the default collection.
>
> My memory usage just keeps on climbing and none seems to get freed up
> even when a $ra goes out of scope (i.e. when I start dealing with the next
> sequence). Is this a memory leak of some sort? Is there a way to free up
> memory as I go? I'd appreciate any help/advice on how to reduce the
> amount of memory being consumed as I'd like to use all the thaliana
> chromosomes (not just mito and chloro), which at the moment probably
> won't work.
>
> Cheers
> Nath
>
> use strict;
> use warnings;
> use Bio::DB::GenBank;
> use Bio::Restriction::Analysis;
> use Bio::Restriction::EnzymeCollection;
>
> my @seq_objs;
> my @gis = ( 7525012, 26556996 );
>
> # Fetch each sequence from GenBank by its GI number
> my $db = Bio::DB::GenBank->new(-format => "fasta");
> foreach my $gi (@gis) {
>     print "Getting GI: $gi\n";
>     push @seq_objs, $db->get_Seq_by_id($gi);
> }
>
> my $min_fragment_size = 100;
> my $max_fragment_size = 500;
> my $enz_Coll = Bio::Restriction::EnzymeCollection->new();
>
> foreach my $seq (@seq_objs) {
>     print "Processing ", $seq->primary_id, "\n";
>     my $ra = Bio::Restriction::Analysis->new(
>         -seq     => $seq,
>         -enzymes => $enz_Coll,
>     );
>
>     my @all_enzymes = $ra->cutters->each_enzyme;
>     print " Calc total length of fragments in range: $min_fragment_size - $max_fragment_size\n";
>     foreach my $enzyme ( @all_enzymes ) {
>         # reset the running total so it is counted per enzyme
>         my $tot_size = 0;
>         # fragments() is a real memory hog
>         foreach my $frag ($ra->fragments($enzyme)) {
>             next if $min_fragment_size && (length($frag) < $min_fragment_size);
>             next if $max_fragment_size && (length($frag) > $max_fragment_size);
>             $tot_size += length($frag);
>         }
>         # do something based on value of $tot_size
>         #print " ", $enzyme->name, " total = $tot_size\n";
>     }
>     print "DONE\n";
> }
>
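One possible way around holding every fragment string in memory is to work from cut positions alone, since only the fragment lengths matter here. The sketch below is plain Perl and does not call BioPerl at all. The sub name total_length_in_range and the toy numbers are hypothetical. In the script above, the cut positions might come from something like $ra->positions($enzyme->name) and the length from $seq->length, but that is an assumption and I have not checked whether it avoids the memory growth. The sketch also assumes a linear sequence, with each position being the 1-based index of the last base before a cut. For a circular molecule the first and last fragments would need to be joined.

```perl
use strict;
use warnings;

# Sum the lengths of fragments whose size falls within [$min, $max],
# using cut positions only, so no fragment strings are ever built.
# Assumes a linear sequence; each position is the 1-based index of the
# last base before a cut.
sub total_length_in_range {
    my ($cuts, $seq_len, $min, $max) = @_;
    my $total = 0;
    my $prev  = 0;
    # Append the sequence length so the final fragment is counted too.
    for my $pos ((sort { $a <=> $b } @$cuts), $seq_len) {
        my $len = $pos - $prev;
        $prev = $pos;
        next if $len < $min || $len > $max;
        $total += $len;
    }
    return $total;
}

# Toy example: a 1000 bp linear sequence cut after bases 50, 200 and 650
# gives fragments of 50, 150, 450 and 350 bp.
my $total = total_length_in_range([200, 50, 650], 1000, 100, 500);
print "total = $total\n";    # prints "total = 950"
```

Only one list of integers per enzyme is needed at any time, so memory use here depends on the number of cut sites and not on the sequence length.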