[Bioperl-l] get_Stream_by_gi: Memory going up every call.
snaphit at planet.nl
Tue Oct 30 14:56:10 UTC 2007
I just made a test script which shows the problem.
The second while loop will cause the program to use more and more memory without releasing it.
I will post the bug later today.
Jelle
code:
use strict;
use warnings;
use Bio::DB::GenBank;

# Use a couple of hundred GIs to see the issue.
my @arraylist = (157043286, 157043285, 157043189);

# Fetch the GIs in batches of up to 100.
while (my @small_list = splice(@arraylist, 0, 100)) {
    my $gb = Bio::DB::GenBank->new(-retrievaltype => 'tempfile');
    my $stream_obj = $gb->get_Stream_by_gi(\@small_list);
    while (my $seq_obj = $stream_obj->next_seq) {
        # this loop is what causes the problem
    }
}
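To check whether memory really grows from batch to batch, the loop above could be wrapped in a small harness. This is only a sketch under assumptions: rss_kb, reap_children and run_in_batches are hypothetical helper names, not part of BioPerl; rss_kb reads /proc/self/status and so only works on Linux; and the non-blocking waitpid call is one way to try the wait() idea suggested in the quoted reply below. Nothing in this thread shows that reaping children fixes the growth.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";    # provides WNOHANG

# Hypothetical helper: resident set size of this process in kB.
# Linux only; returns undef when /proc/self/status is unavailable.
sub rss_kb {
    open(my $fh, '<', '/proc/self/status') or return;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return;
}

# Hypothetical helper: collect finished child processes without blocking.
# Returns the number of children reaped (0 if there were none).
sub reap_children {
    my $reaped = 0;
    $reaped++ while waitpid(-1, WNOHANG) > 0;
    return $reaped;
}

# Call $work->(\@batch) for batches of at most $size ids, and record the
# batch size, reaped child count and memory use after each batch.
sub run_in_batches {
    my ($ids, $size, $work) = @_;
    my @queue = @$ids;      # work on a copy so the caller's list is untouched
    my @log;
    while (my @batch = splice(@queue, 0, $size)) {
        $work->(\@batch);
        my $reaped = reap_children();
        push @log, {
            n      => scalar(@batch),
            reaped => $reaped,
            rss_kb => scalar(rss_kb()),
        };
    }
    return \@log;
}

# Usage with the test script above (needs BioPerl and network access):
#
#   my $log = run_in_batches(\@arraylist, 100, sub {
#       my $gb     = Bio::DB::GenBank->new(-retrievaltype => 'tempfile');
#       my $stream = $gb->get_Stream_by_gi($_[0]);
#       while (my $seq_obj = $stream->next_seq) { }
#   });
#   printf "batch of %d: reaped %d, RSS %s kB\n",
#       $_->{n}, $_->{reaped}, $_->{rss_kb} // 'n/a' for @$log;
```

If the RSS figure keeps climbing across batches even when the reaped count is non-zero, leftover child processes are unlikely to be the cause.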
-----Original Message-----
From: Chris Fields [mailto:cjfields at uiuc.edu]
Sent: Tue 10/30/2007 1:10 PM
To: snaphit at planet.nl
Cc: Bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] get_Stream_by_gi: Memory going up every call.
What happens if you create a new Bio::DB::GenBank instance each time
within streamQuery() instead of using a cached instance? Does the
memory issue go away?
I'll try to look into it when I can; if you could file a bug on this
it would help track it:
http://www.bioperl.org/wiki/Bugs
chris
On Oct 30, 2007, at 3:51 AM, <snaphit at planet.nl> wrote:
> This doesn't seem to work; it still keeps using more memory. I had
> already tried it once and it didn't seem to make a difference, but I
> gave it another try. As I said, it doesn't solve the problem.
>
>
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu]
> Sent: Mon 10/29/2007 5:06 PM
> To: snaphit at planet.nl
> Cc: Bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] get_Stream_by_gi: Memory going up every call.
>
> It may be based on the mode (set by -retrievaltype) in which the
> sequences are being retrieved and parsed. The Bio::DB::WebDBSeqI
> module has the documentation for this parameter. If you are making
> tons of calls to get_Seq*/get_Stream* methods it may lead to
> substantial increases in memory until the child processes finish up
> parsing each data stream.
>
> You can possibly add in a wait() in between sequence retrieval calls,
> or try setting the Bio::DB::GenBank instance to 'tempfile' or
> 'io_string' (the former always worked faster for me):
>
> $self->{gb} = Bio::DB::GenBank->new(-retrievaltype => 'tempfile');
>
> chris
>
> On Oct 26, 2007, at 7:42 AM, Jelle86 wrote:
>
> > Ok I stripped a lot. And this is causing the problem:
> >
> > use Bio::DB::GenBank;
> >
> > sub new {
> >     my $invocant = shift;
> >     my $class = ref($invocant) || $invocant;
> >     my $self = {@_};
> >     $self->{gb} = Bio::DB::GenBank->new();
> >     bless $self, $class;
> >     return $self;
> > }
> >
> > sub streamQuery {
> >     my $self = shift;
> >     my $stream_obj = $self->{gb}->get_Stream_by_gi($self->{ids});
> >     while (my $seq_obj = $stream_obj->next_seq) {
> >         # nothing is done with $seq_obj
> >     }
> > }
> >
> > Both subs (new and streamQuery) are called several times, each time
> > with a new accession list. Removing the while loop uses a bit less
> > memory, but the memory usage still goes up.
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign