Bioperl: Memory leak in BLAST modules
Steve Chervitz
sac@neomorphic.com (Steve A. Chervitz)
Thu, 29 Oct 1998 05:05:57 -0800 (PST)
I did some more tinkering, adding a new function to
Bio::Root::Object.pm that explicitly attempts to break cyclic object
reference structures. However, the Perl GC still refuses to free the
object's data until the global destruction at the end of the script. I
tested it out with some simple objects, so I know it's not something
funny with the Blast objects.
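
For anyone following along, the kind of cycle and explicit cycle-breaking described above can be sketched roughly as follows. This is only an illustration under stated assumptions: Demo::Parent, Demo::Child, break_cycles() and run_demo() are hypothetical names made up for this sketch, not the actual Bio::Root::Object code, and the syntax assumes a reasonably recent Perl.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for a report object and its hits.
# These are NOT the real Bio::Tools::Blast classes.
package Demo::Parent;
our $destroyed = 0;    # counts how many parents have been DESTROYed

sub new { my $class = shift; return bless { children => [] }, $class }

sub add_child {
    my ($self, $child) = @_;
    push @{ $self->{children} }, $child;
    $child->{_parent} = $self;    # back-reference closes the cycle
    return $child;
}

# Explicitly cut both directions of the parent <-> child cycle.
sub break_cycles {
    my $self = shift;
    delete $_->{_parent} for @{ $self->{children} };
    $self->{children} = [];
    return;
}

sub DESTROY { $destroyed++ }

package Demo::Child;
sub new { my $class = shift; return bless {}, $class }

package main;

# Build one parent with three children, optionally break the cycles,
# let the parent go out of scope, and report how many were destroyed.
sub run_demo {
    my ($break) = @_;
    $Demo::Parent::destroyed = 0;
    {
        my $p = Demo::Parent->new;
        $p->add_child(Demo::Child->new) for 1 .. 3;
        $p->break_cycles if $break;
    }
    return $Demo::Parent::destroyed;
}

printf "without break_cycles: %d parent(s) destroyed\n", run_demo(0);
printf "with break_cycles:    %d parent(s) destroyed\n", run_demo(1);
```

With plain reference counting, the parent in the first call is still held by its children's '_parent' entries when it goes out of scope, so its DESTROY should not run until global destruction. In the second call the cycle is cut first, so it should be destroyed as soon as the block ends.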
What seems to cause problems is very large Blast reports with many
significant hits. You can process gobs of reports with small to moderate
numbers of hits without much trouble, but as soon as you hit a
massive report, memory usage takes a jump and stays there.
I'd really like to be able to monitor memory usage better and see
exactly what's tying the GC up.
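
One possible way to peek at reference counts from inside a script is the core B module. The following is a rough sketch only, assuming a Perl that ships B; the refcount() helper and the $report/$hit structures are hypothetical and exist just for this illustration.

```perl
use strict;
use warnings;
use B ();

# Return Perl's internal reference count for whatever the given
# reference points at. $_[0] is used directly so the helper does not
# hold an extra copy of the reference while counting.
sub refcount {
    return B::svref_2object($_[0])->REFCNT;
}

# Hypothetical parent/child structures standing in for report and hit.
my $report = { hits => [] };
my $before = refcount($report);

my $hit = { _parent => $report };    # back-reference to the parent
push @{ $report->{hits} }, $hit;
my $with_cycle = refcount($report);

delete $hit->{_parent};              # cut the back-reference again
my $after = refcount($report);

printf "before: %d, with back-reference: %d, after delete: %d\n",
    $before, $with_cycle, $after;
```

The absolute numbers matter less than the differences: a count that stays higher than expected after an object should have been released points at a reference that is still being held somewhere.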
Steve
Steve Chervitz writes:
> Lincoln,
>
> Yes, there are some cycles but they are *supposed* to be cleaned up
> during destruction. I'm not sure I've taken the best approach
> with the destructors for the relevant objects, however.
>
> The cycles in the object refs to be aware of:
>
> * Bio::Tools::Blast.pm contains refs for Bio::Tools::Blast::Sbjct.pm objects.
> * Bio::Tools::Blast::Sbjct.pm contains refs for Bio::Tools::Blast::HSP.pm
> objects and a ref to the parent Blast.pm object.
> * Bio::Tools::Blast::HSP.pm contains a ref to the parent Sbjct.pm object.
>
> The key methods to look into are the DESTROY() methods of:
>
> * Blast.pm, Sbjct.pm, HSP.pm, and the objects they inherit from:
> * Bio::Tools::SeqAnal.pm (superclass of Blast.pm)
> * Bio::Root::Object.pm (superclass of SeqAnal.pm, Sbjct.pm, and HSP.pm)
> This object manages the reference to the reference to the
> parent object ('_parent') for all objects that know their parents.
>
> I've tinkered around with this issue quite a bit. Any advice you could
> offer would be greatly appreciated.
>
> Steve
>
> Lincoln Stein writes:
> > Unfortunately it's still leaking, although maybe not quite so fast. I
> > don't think I'm going to be able to complete my analysis task at this
> > rate.
> >
> > Are there any cycles in the object references that Blast.pm generates?
> >
> > Lincoln
> >
> > Steve Chervitz writes:
> > > Lincoln,
> > >
> > > I've been grappling with the memory leak for a while and have a new
> > > version of Blast.pm that includes a fix for memory leaks when parsing
> > > streams of reports. This is version 0.063 which you can get from:
> > >
> > > http://genome-www.stanford.edu/perlOOP/bioperl/lib/Bio/Tools/Blast.pm
> > >
> > > Memory use is still an issue, particularly if you are parsing huge
> > > reports (on the order of 1 MB or more) and not imposing significance
> > > criteria. If you can include even a minimal cutoff (e.g., -signif =>
> > > 0.01), that should improve memory usage compared to using no cutoff.
> > >
> > > Also, when creating individual Blast objects, it's a good idea to
> > > explicitly destroy them when you're done processing each one
> > > ($blast_obj->destroy). Using a single $blast_obj variable and
> > > re-assigning it should accomplish the same thing (but calling
> > > destroy() makes me feel better ;).
> > >
> > > If you (or others) have additional ideas for how to improve memory
> > > usage, I'd be happy to hear them. Do you know of a good memory
> > > management tool for Perl that can show reference counts etc. during
> > > the course of an execution?
> > >
> > > I haven't bundled the new version of Blast.pm into a distribution
> > > package yet since there are other changes I want to include. But look
> > > for it soon.
> > >
> > > Steve
> > > ___________________________________________________
> > > Steve A. Chervitz Neomorphic Software
> > > sac@neomorphic.com 2612b 8th Street
> > > http://www.neomorphic.com Berkeley, CA 94710
> > >
> > >
> > > Lincoln Stein writes:
> > > > Does anyone know of a memory leak in the Bio::Tools::Blast module?
> > > > I've got a script that creates and destroys several thousands of these
> > > > objects and it is definitely leaking. I don't want to spend time
> > > > debugging the thing if there's a known problem.
> > > >
> > > > Lincoln
> > > >
> > > > --
> > > > ========================================================================
> > > > Lincoln D. Stein Cold Spring Harbor Laboratory
> > > > lstein@cshl.org Cold Spring Harbor, NY
> > > > ========================================================================
> > > > =========== Bioperl Project Mailing List Message Footer =======
> > > > Project URL: http://bio.perl.org/
> > > > For info about how to (un)subscribe, where messages are archived, etc:
> > > > http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
> > > > ====================================================================
> > > >