[Bioperl-l] A better fix for Species.pm memory leak, please REVIEW.
Chris Fields
cjfields at illinois.edu
Tue Nov 25 16:06:57 UTC 2008
George (and anyone else paying attention),
I have attached a Bio::Species patch to bug 2594:
http://bugzilla.open-bio.org/show_bug.cgi?id=2594
I would like comments back on that patch as soon as anyone can test it
out. If there aren't any objections I'll probably commit it in a
week's time and close the bug out.
For the record, I don't see memory issues and the API is intact (still
a Bio::Taxon), but it's a fairly drastic change as Bio::Species is now
just a proxy. No idea on how this affects parsing speed either and it
hasn't been tested with Devel::Cycle, so benchmarks are welcome; I
would guess that it's slightly slower (extra object being generated),
something I can deal with as long as it doesn't suck memory away on
long parses.
Cheers!
chris
On Nov 24, 2008, at 10:59 PM, George Hartzell wrote:
> Chris Fields writes:
>>
>> On Nov 23, 2008, at 8:31 PM, George Hartzell wrote:
>>
>>> Chris Fields writes:
>>>> On Nov 23, 2008, at 6:19 AM, Sendu Bala wrote:
>>>>
>>>>> But anyway, thanks George, go ahead and commit if it all seems
>>>>> functional.
>>>>>
>>>>> If you could summarise your solution in the bug report(s?) as
>>>>> well,
>>>>> that would be great.
>>>>
>>>>
>>>> Agreed, though I think we need to ensure that memory leaks don't
>>>> permeate into Bio::Taxon when we make the switch (it shouldn't, I
>>>> think). BTW, checking Blame/Annotate it looks like some of this
>>>> stems
>>>> from r10974:
>>>>
>>>> http://code.open-bio.org/svnweb/index.cgi/bioperl/revision/?rev=10974
>>>>
>>>
>>> The change that I proposed wouldn't cause anything to permeate into
>>> Bio::Taxon.
>>>
>>> It's really a design problem with Bio::Species. A Species "has-a"
>>> reference to a tree *and* when the Species creates that tree it asks
>>> the tree constructor to initialize itself using the Species itself
>>> which results in the tree including a reference to that Species
>>> object. Et voila, circular reference.
>>
>> My thoughts as well.
>>
>>> It's not helped by the fact that Bio::Species explicitly disconnects
>>> all of the tree cleanup code that was supposed to handle those
>>> circular references. On the other hand there are a bunch of
>>> comments
>>> that suggest that the cleanup code never worked right.
>>
>> The cleanup code relies on the object being destroyed (either via
>> garbage collection or explicitly). Since there is a circular
>> dependency and DESTROY isn't explicitly called, the root cleanup
>> method won't work.
>>
>>> ...
>>> I'm not in front of that machine at the moment, but I think that
>>> there
>>> are a couple of other places in the src tree that use the -node or
>>> -root arg's to Bio::Tree::Tree::new and should probably be checked
>>> for
>>> leaks.
>>
>> A possible solution would be to make Bio::Species a shell or proxy
>> object that decorates a Bio::Taxon instance (and a Bio::Tree::Tree
>> where needed) and just delegates the appropriate methods to them. It
>> could still inherit from Bio::Taxon; it would just override the
>> Bio::Taxon methods for delegation. The Bio::Tree::Tree can refer to
>> the Bio::Taxon object (and vice versa), but neither refer back to the
>> Species object directly.
>>
>> In this case, when the last ref to the Bio::Species is released it
>> should be garbage collected and trigger DESTROY (which is set up in
>> Bio::Root::Root). This in turn triggers any cleanup methods, such as
>> calling DESTROY on the Tree and Taxon and get rid of memory leaks
>> (and
>> wouldn't require weaken/is_weak).
>> [...]
>
> How will this interact with the places that Bio::Species unhooks the
> various cleanup mechanisms that Tree sets up. There are comments that
> talk about the sub ref screwing with someone's use of Storable. It
> seems like if the new Species has cleanup code registered then there's
> a problem with whom-ever had a problem with it before.
>
>> I'm perfectly willing to test that out if no one else is (I am
>> working
>> on a few other bugs at the moment but I can probably get to it in the
>> next day or two). Not to tread on any toes, but I would like to get
>> this one taken care of prior to 1.6 and these are fairly nasty memory
>> issues. If there is any consolation it shouldn't be too hard to re-
>> rig the methods to delegate appropriately.
>>
>>> I don't really feel hot or bothered about this fix vs. the
>>> combination
>>> of weaken and stashing the subspecies. It depends on what the
>>> author
>>> intended for Species and its tree, neither are particularly sexy and
>>> if Species is really on its way out the door it may not matter.....
>>>
>>> g.
>>
>> I don't think it will matter post-1.6, but we'll have to support
>> Bio::Species for the 1.6 release series. So it will have to be
>> maintained for the short term, until 1.7 comes out.
>
> I can't poke at this stuff much until the weekend. If you can kick
> it, go for it!
>
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Marie-Claude Hofmann
College of Veterinary Medicine
University of Illinois Urbana-Champaign
More information about the Bioperl-l
mailing list