[Bioperl-l] Performance of Bio::Species
Sendu Bala
bix at sendu.me.uk
Tue Nov 21 20:16:18 UTC 2006
Hilmar Lapp wrote:
>
> On Nov 21, 2006, at 2:37 PM, Sendu Bala wrote:
>
>> Anyway, for the memory leak I have some ideas I haven't tried yet; I
>> don't know if my efforts will solve the speed issue though.
>
> The memory leak sounds more concerning to me. Under which circumstances
> would it crash a script or blow through all of, say, 1-2GB when it should
> have taken only a tenth of that?
It's been reported as causing problems if you do something like parse a
large EMBL file with many (tens of thousands of) sequences in it. So
basically any situation in which you create lots of Bio::Species objects.
IIRC the reporter ran out of memory on a ~40000 sequence EMBL file.

Neither the memory leak fix nor the speed fix ought to require any API
change. I'm fairly certain that the memory leak, at least, is confined
to a problem with (as suggested) Bio::Tree* stuff failing to clean up on
destruction.

There was in fact already an unnoticed problem with Bio::Tree::Node not
getting cleaned up (see my #*** comment in the code), but my
Bio::Species-related changes exacerbated the problem and also made it
noticeable, since you're more likely to create thousands of Bio::Species
objects than thousands of Bio::Tree::Node objects.
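
For illustration only, here is a minimal sketch of one common way tree-like objects can fail to be cleaned up in Perl. It assumes the problem is a parent/child reference cycle, which the message above does not confirm. DemoNode and leaked_after are hypothetical names made up for this example; they are not part of Bio::Tree::Node or Bio::Species. The only library call used is weaken from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util ();

# Hypothetical stand-in for a tree node; NOT the real Bio::Tree::Node.
package DemoNode;

my $live = 0;    # number of DemoNode objects created but not yet destroyed

sub new {
    my ($class) = @_;
    $live++;
    return bless { children => [], parent => undef }, $class;
}

sub add_child {
    my ($self, $child, %opt) = @_;
    push @{ $self->{children} }, $child;
    $child->{parent} = $self;
    # Assumption for this sketch: the back-reference to the parent is what
    # keeps the pair alive. Weakening it lets reference counting free both.
    Scalar::Util::weaken($child->{parent}) if $opt{weak};
    return $child;
}

sub DESTROY { $live-- }

sub live_count { return $live }

package main;

# Build $n small parent/child trees, let each go out of scope, and report
# how many nodes are still alive afterwards.
sub leaked_after {
    my ($n, $weak) = @_;
    my $before = DemoNode::live_count();
    for (1 .. $n) {
        my $root = DemoNode->new;
        $root->add_child(DemoNode->new, weak => $weak);
    }    # $root goes out of scope at the end of each iteration
    return DemoNode::live_count() - $before;
}

print "strong parent refs: ", leaked_after(1000, 0), " nodes still alive\n";
print "weak parent refs:   ", leaked_after(1000, 1), " nodes still alive\n";
```

With strong references in both directions, each parent and child keep each other's reference count above zero, so none of the nodes are freed when the loop variable goes out of scope. Creating tens of thousands of such objects, as when parsing a large file, would then grow memory steadily. Whether this is what actually happens inside Bio::Tree* is not established here.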