[Biojava-dev] [BioJava - Bug #3345] (New) Static object cache in SimpleRichObjectBuilder causing memory leak

Sergio Pulido spulido99 at gmail.com
Thu Apr 26 12:58:12 UTC 2012


If the cache is necessary, a simple LRU cache can be used instead of the
HashMap.

This implementation should be more than enough:
http://www.source-code.biz/snippets/java/6.htm
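
For reference, here is a minimal sketch of such a cache. It is built on
java.util.LinkedHashMap and is not copied from the snippet linked above. The
class name LruCache and the maxEntries parameter are made up for
illustration; they are not part of BioJava.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of BioJava: a size-bounded LRU map.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    // Assumed tuning knob: the maximum number of entries to keep.
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most
        // recently accessed, which is what makes eviction "LRU".
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Invoked by put() after an insert; returning true drops the
        // least recently used entry.
        return size() > maxEntries;
    }
}
```

Like HashMap, this class is not synchronized, so it would still need
external locking if shared between threads.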

I see another possible problem there: a ConcurrentModificationException if
more than one thread tries to build a "rich object" at the same time.
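
A minimal sketch of one way to guard against that, assuming a single lock
around the lookup and the insert is acceptable. SynchronizedObjectCache and
getOrBuild are hypothetical names, not BioJava API, and the factory argument
(java.util.function.Function, so Java 8 or later) only stands in for the
reflective constructor call in the builder.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the builder's cache; names are illustrative only.
class SynchronizedObjectCache {
    private final Map<List<Object>, Object> contents = new HashMap<>();

    // One lock covers lookup and insert together, so two threads asking for
    // the same parameters get the same instance and the map is never
    // modified by two threads at once.
    synchronized Object getOrBuild(List<Object> params,
                                   Function<List<Object>, Object> factory) {
        Object o = contents.get(params);
        if (o == null) {
            o = factory.apply(params);
            contents.put(params, o);
        }
        return o;
    }

    synchronized int size() {
        return contents.size();
    }
}
```

The parameter list is used as the map key, so callers should not modify it
after passing it in.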

Sergio.

On Thu, Apr 26, 2012 at 2:08 PM, <redmine at redmine.open-bio.org> wrote:

>
> Issue #3345 has been reported by Tjeerd Boerman.
>
> ----------------------------------------
> Bug #3345: Static object cache in SimpleRichObjectBuilder causing memory
> leak
> https://redmine.open-bio.org/issues/3345
>
> Author: Tjeerd Boerman
> Status: New
> Priority: Normal
> Assignee: biojava-dev list
> Category: bio
> Target version: BioJava 1.8 - legacy
> URL:
>
>
> I encountered a memory problem when parsing many GenBank files with
> BioJava 1.8.1. The parsed files were protein GPFF (GenPept Flat File
> format) files from the latest RefSeq release. The application tried to
> parse millions of protein sequences from these files, but an
> OutOfMemoryError would always occur after some time. The used heap
> space would gradually increase from a couple hundred megabytes to over 1.5
> GB, until the heap could grow no further. Upon inspection I discovered that
> a HashMap in SimpleRichObjectBuilder was the culprit:
>
> <pre>
> public class SimpleRichObjectBuilder implements RichObjectBuilder {
>
>    private static Map objects = new HashMap();
>
>    public Object buildObject(Class clazz, List paramsList) {
>        ...
>
>        // return the constructed object from the hashmap if it is already there
>        if (contents.containsKey(ourParamsList)) return contents.get(ourParamsList);
>
>        ...
>
>        // Instantiate it with the parameters
>        Object o = c.newInstance(ourParamsList.toArray());
>
>        // store it for later in the singleton map
>        contents.put(ourParamsList, o);
>
>        ...
>    }
> }
> </pre>
>
> It seems the *objects* Map in SimpleRichObjectBuilder is used as a
> static cache for objects, but when many different objects are created this
> cache grows without bound. I am unsure whether this is a 'true' bug, but for
> my application it was a definite problem. My fix was simply to comment out
> the *contents.put()* statement, but I'm sure there is a better way to
> resolve this, perhaps by making the use of the cache optional through a
> configuration option.
>
>
