[Biojava-dev] [BioJava - Bug #3345] (Closed) Static object cache in SimpleRichObjectBuilder causing memory leak

Tjeerd Boerman twboerman at gmail.com
Thu Apr 26 20:36:01 UTC 2012


Wow. I'm quite embarrassed, but the issue is actually not fixed in 1.8.2. 
The RichObjectFactory class uses an LRU cache, while the 
SimpleRichObjectBuilder class uses the cache described in my earlier 
post, and I mixed up the two. Sorry about that, I must have been staring 
at this code for too long today.

It turns out the RichObjectFactory class is the only user of 
SimpleRichObjectBuilder, and has its own LRU cache. So the cache in 
SimpleRichObjectBuilder can probably be removed altogether. If the 
experts agree I would be happy to write a patch for this.
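
For reference, the difference between the two caches comes down to whether 
the map is bounded. Below is a minimal sketch of an LRU cache built on 
java.util.LinkedHashMap. It is only an illustration and not the actual 
RichObjectFactory code; the class name LruCacheSketch and the capacity of 2 
are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class for illustration only; not part of BioJava.
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCacheSketch(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most
        // recently accessed, which is what makes this an LRU cache.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Invoked after each insertion; returning true evicts the least
        // recently used entry, so the map never exceeds maxEntries.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, Integer> cache = new LruCacheSketch<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");     // touch "a", so "b" is now the eldest entry
        cache.put("c", 3);  // exceeds the capacity and evicts "b"
        System.out.println(cache.keySet());
    }
}
```

A plain static HashMap has no such eviction step, so every distinct 
parameter list stays reachable for the lifetime of the JVM.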

Can the old bug report be reopened, or should I file a new bug report? 
Or should I submit a possible patch through this mailing list?

Regards,
Tjeerd

On 4/26/2012 6:01 PM, redmine at redmine.open-bio.org wrote:
> Issue #3345 has been updated by Andreas Prlic.
>
>   * Status changed from New to Closed
>   * % Done changed from 0 to 100
>
> already fixed in 1.8.2
>
> ------------------------------------------------------------------------
>
>
>   Bug #3345: Static object cache in SimpleRichObjectBuilder causing
>   memory leak <https://redmine.open-bio.org/issues/3345>
>
>   * Author: Tjeerd Boerman
>   * Status: Closed
>   * Priority: Normal
>   * Assignee: biojava-dev list
>   * Category: bio
>   * Target version: BioJava 1.8 - legacy
>   * URL:
>
> I encountered a memory problem when parsing many GenBank files with 
> BioJava 1.8.1. The parsed files were protein GPFF (GenPept Flat 
> File format) files from the latest RefSeq release. The application 
> tried to parse millions of protein sequences from these files, but an 
> OutOfMemoryError would always occur after some time. The used heap 
> space would gradually increase from a couple hundred megabytes to over 
> 1.5 GB, until the heap could grow no further. Upon inspection I 
> discovered a HashMap in SimpleRichObjectBuilder was the culprit:
>
> public class SimpleRichObjectBuilder implements RichObjectBuilder {
>
>      private static Map objects = new HashMap();
>
>      public Object buildObject(Class clazz, List paramsList) {
>          ...
>
>          // return the constructed object from the hashmap if there already
>          if (contents.containsKey(ourParamsList)) return contents.get(ourParamsList);
>
>          ...
>
>          // Instantiate it with the parameters
>          Object o = c.newInstance(ourParamsList.toArray());
>
>          // store it for later in the singleton map
>          contents.put(ourParamsList, o);
>
>          ...
>      }
> }
>
> It seems the *objects* Map in SimpleRichObjectBuilder is used as a 
> static cache for objects, but when many different objects are created 
> this cache grows out of control. I am unsure if this is a 'true' bug, 
> but for my application it was a definite problem. My fix was to simply 
> comment out the *contents.put()* statement, but I'm sure there is a 
> better way to resolve this - perhaps by making the use of the cache 
> optional through a configuration option.
>
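
The configuration option suggested at the end of the report could look 
roughly like the sketch below. Everything in it is hypothetical: the class 
OptionalCacheBuilderSketch, its build() method and the cachingEnabled flag 
are invented for illustration and do not exist in BioJava. The cache is also 
an instance field rather than a static one, which is an assumption on my 
part, so that it can be garbage collected together with the builder.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a builder with a switchable cache; not BioJava code.
class OptionalCacheBuilderSketch {
    private final boolean cachingEnabled;
    // Instance-level map: it becomes collectable when the builder does.
    private final Map<List<?>, Object> cache = new HashMap<>();

    OptionalCacheBuilderSketch(boolean cachingEnabled) {
        this.cachingEnabled = cachingEnabled;
    }

    // 'constructor' stands in for the reflective instantiation step.
    Object build(List<?> params, Function<List<?>, Object> constructor) {
        if (!cachingEnabled) {
            // Caching is off: always construct, never retain a reference.
            return constructor.apply(params);
        }
        // Copy the key so later changes to the caller's list cannot corrupt the map.
        List<?> key = new ArrayList<Object>(params);
        return cache.computeIfAbsent(key, constructor);
    }

    int cachedCount() {
        return cache.size();
    }
}
```

With caching disabled nothing is retained, which matches the effect of 
commenting out the put() call; with caching enabled, identical parameter 
lists return the same instance.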


