[Biojava-l] Java Resource Management [a semi troll...]

Tue Feb 11 13:52:54 EST 2003

>>>>> "Matthew" == Matthew Pocock <matthew_pocock at yahoo.co.uk> writes:

  >> In short, using all the memory you possibly can, but without
  >> actually crashing the JVM is fairly hard in Java. Or, if I am
  >> more honest, I never found a good way of doing it. Perhaps
  >> someone else has.  Cheers

  >> Phil


  Matthew> Internally, BioJava uses several mechanisms for caching
  Matthew> data. Like you, since hotspot, we've found that caches get
  Matthew> aggressively cleared, which defeats the point of them. The
  Matthew> reference objects are quite useful for canonicalizing
  Matthew> objects (e.g. fetching features from a DB if there is no
  Matthew> feature object in existence that represents it) and for
  Matthew> knowing that they are going away.

  Matthew> Caches and/or reference objects are used everywhere from DB
  Matthew> access to dynamic programming to the event system to large
  Matthew> alphabets. They nearly do what we need - if hotspot weren't
  Matthew> so aggressive with them, we would have pretty much all the
  Matthew> flexibility we could want for memory management.

I tend to agree. I think that the solution is probably to have another
reference class with more tightly guaranteed behaviour under GC. My
hypothetical class,
ModeratelySoftButSlightlyHardAtLeastUnderSomeCircumstancesReference,
would guarantee to be cleared only when an OutOfMemoryError was
imminent, rather than merely guaranteeing to be cleared before one is
thrown, which is all that SoftReference promises. How easy this would
be to implement, I do not know, but I suspect it would be quite hard.
Then you could build better caches.
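
Short of a new reference class, the nearest thing I know of is a
two-tier cache: hold the most recently used entries with ordinary
(hard) references and everything else with SoftReferences. The sketch
below is only an illustration of that idea, not anything BioJava
ships; the class name PartlyHardCache, the hardSize parameter and the
hardCount() helper are all made up for the example. Note that it does
nothing to change when the collector clears soft references; it only
ensures that the last few entries you touched cannot be cleared.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical two-tier cache: the most recently used entries are held
 * strongly, every entry is additionally held softly.
 */
class PartlyHardCache<K, V> {
    private final int hardSize;
    // Access-ordered map, so the eldest entry is the least recently used.
    private final LinkedHashMap<K, V> hard;
    private final Map<K, SoftReference<V>> soft = new HashMap<>();

    PartlyHardCache(int hardSize) {
        this.hardSize = hardSize;
        this.hard = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict from the hard tier only; the soft tier keeps its entry.
                return size() > PartlyHardCache.this.hardSize;
            }
        };
    }

    synchronized void put(K key, V value) {
        hard.put(key, value);
        soft.put(key, new SoftReference<>(value));
    }

    synchronized V get(K key) {
        V value = hard.get(key); // also marks the entry as recently used
        if (value != null) {
            return value;
        }
        SoftReference<V> ref = soft.get(key);
        if (ref == null) {
            return null;
        }
        value = ref.get();
        if (value == null) {
            soft.remove(key);     // the collector cleared it; drop the stale entry
        } else {
            hard.put(key, value); // promote back into the hard tier
        }
        return value;
    }

    /** Number of entries currently pinned by hard references. */
    synchronized int hardCount() {
        return hard.size();
    }
}
```

With new PartlyHardCache<String, Feature>(100), say (Feature standing
in for whatever you cache), the hundred most recently used features
survive any collection, and the rest are at the mercy of hotspot as
before. Picking the size of the hard tier is the obvious weak point,
since it is a guess about memory rather than a measurement of it.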

  Matthew> Much of the startup overhead could be solved by having the
  Matthew> vm sign that core classes have been verified in the past,
  Matthew> and prove that they haven't been altered since. If the
  Matthew> hotspot VMs used a similar trick to dump out 'snapshots'
  Matthew> of optimized code, we could avoid every java process under
  Matthew> the sun performing the same n optimizations on xerces
  Matthew> parsers, and perhaps shave off some of that biojava
  Matthew> load-time.

This might be so, although some of the optimisations are fragile to
later class loading. So if for instance you have a class with three
methods, and these are always chained like so...

a(){
  return b();
}

b(){
  return c();
}

c(){
  return blah;
}

Hotspot will inline all three. If another class is loaded later and
calls b() directly, it all goes pear shaped, as b() has been optimised
away. Hotspot, I believe, backs the optimisation out, and may later
apply it again.

However, some sort of dumping should be possible. Emacs does this, for
instance, by dumping out the actual executable, which is a lisp VM,
plus a whole load of lisp functions in memory. This speeds up startup
from around 10 minutes to a second or two. I suspect that there are
good reasons for not doing it though....

Cheers

Phil