[Biojava-l] adding counts to Dists

Mark Schreiber mark_s@sanger.otago.ac.nz
Thu, 13 Dec 2001 22:33:52 +1300 (NZDT)


Hi -

When adding a large number of counts to a Distribution via a trainer, I
have found it is much quicker to store the counts in an array (indexed by
the AlphabetIndex for that alphabet), increment the counts as each symbol
comes in, and then add the counts to the trainer at the end (followed by
the .train() method).
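
A rough sketch of the pattern, in plain Java rather than against the real
BioJava classes. ToyTrainer and BatchedCounts are hypothetical names made up
for illustration; a String stands in for the alphabet and indexOf() stands in
for the AlphabetIndex lookup. It only shows the shape of the idea, not how
the BioJava trainer works internally.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a trainer; NOT the BioJava API. */
final class ToyTrainer {
    private final Map<Character, Double> counts = new HashMap<>();
    private final String alphabet;
    int calls = 0; // number of addCount() invocations, for comparison

    ToyTrainer(String alphabet) {
        this.alphabet = alphabet;
    }

    /** Validates the symbol on every call, then records the count. */
    void addCount(char symbol, double times) {
        if (alphabet.indexOf(symbol) < 0) {
            throw new IllegalArgumentException("Symbol not in alphabet: " + symbol);
        }
        calls++;
        counts.merge(symbol, times, Double::sum);
    }

    double count(char symbol) {
        return counts.getOrDefault(symbol, 0.0);
    }
}

/** Hypothetical helper: buffer counts in an array, hand them over once. */
final class BatchedCounts {
    static void addBatched(ToyTrainer trainer, String alphabet, CharSequence symbols) {
        // One slot per symbol, indexed by the symbol's position in the alphabet.
        double[] buffer = new double[alphabet.length()];
        for (int i = 0; i < symbols.length(); i++) {
            int idx = alphabet.indexOf(symbols.charAt(i)); // stand-in for the index lookup
            if (idx < 0) {
                throw new IllegalArgumentException("Symbol not in alphabet: " + symbols.charAt(i));
            }
            buffer[idx] += 1.0;
        }
        // One trainer call per distinct symbol instead of one per observation.
        for (int idx = 0; idx < buffer.length; idx++) {
            if (buffer[idx] > 0.0) {
                trainer.addCount(alphabet.charAt(idx), buffer[idx]);
            }
        }
    }
}
```

With this arrangement the trainer is called at most once per symbol in the
alphabet, however long the input is, so any per-call overhead in the trainer
is paid only a handful of times.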

I'm curious as to why this is. I assume it's because the trainer checks the
validity of each symbol, although technically so does the AlphabetIndex when
it looks up the index for the symbol.

Not that this is a major issue; it might just be a way to speed up
distribution training.

Mark

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Mark Schreiber			Ph: 64 3 4797875
Rm 218				email mark_s@sanger.otago.ac.nz
Department of Biochemistry
University of Otago		
PO Box 56
Dunedin
New Zealand
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~