[Biojava-l] unsupervised training of transition weights

Fri Mar 31 10:58:38 UTC 2006

On 30 Mar 2006, at 16:41, wendy wong wrote:

> Hi,
>
> I am trying to train my HMM using unsupervised training (I don't need
> to train the emission probabilities). I was wondering how I can do so
> in biojava. do I have to implement the TransitionTrainer interface?

The easiest way to do this is to use UntrainableDistributions for all  
the transition-sets that you don't want to be trained:

         http://www.biojava.org/docs/api14/org/biojava/bio/dist/UntrainableDistribution.html

If UntrainableDistribution doesn't fit your requirements, the  
alternative is to create your own Distribution implementation with a  
registerTrainer method that creates a "dummy" (i.e. doesn't do  
anything) DistributionTrainer.  UntrainableDistribution is just a  
subclass of SimpleDistribution which replaces the registerTrainer  
method with a non-functional version.
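
To make the "dummy trainer" idea concrete, here is a minimal,
self-contained sketch. It does not use the BioJava classes: Trainer,
NoOpTrainer, CountingTrainer, TrainableDist and FixedDist are
hypothetical stand-ins, and registerTrainer here takes no arguments,
which is a simplification of the real method. It only illustrates the
pattern of a subclass that hands the training machinery a trainer
which ignores every count.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DummyTrainerSketch {

    /** Hypothetical stand-in for a trainer; not the BioJava interface. */
    interface Trainer {
        void addCount(String symbol, double count);
        void train(Map<String, Double> weights);
    }

    /** Ordinary trainer: replaces the weights with normalized counts. */
    static final class CountingTrainer implements Trainer {
        private final Map<String, Double> counts = new LinkedHashMap<>();

        public void addCount(String symbol, double count) {
            counts.merge(symbol, count, Double::sum);
        }

        public void train(Map<String, Double> weights) {
            final double total =
                counts.values().stream().mapToDouble(Double::doubleValue).sum();
            if (total <= 0.0) {
                return; // nothing observed, keep the old weights
            }
            weights.replaceAll((s, w) -> counts.getOrDefault(s, 0.0) / total);
        }
    }

    /** "Dummy" trainer: accepts counts but never changes any weight. */
    static final class NoOpTrainer implements Trainer {
        public void addCount(String symbol, double count) { /* ignored */ }
        public void train(Map<String, Double> weights) { /* ignored */ }
    }

    /** Simplified distribution whose weights can be re-estimated. */
    static class TrainableDist {
        protected final Map<String, Double> weights = new LinkedHashMap<>();

        TrainableDist(Map<String, Double> initial) {
            weights.putAll(initial);
        }

        double getWeight(String symbol) {
            return weights.getOrDefault(symbol, 0.0);
        }

        Trainer registerTrainer() {
            return new CountingTrainer();
        }
    }

    /** Same distribution, but registerTrainer returns the dummy. */
    static final class FixedDist extends TrainableDist {
        FixedDist(Map<String, Double> initial) {
            super(initial);
        }

        @Override
        Trainer registerTrainer() {
            return new NoOpTrainer();
        }
    }

    /** Runs one training round with counts a=1, b=3; returns the weight of "a". */
    static double weightAfterTraining(TrainableDist dist) {
        Trainer trainer = dist.registerTrainer();
        trainer.addCount("a", 1.0);
        trainer.addCount("b", 3.0);
        trainer.train(dist.weights);
        return dist.getWeight("a");
    }

    public static void main(String[] args) {
        Map<String, Double> initial = new LinkedHashMap<>();
        initial.put("a", 0.9);
        initial.put("b", 0.1);
        System.out.println("trainable: " + weightAfterTraining(new TrainableDist(initial)));
        System.out.println("fixed:     " + weightAfterTraining(new FixedDist(initial)));
    }
}
```

The training loop treats both kinds of distribution identically; only
the object returned by registerTrainer differs, so no special case is
needed in the code that drives the training.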

> my second question is:
> I implemented getWeightImpl in my custom distribution to set up my
> emission states and it works fine. but is it possible to get the
> program to access it only when there's certain symbol in the observed
> sequence, (instead of precalculated)? and I also found that (although
> I might be wrong) the weights are calculated twice, once was when the
> distribution was created, and then when I call viterbi it calls
> getWeightImpl again. I am not sure what I did wrong here :(

The DP code does some caching of probabilities, and I don't think
there's any way to turn this off without modifying the DP
implementations.

           Thomas.