[Bioperl-l] Hidden Markov Model in Bioperl?

Yee Man Chan ymc at paxil.stanford.edu
Mon Mar 28 12:53:03 EST 2005



On Sun, 27 Mar 2005, Hilmar Lapp wrote:

> Sounds like a cool thing to have in bioperl.
> 
> Just one minor comment on naming: in perl/bioperl we typically 
> DontUseCapitalization to delineate words (as in Java) but use 
> underscores instead. 

That's fine with me. I can use underscores.

Regards,
Yee Man

> Otherwise to my knowledge you're breaking new ground here 
> so there is no consistency check with the rest of bioperl to be passed, 
> unless I'm missing something.
> 
> 	-hilmar
> 
> On Friday, March 25, 2005, at 03:49  PM, Yee Man Chan wrote:
> 
> >
> > Hi all
> >
> > 	I just wrote a C module to do Hidden Markov Model (HMM) related
> > calculations. As far as I can find, there is no HMM implementation
> > anywhere in Bioperl (there are parsers for HMMER output, however).
> > Would it be a good idea for me to add this module to Bioperl?
> >
> > 	I am thinking of an interface like this:
> >
> > Bio::Tools::HMM->new("symbols", "states")
> > - instantiate an HMM object with a string of symbols (each character
> > corresponds to one symbol) and a string of states. The other parameters
> > of the model are generated randomly. Good for starting a Baum-Welch
> > training.
> >
> > Bio::Tools::HMM->new("symbols", "states", array of initial state
> > probabilities, matrix of state transition probabilities, matrix of
> > emission probabilities)
> > - similar to the one before, but now we explicitly assign the HMM
> > parameters.
> >
> > Bio::Tools::HMM->ObsSeqProb("string of observed sequence")
> > - return the probability of an observed sequence.
> >
> > Bio::Tools::HMM->Viterbi("string of observed sequence")
> > - return the string of hidden states that is most probable given the
> > observed sequence.
> >
> > Bio::Tools::HMM->BaumWelchTraining(array of observed sequences)
> > - use an array of observed sequences to find the HMM parameters that
> > locally maximize the probability of these observed sequences. Optional
> > parameters can be passed to change the tolerance and the maximum number
> > of iterations.
> >
> > Bio::Tools::HMM->StatisticalTraining(array of observed sequences,
> > array of hidden state sequences)
> > - when the hidden state sequences are also known, use them to determine
> > the parameters of the HMM using a statistical method.
> >
> > Bio::Tools::HMM->getInitArray()
> > - return the array of initial state probabilities as an @array
> >
> > Bio::Tools::HMM->getStateMatrix()
> > - return the matrix of state transition probabilities as MatrixI
> >
> > Bio::Tools::HMM->getEmissionMatrix()
> > - return the matrix of emission probabilities as MatrixI
> >
> > 	This should cover most HMM applications. What do you think? Do
> > you have other functions in mind?
> >
> > 	I already contributed Bio::Tools::dpAlign before, so I am not a
> > newbie. If someone thinks it is a good idea to have this in Bioperl, I
> > can work on it as soon as possible.
> >
> > Best Regards,
> > Yee Man
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> -- 
> -------------------------------------------------------------
> Hilmar Lapp                            email: lapp at gnf.org
> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> -------------------------------------------------------------
> 
> 
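
For readers unfamiliar with the two query operations in the proposed
interface above (the probability of an observed sequence, and the Viterbi
decoding), here is a minimal sketch of what they compute. It is written in
Python purely for illustration and is not the C module or the
Bio::Tools::HMM interface discussed in this thread: the class name
SimpleHMM, the underscore-style method names, and the fair/loaded coin
parameters are all assumptions made up for the example. It follows the
proposal in using one character per symbol and per state.

```python
class SimpleHMM:
    """Toy discrete HMM (hypothetical; not the Bio::Tools::HMM API)."""

    def __init__(self, symbols, states, init, trans, emit):
        self.symbols = symbols  # e.g. "HT": one character per symbol
        self.states = states    # e.g. "FL": one character per state
        self.init = init        # init[i]: P(first state is states[i])
        self.trans = trans      # trans[i][j]: P(states[i] -> states[j])
        self.emit = emit        # emit[i][k]: P(states[i] emits symbols[k])

    def _encode(self, obs):
        # Map the observed string to symbol indices.
        if not obs:
            raise ValueError("observed sequence must not be empty")
        return [self.symbols.index(c) for c in obs]

    def obs_seq_prob(self, obs):
        """Forward algorithm: P(obs), summed over all hidden paths."""
        idx = self._encode(obs)
        n = len(self.states)
        alpha = [self.init[i] * self.emit[i][idx[0]] for i in range(n)]
        for k in idx[1:]:
            alpha = [
                sum(alpha[i] * self.trans[i][j] for i in range(n))
                * self.emit[j][k]
                for j in range(n)
            ]
        return sum(alpha)

    def viterbi(self, obs):
        """Most probable hidden state string for obs."""
        idx = self._encode(obs)
        n = len(self.states)
        delta = [self.init[i] * self.emit[i][idx[0]] for i in range(n)]
        back = []  # back[t][j]: best predecessor of state j at step t+1
        for k in idx[1:]:
            ptr, new_delta = [], []
            for j in range(n):
                best = max(range(n), key=lambda i: delta[i] * self.trans[i][j])
                ptr.append(best)
                new_delta.append(delta[best] * self.trans[best][j]
                                 * self.emit[j][k])
            back.append(ptr)
            delta = new_delta
        # Trace back from the best final state.
        path = [max(range(n), key=lambda i: delta[i])]
        for ptr in reversed(back):
            path.append(ptr[path[-1]])
        path.reverse()
        return "".join(self.states[i] for i in path)


# Made-up example: a fair coin (F) and a coin loaded towards heads (L).
hmm = SimpleHMM(
    symbols="HT",
    states="FL",
    init=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.1, 0.9]],
    emit=[[0.5, 0.5], [0.9, 0.1]],
)
print(hmm.obs_seq_prob("HHTH"))
print(hmm.viterbi("HHHHTTTT"))
```

The sketch multiplies raw probabilities, which underflows on long
sequences; a practical implementation would normally work in log space or
rescale at each step.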


