[Bioperl-l] Hidden Markov Model in Bioperl?
Yee Man Chan
ymc at paxil.stanford.edu
Mon Mar 28 12:53:03 EST 2005
On Sun, 27 Mar 2005, Hilmar Lapp wrote:
> Sounds like a cool thing to have in bioperl.
>
> Just one minor comment on naming: in perl/bioperl we typically
> DontUseCapitalization to delineate words (as in Java) but use
> underscores instead.
That's fine with me. I can use underscores.
Regards,
Yee Man
> Otherwise to my knowledge you're breaking new ground here
> so there is no consistency check with the rest of bioperl to be passed,
> unless I'm missing something.
>
> -hilmar
>
> On Friday, March 25, 2005, at 03:49 PM, Yee Man Chan wrote:
>
> >
> > Hi all
> >
> > I just wrote a C module to do Hidden Markov Model (HMM) related
> > calculations. As far as I can tell there is no HMM implementation
> > anywhere in Bioperl (there are parsers for HMMER output, however).
> > Would it be a good idea for me to add this module to Bioperl?
> >
> > I am thinking of an interface like this:
> >
> > Bio::Tools::HMM->new("symbols", "states")
> > - instantiate an HMM object with a string of symbols (each character
> > corresponds to one symbol) and a string of states. The other
> > parameters of the model are generated randomly. Good for starting a
> > Baum-Welch training.
> >
> > Bio::Tools::HMM->new("symbols", "states", array of initial state
> > probabilities, matrix of state transition probabilities, matrix of
> > emission probabilities)
> > - similar to the one before, but now we explicitly assign the HMM
> > parameters.
> >
> > Bio::Tools::HMM->ObsSeqProb("string of observed sequence")
> > - return the probability of an observed sequence.
> >
> > Bio::Tools::HMM->Viterbi("string of observed sequence")
> > - return, as a string, the sequence of hidden states that most
> > probably produced the observed sequence.
> >
> > Bio::Tools::HMM->BaumWelchTraining(array of observed sequences)
> > - uses an array of observed sequences to find the HMM parameters that
> > locally maximize the probability of these observed sequences.
> > Optional parameters can be passed to change the tolerance and the
> > maximum number of iterations.
> >
> > Bio::Tools::HMM->StatisticalTraining(array of observed sequences,
> > array of hidden state sequences)
> > - when the hidden state sequences are also known, use them to
> > determine the parameters of the HMM using a statistical method.
> >
> > Bio::Tools::HMM->getInitArray()
> > - return the array of initial state probabilities as an @array
> >
> > Bio::Tools::HMM->getStateMatrix()
> > - return the matrix of state transition probabilities as MatrixI
> >
> > Bio::Tools::HMM->getEmissionMatrix()
> > - return the matrix of emission probabilities as MatrixI
> >
> > This should cover most HMM applications. What do you think? Do
> > you have other functions in mind?
> >
> > I already contributed Bio::Tools::dpAlign before, so I am not a
> > newbie. If someone thinks it is a good idea to have this in Bioperl,
> > I can work on it as soon as possible.
> >
> > Best Regards,
> > Yee Man
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> --
> -------------------------------------------------------------
> Hilmar Lapp email: lapp at gnf.org
> GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
> -------------------------------------------------------------
>
>
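For anyone reading the proposal quoted above, here is a rough sketch of
what the ObsSeqProb and Viterbi calls would compute. It is not the C
module or its Perl interface. It is a toy in Python, and the function
names (obs_seq_prob, viterbi), the two-state "fair/biased coin" model
and every probability in it are assumptions made up for illustration.
States and symbols are single characters, as in the proposal.

```python
# Toy HMM, hypothetical numbers: states F (fair) and B (biased),
# symbols H (heads) and T (tails). Not the Bio::Tools::HMM API.
SYMBOLS = "HT"
STATES = "FB"
INIT = [0.5, 0.5]                     # initial state probabilities
TRANS = [[0.9, 0.1], [0.1, 0.9]]      # TRANS[i][j] = P(state j | state i)
EMIT = [[0.5, 0.5], [0.9, 0.1]]       # EMIT[i][k] = P(symbol k | state i)


def obs_seq_prob(obs, symbols, init, trans, emit):
    """Forward algorithm: P(obs), summed over all hidden state paths."""
    if not obs:
        return 1.0
    sym_idx = {c: k for k, c in enumerate(symbols)}
    n = len(init)
    o = sym_idx[obs[0]]
    alpha = [init[i] * emit[i][o] for i in range(n)]
    for ch in obs[1:]:
        o = sym_idx[ch]
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
            for j in range(n)
        ]
    return sum(alpha)


def viterbi(obs, symbols, states, init, trans, emit):
    """Viterbi decoding: most probable hidden state path, as a string."""
    if not obs:
        return ""
    sym_idx = {c: k for k, c in enumerate(symbols)}
    n = len(states)
    o = sym_idx[obs[0]]
    delta = [init[i] * emit[i][o] for i in range(n)]
    back = []                         # back[t][j] = best predecessor of j
    for ch in obs[1:]:
        o = sym_idx[ch]
        ptr, new_delta = [], []
        for j in range(n):
            best = max(range(n), key=lambda i: delta[i] * trans[i][j])
            ptr.append(best)
            new_delta.append(delta[best] * trans[best][j] * emit[j][o])
        delta = new_delta
        back.append(ptr)
    # Trace back from the best final state.
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for ptr in reversed(back):
        last = ptr[last]
        path.append(last)
    path.reverse()
    return "".join(states[i] for i in path)


if __name__ == "__main__":
    seq = "HHHHTTHT"
    print(obs_seq_prob(seq, SYMBOLS, INIT, TRANS, EMIT))
    print(viterbi(seq, SYMBOLS, STATES, INIT, TRANS, EMIT))
```

The sketch multiplies raw probabilities, which underflows on long
sequences. A real implementation would normally work in log space or
rescale at each step.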