[Biojava-l] Error initializing uniform protein distribution

Schreiber, Mark mark.schreiber at agresearch.co.nz
Mon Jul 14 10:25:47 EDT 2003


Hi -

This is because the PROTEIN alphabet contains the 20 standard amino acids
plus selenocysteine (SEC), 21 symbols in all, so a uniform distribution
assigns each symbol 1/21. Admittedly, this is a bit of a trap for people
new to BioJava (or new to the protein alphabet).
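
To make the arithmetic concrete, here is a minimal plain-Java sketch. It
does not use any BioJava classes: the class name UniformSketch and its
helper methods are hypothetical, and the symbol list is simply copied from
the output quoted below. It only illustrates that a uniform distribution
over 21 symbols gives 1/21 each, and that leaving SEC out gives 1/20.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; it does not call any BioJava API.
class UniformSketch {
    // The 21 symbol names that appear in the output quoted in this thread.
    static final List<String> PROTEIN_WITH_SEC = Arrays.asList(
        "PRO", "ARG", "CYS", "GLU", "VAL", "GLY", "GLN",
        "HIS", "ASN", "THR", "SEC", "LYS", "PHE", "LEU",
        "ASP", "SER", "TRP", "MET", "TYR", "ALA", "ILE");

    // Give every symbol the same weight: 1 / (number of symbols).
    static Map<String, Double> uniform(List<String> symbols) {
        Map<String, Double> dist = new LinkedHashMap<>();
        double p = 1.0 / symbols.size();
        for (String s : symbols) {
            dist.put(s, p);
        }
        return dist;
    }

    // Return a copy of the symbol list with one symbol left out.
    static List<String> without(List<String> symbols, String excluded) {
        List<String> kept = new ArrayList<>(symbols);
        kept.remove(excluded);
        return kept;
    }

    public static void main(String[] args) {
        // 21 symbols, so each weight is 1/21.
        System.out.println("with SEC:    " + uniform(PROTEIN_WITH_SEC).get("ALA"));
        // 20 symbols once SEC is dropped, so each weight is 1/20.
        List<String> standard = without(PROTEIN_WITH_SEC, "SEC");
        System.out.println("without SEC: " + uniform(standard).get("ALA"));
    }
}
```

In BioJava itself, the equivalent would presumably be to build a
20-symbol alphabet without SEC and create the distribution over that, but
I have not shown that here.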

It is generally not a good idea to use a uniform protein distribution
(other than for testing), since amino acid frequencies in real proteins
are far from uniform.

- Mark

> -----Original Message-----
> From: hlr02 at doc.ic.ac.uk [mailto:hlr02 at doc.ic.ac.uk] 
> Sent: Sunday, 13 July 2003 10:04 p.m.
> To: biojava-l at biojava.org
> Subject: [Biojava-l] Error initializing uniform protein distribution
> 
> 
> Hi,
> 
> If I create a uniform protein distribution and check the emission 
> probabilities, I find that they are set to 1/21 rather than 1/20...
> 
> Henry Romijn
> 
> 
> FiniteAlphabet emissionAlpha =
>       (FiniteAlphabet) AlphabetManager.alphabetForName("PROTEIN");
> 
> ProfileHMM profile = new ProfileHMM(
>         emissionAlpha,
>         length,
>         DistributionFactory.DEFAULT,
>         DistributionFactory.DEFAULT,
>         name
>       );
> 
> ModelTrainer mt = new SimpleModelTrainer();
> mt.registerModel(profile);
> mt.setNullModelWeight(1.0);
> 
> // output
> PRO 0.047619047619047596
> ARG 0.047619047619047596
> CYS 0.047619047619047596
> GLU 0.047619047619047596
> VAL 0.047619047619047596
> GLY 0.047619047619047596
> GLN 0.047619047619047596
> HIS 0.047619047619047596
> ASN 0.047619047619047596
> THR 0.047619047619047596
> SEC 0.047619047619047596
> LYS 0.047619047619047596
> PHE 0.047619047619047596
> LEU 0.047619047619047596
> ASP 0.047619047619047596
> SER 0.047619047619047596
> TRP 0.047619047619047596
> MET 0.047619047619047596
> TYR 0.047619047619047596
> ALA 0.047619047619047596
> ILE 0.047619047619047596
> 
> 