[Biojava-dev] Comments about OrderNDistributions

Schreiber, Mark mark.schreiber at agresearch.co.nz
Tue Mar 4 14:09:30 EST 2003


Hi,

To model a joint probability, you just need to create a standard
Distribution over the (DNA x DNA x DNA) alphabet (if you want codons).
The OrderNDistribution was created specifically to handle conditional
probabilities (and probably should have a different name).
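
As a rough illustration of the joint versus conditional distinction, here
is a minimal sketch in plain Java. It deliberately does not use the BioJava
API: the class TripletDistributionSketch and its methods are hypothetical
names made up for this example, and estimating probabilities from
overlapping triplet counts is an assumption, not necessarily how a trained
Distribution behaves.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch; hypothetical names, not part of the BioJava API.
final class TripletDistributionSketch {

    // Joint distribution P(xyz): relative frequency of each overlapping
    // triplet in the sequence (assumed estimation scheme).
    static Map<String, Double> jointTripletProbabilities(String seq) {
        Map<String, Double> joint = new HashMap<>();
        int windows = seq.length() - 2;
        if (windows <= 0) {
            return joint; // too short to contain a triplet
        }
        for (int i = 0; i < windows; i++) {
            joint.merge(seq.substring(i, i + 3), 1.0 / windows, Double::sum);
        }
        return joint;
    }

    // Conditional P(z | xy) = P(xyz) / (sum over n of P(xyn)).
    static double conditional(Map<String, Double> joint, String triplet) {
        String prefix = triplet.substring(0, 2);
        double prefixMass = 0.0;
        for (Map.Entry<String, Double> e : joint.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                prefixMass += e.getValue();
            }
        }
        // An unseen prefix has no defined conditional; report 0 here.
        return prefixMass == 0.0 ? 0.0 : joint.getOrDefault(triplet, 0.0) / prefixMass;
    }

    public static void main(String[] args) {
        Map<String, Double> joint = jointTripletProbabilities("aacccggg");
        System.out.println("P(ccg)  = " + joint.getOrDefault("ccg", 0.0));
        System.out.println("P(g|cc) = " + conditional(joint, "ccg"));
    }
}
```

The conditional value is obtained from the joint table by a single
normalisation over the shared two-letter prefix, which is the easy
direction; going the other way needs the probability of the prefix
itself, which a purely conditional table does not store.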

- Mark


> -----Original Message-----
> From: Francois Pepin [mailto:fpepin at cs.mcgill.ca] 
> Sent: Tuesday, 4 March 2003 2:06 p.m.
> To: biojava-dev at biojava.org
> Subject: [Biojava-dev] Comments about OrderNDistributions
> 
> 
> After going through the code for the OrderNDistributions, 
> there are a couple of comments and questions that I would have:
> 
> Is there any reason why conditional probabilities are used 
> there instead of joint probabilities?
> 
> Right now, OrderNDistribution.getWeight(cgt) (or any codon) gives
> P(t|cg), while getting P(cgt) would be a lot more useful. It's 
> quite easy to go from the joint to the conditional 
> probabilities, while going in the opposite direction is 
> pretty troublesome.
> 
> To get P(cgt), one would need to compute P(t|cg) * (sum of 
> P(g|nc)) * (sum of P(c|nn)), where, for example, sum of 
> P(g|nc) = P(g|ac)+P(g|cc)+P(g|gc)+P(g|tc).
> 
> I don't really see why it isn't stored as joint probabilities, 
> so that one doesn't have to worry about the conditioning and 
> conditioned alphabets there.
> 
> Also, I'm not positive about this, but it seems that some 
> information would be lost (or at least quite difficult to 
> recover) about the first few characters of the distribution. 
> For example, with AACCCGGG, no A's would appear anywhere in 
> a 3rd order distribution (which would really be a 2nd order 
> Markov chain). There are two ways of going around it. One 
> would be to carry all of the distributions of lower order 
> to make sure that you have the data around, but that's a bit 
> cumbersome. The other would be to have 
> SymbolListViews.orderNSymbolList(AACCCGGG, 3) give out 
> NNAACCCGGG in this case, and have the OrderNDistributions 
> take that into account.
> 
> What do people think about this?
> 
> Francois Pepin
> 
> 