[Biojava-l] position weight matrix

Matthew Pocock matthew_pocock at yahoo.co.uk
Fri Sep 19 07:28:34 EDT 2003


Hi Brian,

If you get an up-to-date copy of BioJava from CVS (now) or the nightly 
builds (tomorrow) then you will find that WeightMatrixAnnotator now 
allows you to specify a ScoreType. The one you want is ScoreType.ODDS. 
You will need a different threshold than you are used to - effectively 
it's the log odds at which you accept a weight matrix match.
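
To make "log odds" concrete, here is a minimal standalone sketch of what an odds-style score measures for one window of sequence. It deliberately does not use the BioJava API: the class name, the toy three-column matrix and the uniform 0.25 null model are all made up for illustration.

```java
public class LogOddsSketch {
    // Hypothetical toy weight matrix: one row per motif column,
    // probabilities in the order a, c, g, t. Not taken from any real motif.
    static final double[][] MATRIX = {
        {0.7, 0.1, 0.1, 0.1},
        {0.1, 0.1, 0.7, 0.1},
        {0.1, 0.7, 0.1, 0.1}
    };

    // Assumed null model: every base equally likely.
    static final double NULL_P = 0.25;

    static int index(char base) {
        switch (Character.toLowerCase(base)) {
            case 'a': return 0;
            case 'c': return 1;
            case 'g': return 2;
            case 't': return 3;
            default: throw new IllegalArgumentException("Not a DNA base: " + base);
        }
    }

    // Sum over columns of log( p(base | matrix column) / p(base | null) ).
    static double logOdds(String window) {
        if (window.length() != MATRIX.length) {
            throw new IllegalArgumentException("Window must have length " + MATRIX.length);
        }
        double score = 0.0;
        for (int i = 0; i < window.length(); i++) {
            score += Math.log(MATRIX[i][index(window.charAt(i))] / NULL_P);
        }
        return score;
    }

    public static void main(String[] args) {
        System.out.println("agc: " + logOdds("agc")); // consensus window, positive score
        System.out.println("ttt: " + logOdds("ttt")); // poor match, negative score
    }
}
```

A window the matrix likes better than the null model scores above zero, and one it likes less scores below zero, which is why the threshold lives on a different scale from a raw probability.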

Let's get Bayesian: s is your sequence, m is your weight matrix.

p(m|s)p(s) = p(s|m)p(m)

becomes...

p(m|s) = p(s|m) p(m) / p(s)

i.e. the probability of your weight matrix binding to a particular 
position is the score of the weight matrix at that position multiplied 
by the ratio of how much you believe in the weight matrix to how much 
you believe in the sequence. Of course, in this context, that ratio is 
fairly meaningless - just what is p(m) anyway?

We could re-write p(s) in terms of it being produced by your weight 
matrix or the null model, at which point this nuisance term becomes the 
threshold at which you should accept a match. So, if you think there 
should only be one site every 1000 nt, then p(m) / p(s) can be set to 
1/1000, take the log of this, and that's your threshold.
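
As a rough sketch of that arithmetic, the snippet below turns "one site every N nt" into the log of 1/N. The class and method names are hypothetical and nothing here calls BioJava; the natural log is assumed.

```java
public class ThresholdSketch {
    // Log of the assumed site frequency: one site every `spacing` nucleotides.
    static double logThreshold(double spacing) {
        if (spacing <= 0) {
            throw new IllegalArgumentException("Spacing must be positive");
        }
        return Math.log(1.0 / spacing);
    }

    public static void main(String[] args) {
        // One site every 1000 nt gives roughly -6.9 in natural-log units.
        System.out.println(logThreshold(1000));
    }
}
```

Whether the annotator wants this value as it stands, with the sign flipped, or as a plain odds ratio rather than a log depends on how it compares the score against the threshold, so check the WeightMatrixAnnotator documentation for your version before plugging the number in.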

That's a noddy explanation, but it's sort of how bioinformatics does 
these things, so hey ho.

Matthew



