[Biojava-l] position weight matrix
Matthew Pocock
matthew_pocock at yahoo.co.uk
Fri Sep 19 07:28:34 EDT 2003
Hi Brian,
If you get an up-to-date copy of biojava from cvs (now) or the nightly
builds (tomorrow) then you will find that WeightMatrixAnnotator now
allows you to specify a ScoreType. The one you want is ScoreType.ODDS.
You will need a different threshold than you are used to - effectively
it's the log odds at which you accept a weight matrix match.
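
For what a log-odds score means here, a rough sketch in plain Python may help. This is not the BioJava API: the matrix values, the uniform null model and the function name log_odds are all made up for illustration, and natural logs are assumed.

```python
import math

# Hypothetical 3-column weight matrix: one probability distribution per column.
WEIGHT_MATRIX = [
    {"a": 0.7, "c": 0.1, "g": 0.1, "t": 0.1},
    {"a": 0.1, "c": 0.1, "g": 0.7, "t": 0.1},
    {"a": 0.1, "c": 0.7, "g": 0.1, "t": 0.1},
]

# Assumed null (background) model: every nucleotide equally likely.
NULL_MODEL = {"a": 0.25, "c": 0.25, "g": 0.25, "t": 0.25}


def log_odds(window, matrix=WEIGHT_MATRIX, null=NULL_MODEL):
    """Sum over columns of log( p(symbol | column) / p(symbol | null) )."""
    if len(window) != len(matrix):
        raise ValueError("window must be as long as the weight matrix")
    return sum(math.log(col[sym] / null[sym]) for col, sym in zip(matrix, window))
```

A window that looks like the motif ("agc" in this toy matrix) scores above zero, and one that looks like background scores below zero. The threshold decides how far above zero a window has to be before you call it a match.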
Let's get Bayesian: s is your sequence, m is your weight matrix.
p(m|s)p(s) = p(s|m)p(m)
becomes...
p(m|s) = p(s|m) p(m) / p(s)
i.e. the probability of your weight matrix binding to a particular
position is the score of the weight matrix at that position multiplied
by the ratio of how much you believe in the weight matrix to how much
you believe in the sequence. Of course, in this context, that ratio is
fairly meaningless - just what is p(m) anyway?
We could re-write p(s) in terms of it being produced by your weight
matrix or the null model, at which point this nuisance term becomes the
threshold at which you should accept a match. So, if you think there
should only be one site every 1000 nt, then p(m) / p(s) can be set to
1/1000, take the log of this, and that's your threshold.
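
As a back-of-the-envelope sketch of that arithmetic (plain Python, not BioJava; the function names are hypothetical and natural logs are assumed): the sign convention is my assumption. I take "accept" to mean that the log-odds score plus log(1/1000) comes out above zero, which is the same as asking for a score above log(1000), roughly 6.9.

```python
import math


def log_prior_ratio(sites_per_nt):
    """Log of the assumed p(m) / p(s), e.g. one site per 1000 nt -> log(1/1000)."""
    if not 0 < sites_per_nt <= 1:
        raise ValueError("sites_per_nt must be in (0, 1]")
    return math.log(sites_per_nt)


def accept(log_odds_score, sites_per_nt):
    """Assumed decision rule: accept when log odds + log prior ratio > 0."""
    return log_odds_score + log_prior_ratio(sites_per_nt) > 0


# One site expected every 1000 nt: the score has to beat log(1000).
threshold = -log_prior_ratio(1 / 1000)
```

With this convention a window scoring 7.0 would be accepted and one scoring 6.5 would not. A rarer expected site (say one per 10000 nt) pushes the bar higher.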
That's a noddy explanation, but it's sort of how bioinformatics does
these things, so hey ho.
Matthew