[Biojava-l] position weight matrix
Matthew Pocock
matthew_pocock at yahoo.co.uk
Thu Sep 18 05:14:36 EDT 2003
Schreiber, Mark wrote:
>Brian,
>
>I think we should change WeightMatrix so that N only scores a quarter match of A, whereas R would score a half match.
>
>Do others feel this is sensible?
>
>- Mark
>
>
We should be taking the odds score between the weight matrix matching at
that position and the null model; that way, ambiguity symbols divide out.
I'm tied up today (writing lectures - damn those students), but it
should be easy to modify WeightMatrixAnnotator to accept a ScoreType
instance.
For reasons that we can argue about on -dev or off-line, you can't in
the general case just divide scores for columns containing an N by 4, or
apportion them relative to the null model. That's why the ScoreType
objects were added to the DP package.
Matthew
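
To make the "ambiguity symbols divide out" point concrete, here is a
minimal sketch in plain Java. It does not use the BioJava API: the class
OddsScoreSketch and its methods (probability, columnOdds, logOdds) are
hypothetical names made up for illustration. It assumes that the
probability of an ambiguity symbol is the sum of the probabilities of
the atomic symbols it matches, both in the weight matrix column and in
the null model, which may differ from what BioJava does internally.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of BioJava.
final class OddsScoreSketch {
    // Atomic symbols matched by each IUPAC code used in this sketch.
    static final Map<Character, String> MATCHES = Map.of(
        'A', "A", 'C', "C", 'G', "G", 'T', "T",
        'R', "AG", 'Y', "CT", 'N', "ACGT");

    // Assumption: an ambiguity symbol's probability is the sum over
    // the atomic symbols it matches.
    static double probability(Map<Character, Double> dist, char symbol) {
        String atoms = MATCHES.get(symbol);
        if (atoms == null) {
            throw new IllegalArgumentException("Unknown symbol: " + symbol);
        }
        double sum = 0.0;
        for (char atom : atoms.toCharArray()) {
            sum += dist.get(atom);
        }
        return sum;
    }

    // Odds of one column emitting the symbol versus the null model.
    static double columnOdds(Map<Character, Double> column,
                             Map<Character, Double> nullModel,
                             char symbol) {
        return probability(column, symbol) / probability(nullModel, symbol);
    }

    // Log-odds of a whole window: the sum of per-column log-odds.
    static double logOdds(List<Map<Character, Double>> matrix,
                          Map<Character, Double> nullModel,
                          String window) {
        if (window.length() != matrix.size()) {
            throw new IllegalArgumentException("Window/matrix length mismatch");
        }
        double total = 0.0;
        for (int i = 0; i < window.length(); i++) {
            total += Math.log(columnOdds(matrix.get(i), nullModel, window.charAt(i)));
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Character, Double> column = Map.of('A', 0.7, 'C', 0.1, 'G', 0.1, 'T', 0.1);
        Map<Character, Double> nullModel = Map.of('A', 0.25, 'C', 0.25, 'G', 0.25, 'T', 0.25);
        for (char symbol : new char[] {'A', 'R', 'N'}) {
            System.out.println(symbol + " odds = " + columnOdds(column, nullModel, symbol));
        }
    }
}
```

Under these assumptions an N has probability 1 in both the column and
the null model, so its odds are 1 (up to floating-point rounding) and
its log-odds contribution is 0. A window full of Ns then scores no
better than background instead of matching everything, and any
threshold above background odds filters it out.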
> -----Original Message-----
> From: Brian Cox [mailto:cox at mshri.on.ca]
> Sent: Thu 18/09/2003 2:35 p.m.
> To: Schreiber, Mark
> Cc:
> Subject: RE: [Biojava-l] position weight matrix
>
>
>
> Thanks,
> That sounds like what I was looking for. I wanted to penalize the use of an
> N in the sequence. Not sure yet how to implement this but I'll give it a
> shot.
> Thanks for the reply,
> Brian
>
> -----Original Message-----
> From: Schreiber, Mark
> To: Brian Cox; biojava-l at biojava.org
> Sent: 9/17/03 7:45 PM
> Subject: RE: [Biojava-l] position weight matrix
>
> Hi Brian,
>
> Technically this is correct, as N or X do actually match everything. Are
> you wanting to rule out any motif with an N, or are you wanting to
> penalize a motif with an N (or other ambiguity)?
>
> If you are working with DNA you could use
> org.biojava.bio.seq.NucleotideTools, this class can be used to access
> the nucleotide alphabet that treats all symbols as Atomic, even if they
> are normally IUPAC ambiguity symbols. If you did this and set the weight
> of N in the matrix to 0.0 it would exclude those motifs.
>
> - Mark
>
>
>
> -----Original Message-----
> From: Brian Cox [mailto:cox at mshri.on.ca]
> Sent: Friday, 12 September 2003 11:07 a.m.
> To: biojava-l at biojava.org
> Subject: [Biojava-l] position weight matrix
>
>
> I wrote a program to find TF binding sites using a
> WeightMatrixAnnotator, but when I try to annotate a sequence, if the
> sequence has any N or X then everything matches. How do I get the
> WeightMatrixAnnotator to ignore the Ns or Xs?
> thanks,
> Brian Cox
> Samuel Lunenfeld Research Institute
> Mount Sinai Hospital, Rm 884
> Toronto, Ontario
> Canada
>
> 416-586-8266
> =======================================================================
> Attention: The information contained in this message and/or attachments
> from AgResearch Limited is intended only for the persons or entities
> to which it is addressed and may contain confidential and/or privileged
> material. Any review, retransmission, dissemination or other use of, or
> taking of any action in reliance upon, this information by persons or
> entities other than the intended recipients is prohibited by AgResearch
> Limited. If you have received this message in error, please notify the
> sender immediately.
> =======================================================================
>
>
>
>
>_______________________________________________
>Biojava-l mailing list - Biojava-l at biojava.org
>http://biojava.org/mailman/listinfo/biojava-l
>
>
>