[Biojava-l] position weight matrix

Matthew Pocock matthew_pocock at yahoo.co.uk
Thu Sep 18 05:14:36 EDT 2003


Schreiber, Mark wrote:

>Brian,
> 
>I think we should change WeightMatrix so that N scores only a quarter of a match to A, whereas R would score half a match.
> 
>Do others feel this is sensible?
> 
>- Mark
>  
>
We should be taking the odds score between the weight matrix matching at 
that position and the null model - that way, ambiguity symbols divide out. 
I'm tied up today (writing lectures - damn those students), but it 
should be easy to modify WeightMatrixAnnotator to accept a ScoreType 
instance.

For reasons that we can argue about on -dev or off-line, you can't in 
the general case just divide scores for columns containing an N by 4, or 
portion them out relative to the null model. That's why the ScoreType 
objects were added to the DP package.
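
To make "ambiguity symbols divide out" concrete, here is a minimal sketch in
plain Java. It does not use the BioJava API: the class OddsScoreSketch, its
methods, the partial IUPAC table and the uniform null model are all
hypothetical names and values chosen for illustration. It assumes an ambiguity
symbol's probability is the sum over the bases it stands for, under both the
weight matrix column and the null model.

```java
import java.util.List;
import java.util.Map;

// Illustrative only: not BioJava code. All names here are assumptions.
class OddsScoreSketch {
    // Partial IUPAC table: each symbol mapped to the atomic bases it covers.
    static final Map<Character, String> EXPANSION = Map.of(
            'A', "A", 'C', "C", 'G', "G", 'T', "T",
            'R', "AG", 'Y', "CT", 'N', "ACGT");

    // Assumed null model: every base equally likely.
    static final Map<Character, Double> UNIFORM_NULL =
            Map.of('A', 0.25, 'C', 0.25, 'G', 0.25, 'T', 0.25);

    // Probability of a (possibly ambiguous) symbol: sum over its bases.
    static double prob(Map<Character, Double> dist, char symbol) {
        double sum = 0.0;
        for (char base : EXPANSION.get(symbol).toCharArray()) {
            sum += dist.get(base);
        }
        return sum;
    }

    // Odds of the column emitting the symbol versus the null model doing so.
    static double odds(Map<Character, Double> column,
                       Map<Character, Double> nullModel, char symbol) {
        return prob(column, symbol) / prob(nullModel, symbol);
    }

    // Odds for a whole window: product of the per-column odds.
    static double windowOdds(List<Map<Character, Double>> matrix,
                             Map<Character, Double> nullModel, String window) {
        if (window.length() != matrix.size()) {
            throw new IllegalArgumentException("window length must equal matrix width");
        }
        double score = 1.0;
        for (int i = 0; i < window.length(); i++) {
            score *= odds(matrix.get(i), nullModel, window.charAt(i));
        }
        return score;
    }

    public static void main(String[] args) {
        // A made-up three-column matrix preferring A, then C, then G.
        List<Map<Character, Double>> matrix = List.of(
                Map.of('A', 0.7, 'C', 0.1, 'G', 0.1, 'T', 0.1),
                Map.of('A', 0.1, 'C', 0.7, 'G', 0.1, 'T', 0.1),
                Map.of('A', 0.1, 'C', 0.1, 'G', 0.7, 'T', 0.1));
        for (String window : new String[] {"ACG", "ANG", "NNN", "RCG"}) {
            System.out.println(window + " odds = "
                    + windowOdds(matrix, UNIFORM_NULL, window));
        }
    }
}
```

Under these assumptions an N contributes odds of about 1 at every column,
because it sums to 1 under both the matrix and the null model, so a run of Ns
comes out neutral rather than as a perfect match. This is only a picture of
the odds idea, not a claim about how ScoreType computes its values.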

Matthew

>	-----Original Message----- 
>	From: Brian Cox [mailto:cox at mshri.on.ca] 
>	Sent: Thu 18/09/2003 2:35 p.m. 
>	To: Schreiber, Mark 
>	Cc: 
>	Subject: RE: [Biojava-l] position weight matrix
>	
>	
>
>	 Thanks,
>	That sounds like what I was looking for. I wanted to penalize the use of an
>	N in the sequence. I'm not sure yet how to implement this, but I'll give it a
>	shot.
>	Thanks for the reply,
>	Brian
>	
>	-----Original Message-----
>	From: Schreiber, Mark
>	To: Brian Cox; biojava-l at biojava.org
>	Sent: 9/17/03 7:45 PM
>	Subject: RE: [Biojava-l] position weight matrix
>	
>	Hi Brian,
>	
>	Technically this is correct, as N or X do actually match everything. Are
>	you wanting to rule out any motif with an N, or are you wanting to penalize a
>	motif with an N (or other ambiguity)?
>	
>	If you are working with DNA you could use
>	org.biojava.bio.seq.NucleotideTools, this class can be used to access
>	the nucleotide alphabet that treats all symbols as Atomic, even if they
>	are normally IUPAC ambiguity symbols. If you did this and set the weight
>	of N in the matrix to 0.0 it would exclude those motifs.
>	
>	- Mark
>	
>	
>	
>	-----Original Message-----
>	From: Brian Cox [mailto:cox at mshri.on.ca]
>	Sent: Friday, 12 September 2003 11:07 a.m.
>	To: biojava-l at biojava.org
>	Subject: [Biojava-l] position weight matrix
>	
>	
>	I wrote a program to find TF binding sites using a
>	WeightMatrixAnnotator, but when I try to annotate a sequence that
>	contains any N or X, everything matches.  How do I get the
>	WeightMatrixAnnotator to ignore the Ns or Xs?
>	Thanks,
>	Brian Cox
>	Samuel Lunenfeld Research Institute
>	Mount Sinai Hospital, Rm 884
>	Toronto, Ontario
>	Canada
>	
>	416-586-8266
>	=======================================================================
>	Attention: The information contained in this message and/or attachments
>	from AgResearch Limited is intended only for the persons or entities
>	to which it is addressed and may contain confidential and/or privileged
>	material. Any review, retransmission, dissemination or other use of, or
>	taking of any action in reliance upon, this information by persons or
>	entities other than the intended recipients is prohibited by AgResearch
>	Limited. If you have received this message in error, please notify the
>	sender immediately.
>	=======================================================================
>	
>
>



