[Biojava-l] position weight matrix

Schreiber, Mark mark.schreiber at agresearch.co.nz
Wed Sep 17 19:45:09 EDT 2003


Hi Brian,

Technically this is correct, as N or X do actually match everything. Do you want to rule out any motif containing an N, or do you want to penalize a motif with an N (or other ambiguity symbol)?

If you are working with DNA you could use org.biojava.bio.seq.NucleotideTools. This class can be used to access the nucleotide alphabet that treats all symbols as atomic, even if they are normally IUPAC ambiguity symbols. If you did this and set the weight of N in the matrix to 0.0, it would exclude those motifs.
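To illustrate why a zero weight excludes those motifs, here is a minimal sketch in plain Java. It deliberately does not use any BioJava classes: PwmSketch, column, score and hits are hypothetical names made up for this example, and the matrix values are invented. It assumes the score of a window is the product of the per-column weights, so a single column that gives N a weight of 0.0 forces the score of the whole window to 0.0.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone illustration; not the BioJava API.
final class PwmSketch {

    // Build one matrix column. N is an ordinary (atomic) symbol with weight 0.0.
    static Map<Character, Double> column(double a, double c, double g, double t) {
        Map<Character, Double> col = new HashMap<>();
        col.put('a', a);
        col.put('c', c);
        col.put('g', g);
        col.put('t', t);
        col.put('n', 0.0);
        return col;
    }

    // Score a window as the product of the weights of its symbols.
    // The window must be as long as the matrix; unknown symbols count as 0.0.
    static double score(List<Map<Character, Double>> matrix, String window) {
        double product = 1.0;
        for (int i = 0; i < matrix.size(); i++) {
            char symbol = Character.toLowerCase(window.charAt(i));
            Double weight = matrix.get(i).get(symbol);
            product *= (weight == null) ? 0.0 : weight;
        }
        return product;
    }

    // Return the start positions (0-based) of all windows scoring at or above the threshold.
    static List<Integer> hits(List<Map<Character, Double>> matrix, String seq, double threshold) {
        List<Integer> positions = new ArrayList<>();
        int width = matrix.size();
        for (int start = 0; start + width <= seq.length(); start++) {
            if (score(matrix, seq.substring(start, start + width)) >= threshold) {
                positions.add(start);
            }
        }
        return positions;
    }

    // Invented three-column matrix that favours the motif "agt".
    static List<Map<Character, Double>> exampleMatrix() {
        List<Map<Character, Double>> matrix = new ArrayList<>();
        matrix.add(column(0.7, 0.1, 0.1, 0.1));
        matrix.add(column(0.1, 0.1, 0.7, 0.1));
        matrix.add(column(0.1, 0.1, 0.1, 0.7));
        return matrix;
    }

    public static void main(String[] args) {
        // Windows that overlap an N score 0.0 and are never reported.
        System.out.println(hits(exampleMatrix(), "ccagtnagntt", 0.1));
    }
}
```

If you would rather penalize than exclude, the same sketch works with a small non-zero weight for N in place of 0.0.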

- Mark



-----Original Message-----
From: Brian Cox [mailto:cox at mshri.on.ca] 
Sent: Friday, 12 September 2003 11:07 a.m.
To: biojava-l at biojava.org
Subject: [Biojava-l] position weight matrix


I wrote a program to find TF binding sites using a WeightMatrixAnnotator, but when I try to annotate a sequence that contains any N or X, everything matches. How do I get the WeightMatrixAnnotator to ignore the Ns or Xs?
thanks,
Brian Cox
Samuel Lunenfeld Research Institute
Mount Sinai Hospital, Rm 884
Toronto, Ontario
Canada

416-586-8266


