[Biojava-l] Ambiguity consensus string search

Michael Heuer heuermh at acm.org
Sun Mar 11 21:58:04 UTC 2007


Hello Charles,

You may find the org.biojava.utils.regex package useful in this regard:

> http://www.biojava.org/docs/api15b/index.html
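
As a rough illustration of the regex idea, here is a minimal sketch that uses only the standard java.util.regex classes, not the org.biojava.utils.regex package itself (whose API is not shown in this thread). It assumes the sequence is available as a plain String of unambiguous DNA letters; the class name IupacRegexSearch and its methods are hypothetical. Each IUPAC code in the consensus is expanded to a character class, and a lookahead is used so that overlapping hits are reported.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of BioJava.
class IupacRegexSearch {

    // Expand one IUPAC nucleotide code into a regex fragment.
    private static String expand(char code) {
        switch (Character.toUpperCase(code)) {
            case 'A': return "A";
            case 'C': return "C";
            case 'G': return "G";
            case 'T': return "T";
            case 'R': return "[AG]";
            case 'Y': return "[CT]";
            case 'S': return "[CG]";
            case 'W': return "[AT]";
            case 'K': return "[GT]";
            case 'M': return "[AC]";
            case 'B': return "[CGT]";
            case 'D': return "[AGT]";
            case 'H': return "[ACT]";
            case 'V': return "[ACG]";
            case 'N': return "[ACGT]";
            default:
                throw new IllegalArgumentException("Not an IUPAC DNA code: " + code);
        }
    }

    // Translate a whole consensus string, e.g. "TATAWA" -> "TATA[AT]A".
    static String toRegex(String consensus) {
        StringBuilder sb = new StringBuilder();
        for (char c : consensus.toCharArray()) {
            sb.append(expand(c));
        }
        return sb.toString();
    }

    // Return the 0-based start offsets of all matches, including overlapping ones.
    static List<Integer> findAll(String consensus, String dna) {
        // The zero-width lookahead lets the matcher report overlapping hits.
        Pattern p = Pattern.compile("(?=" + toRegex(consensus) + ")",
                                    Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(dna);
        List<Integer> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.start());
        }
        return hits;
    }
}
```

For example, IupacRegexSearch.findAll("TATAWA", "GGTATAAAGGTATATACC") returns the offsets [2, 10].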

   michael


Charles Danko wrote:

> Hi,
>
> I'm trying to search a Sequence or SymbolList object for a consensus
> sequence that contains IUPAC ambiguity codes.
>
> Without ambiguity codes, I could write a function that breaks the sequence
> into "windows" the size of the consensus, and checks each window for a
> match.  Am I missing a simple function that does this for me?
>
> Next, adding ambiguity codes ... do I have to define my own alphabet for the
> DNA IUPAC codes, or are these already included in the distribution
> somewhere?
>
> I have found the weight matrix class, and realize that I could create one of
> these objects and calculate a threshold that will work in the same manner as
> a consensus, but this seems like a bit of a hack for some functionality I am
> most likely overlooking.
>
> Thanks very much!
>
> Charles
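
The windowing approach described in the quoted message can also be sketched directly, without regular expressions. The following is a minimal sketch in plain Java that assumes the sequence is held as a String rather than a BioJava SymbolList; the class WindowConsensusSearch and its methods are hypothetical and not part of BioJava. Each IUPAC code is represented as a bit mask over the four bases, and a window matches when every observed symbol is a subset of what the consensus allows at that position.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BioJava.
class WindowConsensusSearch {

    // One bit per base.
    private static final int A = 1, C = 2, G = 4, T = 8;

    // Bit mask of the bases that an IUPAC code stands for.
    static int mask(char code) {
        switch (Character.toUpperCase(code)) {
            case 'A': return A;
            case 'C': return C;
            case 'G': return G;
            case 'T': return T;
            case 'R': return A | G;
            case 'Y': return C | T;
            case 'S': return C | G;
            case 'W': return A | T;
            case 'K': return G | T;
            case 'M': return A | C;
            case 'B': return C | G | T;
            case 'D': return A | G | T;
            case 'H': return A | C | T;
            case 'V': return A | C | G;
            case 'N': return A | C | G | T;
            default:
                throw new IllegalArgumentException("Not an IUPAC DNA code: " + code);
        }
    }

    // True if the window of dna starting at offset is compatible with the consensus.
    static boolean matchesAt(String consensus, String dna, int offset) {
        for (int i = 0; i < consensus.length(); i++) {
            int allowed = mask(consensus.charAt(i));
            int observed = mask(dna.charAt(offset + i));
            // Every base the observed symbol could be must be allowed.
            if ((observed & ~allowed) != 0) {
                return false;
            }
        }
        return true;
    }

    // Slide a window the size of the consensus along dna and collect matching offsets.
    static List<Integer> findAll(String consensus, String dna) {
        List<Integer> hits = new ArrayList<>();
        for (int offset = 0; offset + consensus.length() <= dna.length(); offset++) {
            if (matchesAt(consensus, dna, offset)) {
                hits.add(offset);
            }
        }
        return hits;
    }
}
```

For example, WindowConsensusSearch.findAll("GGNCC", "AAGGTCCGGACCTT") returns [2, 7]. The subset test is one possible design choice: under it, an ambiguous symbol in the searched sequence (such as N) only matches a consensus position that allows all of the same bases.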



