[Biojava-l] [1.4pre1] BioJava's-Regex with ambiguous symbols

Jesse jesse-t at chello.nl
Thu Jun 2 09:40:27 EDT 2005


Can someone tell me how I can perform a BioJava 1.4pre1 regex search using
ambiguous symbols?

I'm using the following ambiguous DNA symbols:
(http://rebase.neb.com/rebase/link_withrefm)
-R = G or A
-Y = C or T
-M = A or C
-K = G or T
-S = G or C
-W = A or T
-B = not A (C or G or T)
-D = not C (A or G or T)
-H = not G (A or C or T)
-V = not T (A or C or G)
-N = A or C or G or T
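
For illustration, the table above can be written down as a small plain-Java lookup. This is only a sketch: the class name AmbiguityCodes and its methods are made up for this example and are not part of BioJava. Each code is stored as a 4-bit mask (A=1, C=2, G=4, T=8), so "not A" is simply C|G|T.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT BioJava API: the ambiguity codes from the list
// above, stored as bit masks over the four unambiguous bases.
class AmbiguityCodes {
    static final int A = 1, C = 2, G = 4, T = 8;
    static final Map<Character, Integer> MASKS = new LinkedHashMap<>();
    static {
        MASKS.put('A', A);
        MASKS.put('C', C);
        MASKS.put('G', G);
        MASKS.put('T', T);
        MASKS.put('R', G | A);
        MASKS.put('Y', C | T);
        MASKS.put('M', A | C);
        MASKS.put('K', G | T);
        MASKS.put('S', G | C);
        MASKS.put('W', A | T);
        MASKS.put('B', C | G | T);     // not A
        MASKS.put('D', A | G | T);     // not C
        MASKS.put('H', A | C | T);     // not G
        MASKS.put('V', A | C | G);     // not T
        MASKS.put('N', A | C | G | T); // any base
    }

    /** Looks up the mask for a code (case-insensitive). */
    static int mask(char code) {
        Integer m = MASKS.get(Character.toUpperCase(code));
        if (m == null) {
            throw new IllegalArgumentException("Unknown code: " + code);
        }
        return m;
    }

    /** Returns the bases a code stands for, in A, C, G, T order. */
    static String bases(char code) {
        int m = mask(code);
        StringBuilder sb = new StringBuilder();
        if ((m & A) != 0) sb.append('A');
        if ((m & C) != 0) sb.append('C');
        if ((m & G) != 0) sb.append('G');
        if ((m & T) != 0) sb.append('T');
        return sb.toString();
    }

    /** True if the two codes have at least one base in common. */
    static boolean overlaps(char x, char y) {
        return (mask(x) & mask(y)) != 0;
    }

    public static void main(String[] args) {
        for (char code : MASKS.keySet()) {
            System.out.println(code + " = " + bases(code));
        }
    }
}
```

With this representation, checking whether two symbols are compatible is a single bitwise AND, e.g. overlaps('R', 'Y') is false because R (A or G) and Y (C or T) share no base.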

If I understand correctly, to run a BioJava regex I need to create a
PatternFactory using the following method:

FiniteAlphabet fa = DNATools.getDNA();
org.biojava.utils.regex.PatternFactory.makeFactory(fa);

So I need a FiniteAlphabet that contains the ambiguous symbols, right?
How can I make such a FiniteAlphabet?

My goal is to run a search pattern such as "g[agr]cg[cty]c" against a
SymbolList such as "ATGCGACGTCTTAANNNNNNATGCAAC".
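
To make the intended behaviour concrete, here is a minimal sketch of that search in plain Java using java.util.regex rather than BioJava's org.biojava.utils.regex. The class AmbiguousDnaSearch and its methods are hypothetical names for this example. It assumes that ambiguity codes in the pattern are expanded to the bases they stand for, that the sequence is matched as literal characters (so an N in the sequence only matches a literal N), and that the pattern contains no backslash escapes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, NOT BioJava API: rewrites ambiguity codes in a
// DNA pattern into character classes and searches with java.util.regex.
class AmbiguousDnaSearch {

    /** Bases covered by an ambiguity code, or null for any other character. */
    static String expand(char code) {
        switch (Character.toUpperCase(code)) {
            case 'R': return "AG";
            case 'Y': return "CT";
            case 'M': return "AC";
            case 'K': return "GT";
            case 'S': return "CG";
            case 'W': return "AT";
            case 'B': return "CGT";
            case 'D': return "AGT";
            case 'H': return "ACT";
            case 'V': return "ACG";
            case 'N': return "ACGT";
            default:  return null;
        }
    }

    /** Rewrites e.g. "g[agr]c" to "g[agAG]c"; other characters pass through. */
    static String toJavaRegex(String dnaPattern) {
        StringBuilder out = new StringBuilder();
        boolean inClass = false; // true while between '[' and ']'
        for (char ch : dnaPattern.toCharArray()) {
            if (ch == '[') inClass = true;
            if (ch == ']') inClass = false;
            String bases = expand(ch);
            if (bases == null) {
                out.append(ch);                 // plain base or regex syntax
            } else if (inClass) {
                out.append(bases);              // already inside a class
            } else {
                out.append('[').append(bases).append(']');
            }
        }
        return out.toString();
    }

    /** Returns the 0-based start offsets of all non-overlapping matches. */
    static List<Integer> findAll(String dnaPattern, String sequence) {
        Pattern p = Pattern.compile(toJavaRegex(dnaPattern),
                                    Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(sequence);
        List<Integer> starts = new ArrayList<>();
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String pattern = "g[agr]cg[cty]c";
        String sequence = "ATGCGACGTCTTAANNNNNNATGCAAC";
        System.out.println(toJavaRegex(pattern));
        System.out.println(findAll(pattern, sequence));
    }
}
```

Under these assumptions the example pattern matches the example sequence once, at 0-based offset 4 (the substring GACGTC). Whether an ambiguous symbol in the sequence should also match an overlapping symbol in the pattern is a separate design choice that this sketch does not address.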

Thanks.

-Jesse


