[Biojava-l] [1.4pre1] BioJava's-Regex with ambigous symbols  
    Jesse 
    jesse-t at chello.nl
       
    Mon Jun  6 06:08:03 EDT 2005
    
    
  
Hi Cor,
Thanks for your reply.
I corrected the pattern as follows.
When BioJava's org.biojava.bio.molbio.RestrictionEnzyme.getForwardRegex()
returns the regex for a RestrictionEnzyme with the site "gtakm", it returns
"gta[gtk][acm]", in which k (G or T) and m (A or C) are ambiguous.
So the ambiguous symbol "k" is converted to the class "[gtk]", with the
ambiguous "k" itself still inside the brackets.
I simply solved it by removing all ambiguous symbols from the returned regex
string:
String searchPattern = re.getForwardRegex().replaceAll("[rymkswbdhvn]", "");
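
For anyone who wants to check the effect of that replaceAll() call without BioJava on the classpath, here is a minimal sketch in plain Java. The class and method names are made up for illustration, and the input is just the regex string quoted above. It assumes every ambiguity code in the returned regex sits inside a bracketed class; a bare code outside brackets would simply be deleted.

```java
// Hypothetical helper, not part of BioJava: strips IUPAC ambiguity codes
// from a regex string such as the one quoted above.
class StripAmbiguity {

    // Removes the ambiguity letters r, y, m, k, s, w, b, d, h, v, n wherever they occur.
    // Assumption: they only ever appear inside [...] classes in the input.
    static String stripAmbiguityCodes(String regex) {
        return regex.replaceAll("[rymkswbdhvn]", "");
    }

    public static void main(String[] args) {
        String forwardRegex = "gta[gtk][acm]";   // value quoted in this mail
        String searchPattern = stripAmbiguityCodes(forwardRegex);
        System.out.println(searchPattern);       // prints gta[gt][ac]
    }
}
```

The result contains only atomic symbols (a, c, g, t) inside the brackets, which is the form the pattern compiler accepts in the example quoted below.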
Regards,
Jesse
-----Original Message-----
From: Cor 
Subject: RE: [Biojava-l] [1.4pre1] BioJava's-Regex with ambigous symbols 
Hi Jesse, 
Although I am a newbie myself, I have written some example code based on
existing BioJava test code:
String symbols = "atgcgacgtcttaannnnnnatgcaac";
SymbolList sl = DNATools.createDNA(symbols);
String patternString = "g[ag]cg[ct]c";
PatternFactory fact = PatternFactory.makeFactory(DNATools.getDNA());
Pattern pattern = fact.compile(patternString);
Matcher matcher = pattern.matcher(sl);
if (matcher.find()) {
    System.out.println("match found");
} else {
    fail("failed to find target");
}

In the pattern, you have to use [ag] instead of [agr]. Otherwise you will
get the error:
org.biojava.utils.regex.RegexException: all variant symbols must be atomic.
    at org.biojava.utils.regex.PatternChecker.parseVariantSymbols(PatternChecker.java:363)
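
As an alternative to editing a regex by hand, a pattern with only atomic symbols could be built directly from the recognition site. The following is a rough sketch in plain Java, not BioJava code: the class name, method name and lookup table are assumptions for illustration, using the standard IUPAC nucleotide ambiguity codes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of BioJava: expands IUPAC ambiguity codes
// in a recognition site into bracketed classes of atomic symbols only.
class IupacToRegex {

    // Standard IUPAC nucleotide ambiguity codes mapped to their atomic bases.
    private static final Map<Character, String> CODES = new HashMap<>();
    static {
        CODES.put('r', "ag");
        CODES.put('y', "ct");
        CODES.put('m', "ac");
        CODES.put('k', "gt");
        CODES.put('s', "cg");
        CODES.put('w', "at");
        CODES.put('b', "cgt");
        CODES.put('d', "agt");
        CODES.put('h', "act");
        CODES.put('v', "acg");
        CODES.put('n', "acgt");
    }

    // Copies atomic symbols unchanged and replaces each ambiguity code
    // with a class such as [ag] that lists only atomic symbols.
    static String toAtomicRegex(String site) {
        StringBuilder sb = new StringBuilder();
        for (char c : site.toLowerCase().toCharArray()) {
            String atoms = CODES.get(c);
            if (atoms == null) {
                sb.append(c);
            } else {
                sb.append('[').append(atoms).append(']');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toAtomicRegex("grcgyc")); // prints g[ag]cg[ct]c
        System.out.println(toAtomicRegex("gtakm"));  // prints gta[gt][ac]
    }
}
```

The first output is the same pattern string used in the example above, so it should compile with PatternFactory without raising the RegexException.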
Regards,
Cor
    
    