[Biojava-l] SymbolParsers in Alphabet

Thomas Down td2@sanger.ac.uk
Wed, 11 Dec 2002 16:47:29 +0000


On Wed, Dec 11, 2002 at 08:19:14AM -0800, Ren, Zhen wrote:
> Thank you for the suggestion.  I made that work.  Here is another related question: how can  I make my own custom alphabet handle the ambiguity symbol X like what the predefined protein alphabet does?  Thanks.

You can bind a token to any ambiguity symbol:

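   // symbol1 and symbol2 are atomic symbols from myAlphabet;
   // toke is the tokenization you are setting up.
   Set ambiSet = new HashSet();
   ambiSet.add(symbol1);
   ambiSet.add(symbol2);
   Symbol ambiSymbol = myAlphabet.getAmbiguity(ambiSet);
   toke.bindSymbol(ambiSymbol, 'Y');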

If all you care about is the symbol which matches *all*
symbols in the alphabet (equivalent to 'n' in DNA and 'X' in
protein), you can just do:

   toke.bindSymbol(
       AlphabetManager.getAllAmbiguitySymbol(myAlphabet),
       'X'
   );

For non-tiny alphabets, you probably won't want to define
a token for every possible ambiguity symbol.  In recent
CVS versions, if you try to retrieve the token for an
unrecognized ambiguity symbol, it will default to the token
used for the all-ambiguity symbol.
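
To make the fallback rule concrete, here is a small standalone
sketch.  It does not use BioJava at all: the class AmbiguityTokens
and its methods bind() and tokenFor() are hypothetical names made
up for illustration, and symbols are modelled as plain strings.
It only mimics the behaviour described above, under the assumption
that a token is looked up by the set of symbols an ambiguity
symbol matches.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a tokenization; not part of BioJava.
class AmbiguityTokens {
    private final Set<String> allSymbols;
    private final char allToken;
    // Maps the set of symbols an ambiguity symbol matches to its token.
    private final Map<Set<String>, Character> tokens = new HashMap<>();

    AmbiguityTokens(Set<String> allSymbols, char allToken) {
        this.allSymbols = new HashSet<>(allSymbols);
        this.allToken = allToken;
        // The all-ambiguity symbol matches every symbol in the alphabet.
        tokens.put(this.allSymbols, allToken);
    }

    // Bind a token to the ambiguity symbol matching exactly these symbols.
    void bind(Set<String> symbols, char token) {
        if (!allSymbols.containsAll(symbols)) {
            throw new IllegalArgumentException("symbol not in alphabet");
        }
        tokens.put(new HashSet<>(symbols), token);
    }

    // Return the bound token, or the all-ambiguity token if none is bound.
    char tokenFor(Set<String> symbols) {
        Character c = tokens.get(symbols);
        return c != null ? c : allToken;
    }

    public static void main(String[] args) {
        AmbiguityTokens t = new AmbiguityTokens(
            new HashSet<>(Arrays.asList("a", "c", "g", "t")), 'n');
        t.bind(new HashSet<>(Arrays.asList("c", "t")), 'y');

        // Explicitly bound ambiguity: prints y
        System.out.println(t.tokenFor(new HashSet<>(Arrays.asList("c", "t"))));
        // Unbound ambiguity falls back to the all-ambiguity token: prints n
        System.out.println(t.tokenFor(new HashSet<>(Arrays.asList("a", "g"))));
    }
}
```

The point of the design is that only the ambiguities you care
about need an explicit binding; everything else still prints as
something sensible.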

    Thomas.