[Biojava-l] SymbolParsers in Alphabet

Ren, Zhen zren@amylin.com
Wed, 11 Dec 2002 09:12:28 -0800


Great!  Thanks.

-----Original Message-----
From: Thomas Down [mailto:td2@sanger.ac.uk]
Sent: Wednesday, December 11, 2002 8:47 AM
To: Ren, Zhen
Cc: biojava-l@biojava.org
Subject: Re: [Biojava-l] SymbolParsers in Alphabet


On Wed, Dec 11, 2002 at 08:19:14AM -0800, Ren, Zhen wrote:
> Thank you for the suggestion.  I made that work.  Here is another related question: how can I make my own custom alphabet handle the ambiguity symbol X the way the predefined protein alphabet does?  Thanks.

You can bind a token to any ambiguity symbol:

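   // Collect the atomic symbols that the ambiguity symbol should match.
   Set ambiSet = new HashSet();
   ambiSet.add(symbol1);
   ambiSet.add(symbol2);
   // Look up the ambiguity symbol and bind it to a character token.
   Symbol ambiSymbol = myAlphabet.getAmbiguity(ambiSet);
   toke.bindSymbol(ambiSymbol, 'Y');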

If all you care about is the symbol which matches *all*
symbols in the alphabet (equivalent to 'n' in DNA and 'X' in
protein), you can just do:

   toke.bindSymbol(
       AlphabetManager.getAllAmbiguitySymbol(myAlphabet),
       'X'
   );

For non-tiny alphabets, you probably won't want to define
a token for every possible ambiguity symbol.  In recent
CVS versions, if you try to retrieve the token for an
unrecognized ambiguity symbol, it will default to the token
used for the all-ambiguity symbol.
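
The fallback described above can be pictured with a small standalone
sketch. This is not BioJava code. The class AmbiguityTokens and its
methods bind and tokenFor are hypothetical names made up for
illustration, and an ambiguity symbol is modeled simply as the set of
atomic symbol names it matches. The only assumption about the library's
behavior is the one stated above: an ambiguity symbol with no token of
its own gets the token bound to the all-ambiguity symbol.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a character tokenization; not a BioJava class.
// An ambiguity symbol is represented by the set of atomic symbol names it matches.
final class AmbiguityTokens {
    // Every atomic symbol in the alphabet; this set plays the role of the
    // all-ambiguity symbol ('n' in DNA, 'X' in protein).
    private final Set<String> allSymbols;
    // Explicit bindings from an ambiguity set to its character token.
    private final Map<Set<String>, Character> tokens = new HashMap<>();

    AmbiguityTokens(Set<String> allSymbols) {
        this.allSymbols = new HashSet<>(allSymbols);
    }

    // Bind a token to the ambiguity symbol that matches exactly these symbols.
    void bind(Set<String> matches, char token) {
        tokens.put(new HashSet<>(matches), token);
    }

    // Return the bound token, or fall back to the all-ambiguity token
    // when this particular ambiguity set has no binding of its own.
    char tokenFor(Set<String> matches) {
        Character bound = tokens.get(matches);
        if (bound != null) {
            return bound;
        }
        Character fallback = tokens.get(allSymbols);
        if (fallback == null) {
            throw new IllegalStateException("no token bound for the all-ambiguity symbol");
        }
        return fallback;
    }
}
```

With 'X' bound to the full set {a, b, c} and 'Y' bound to {a, b},
tokenFor on {a, b} gives 'Y', while tokenFor on the unbound set {b, c}
falls back to 'X'.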

    Thomas.