[Biojava-l] IntegerAlphabet IntegerSymbol

Thomas Down td2@sanger.ac.uk
Mon, 22 Oct 2001 19:31:14 +0100


On Mon, Oct 22, 2001 at 10:36:22AM +1300, Mark Schreiber wrote:
>
> > 4) get rid of getToken() completely, and change the way that sequences
> >    get converted to strings -- replacing hardwired code in SymbolList
> >    implementations with pluggable `stringifiers'.
> > 
> > This was the idea of my SymbolTokenizations patch which I posted
> > a few days ago.  Certainly my view is that it provides a much
> > cleaner framework for handling this kind of situation, and I'd
> > urge you to take a look.
> > 
> >     Thomas
> 
> I like option 4 

Okay, that now makes two comments in favour of SymbolTokenizations,
and one neutral.  More feedback welcome!

> although I would advocate deprecating getToken() instead
> until everyone gets their applications up to speed with
> SymbolTokenizations.

That's possible, but it does mean we lose a rather satisfying
code cleanup which comes with SymbolTokenizations, viz., it's
no longer necessary to pre-initialize certain alphabets (e.g.
DNA) with `well-known' ambiguity symbols.  There's also the
issue that there are a lot of cases where it just wasn't possible
to sensibly implement getToken() [which to me was always the most
compelling argument for its removal].

Can we have a quick straw-poll of who uses getToken()?  My experience
is that it doesn't get used very much at all -- if you want
to print out information about a single Symbol, getName() is
a better choice.  What DOES get used a fair amount is
SymbolList.seqString().  That's still there, and is usually
implemented as something like:

   getAlphabet().getTokenization("default").tokenizeSymbolList(this);
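
For anyone who hasn't looked at the patch, here is a rough, self-contained
sketch of the pluggable-stringifier idea.  It is NOT the BioJava code: Sym,
Tokenization, CharTokenization, NameTokenization and Alpha are names made up
for this illustration, and the only things borrowed from the real API are
the getTokenization("default") / tokenizeSymbolList() call shape and
getName().  The point is that symbols carry no token of their own; the
alphabet hands out named tokenizations, and each one decides how to render
a list of symbols.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a symbol: it has a name but no getToken().
final class Sym {
    private final String name;
    Sym(String name) { this.name = name; }
    String getName() { return name; }
}

// Hypothetical pluggable stringifier: turns symbols into text.
interface Tokenization {
    String tokenizeSymbol(Sym s);

    // Default rendering: concatenate the per-symbol tokens.
    default String tokenizeSymbolList(List<Sym> symbols) {
        StringBuilder sb = new StringBuilder();
        for (Sym s : symbols) {
            sb.append(tokenizeSymbol(s));
        }
        return sb.toString();
    }
}

// One-character-per-symbol rendering, driven by an explicit table.
final class CharTokenization implements Tokenization {
    private final Map<Sym, Character> table = new HashMap<>();

    void bind(Sym s, char c) { table.put(s, c); }

    @Override
    public String tokenizeSymbol(Sym s) {
        Character c = table.get(s);
        if (c == null) {
            throw new IllegalArgumentException("No token for symbol: " + s.getName());
        }
        return c.toString();
    }
}

// Name-based rendering: works for any symbol, with no table needed.
final class NameTokenization implements Tokenization {
    @Override
    public String tokenizeSymbol(Sym s) { return s.getName(); }

    @Override
    public String tokenizeSymbolList(List<Sym> symbols) {
        StringBuilder sb = new StringBuilder();
        for (Sym s : symbols) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(s.getName());
        }
        return sb.toString();
    }
}

// Hypothetical alphabet: a registry of named tokenizations.
final class Alpha {
    private final Map<String, Tokenization> tokenizations = new HashMap<>();

    void putTokenization(String name, Tokenization t) { tokenizations.put(name, t); }

    Tokenization getTokenization(String name) {
        Tokenization t = tokenizations.get(name);
        if (t == null) {
            throw new NoSuchElementException("No tokenization named: " + name);
        }
        return t;
    }
}

final class TokenizationSketch {
    public static void main(String[] args) {
        Sym a = new Sym("adenine");
        Sym c = new Sym("cytosine");

        CharTokenization chars = new CharTokenization();
        chars.bind(a, 'a');
        chars.bind(c, 'c');

        Alpha dna = new Alpha();
        dna.putTokenization("default", chars);
        dna.putTokenization("name", new NameTokenization());

        List<Sym> seq = Arrays.asList(a, c, a);
        System.out.println(dna.getTokenization("default").tokenizeSymbolList(seq));
        System.out.println(dna.getTokenization("name").tokenizeSymbolList(seq));
    }
}
```

In this sketch, a symbol that has no sensible one-character form simply never
gets bound in the character table, and the name-based tokenization still
handles it.  That is the kind of case where getToken() had no good answer.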

But anyway, there's still time to reconsider on this,

    Thomas.