[Biojava-l] NameParser

Matthew Pocock mrp@sanger.ac.uk
Tue, 14 Nov 2000 19:03:36 +0000


Hi Robin,

Just to make sure we are on the same page, you are suggesting that END, TER and
STP all be legal names for a single termination symbol in the
protein-with-termination alphabet (retrievable from
ProteinTools.getTAlphabet()). The codon tables map from DNA^3 to
protein-with-termination, and the codon-bias tables give you a distribution over
DNA^3 for a given protein-with-termination symbol.

I suggest a three-part solution.

1) Add a method to NameParser that lets you associate a name with a symbol. It
will look something like:

addSymbolForName(String name, Symbol sym) throws IllegalSymbolException;

It will add a mapping from name to sym, provided that sym is within the parser's
alphabet and that name is not already in use in that parser. You may wish
to add the corresponding remove method for breaking associations.

2) Add a 'synonym' element to the AlphabetManager.xml resource, and add the
synonyms to the termination symbol.

3) Modify AlphabetManager.java so that it adds the synonyms to the name parser.
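
A minimal sketch of part 1, to make the intended behaviour concrete. This is not
the real NameParser: the class name SynonymNameParser, the methods
removeSymbolForName and parseName, the generic symbol type S, and the use of
IllegalArgumentException in place of IllegalSymbolException are all assumptions
made so the example stands alone.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a name parser; NOT the BioJava NameParser class.
// S is whatever type represents a symbol in the alphabet.
class SynonymNameParser<S> {
    private final Set<S> alphabet;                       // symbols this parser may return
    private final Map<String, S> nameToSymbol = new HashMap<>();

    SynonymNameParser(Set<S> alphabet) {
        this.alphabet = alphabet;
    }

    // Associate a name with a symbol. Rejects symbols outside the alphabet
    // and names that are already in use (IllegalArgumentException stands in
    // for IllegalSymbolException here).
    void addSymbolForName(String name, S sym) {
        if (!alphabet.contains(sym)) {
            throw new IllegalArgumentException("Symbol not in alphabet: " + sym);
        }
        if (nameToSymbol.containsKey(name)) {
            throw new IllegalArgumentException("Name already in use: " + name);
        }
        nameToSymbol.put(name, sym);
    }

    // Break an association; does nothing if the name is unknown.
    void removeSymbolForName(String name) {
        nameToSymbol.remove(name);
    }

    // Resolve a name (primary or synonym) to its symbol.
    S parseName(String name) {
        S sym = nameToSymbol.get(name);
        if (sym == null) {
            throw new IllegalArgumentException("Unknown name: " + name);
        }
        return sym;
    }
}
```

With something like this, END, TER and STP would each be registered against the
same termination symbol, and a codon-bias table using any of the three names
would parse to that one symbol.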

Does this sound do-able, or is it a bit complex?

All the best,

Matthew

"Emig, Robin" wrote:

>         Is the best way to deal with situations where multiple tokens (or names)
> are really the same Symbol to subclass NameParser and add checks in it
> that simply map the redundant names to a proper unique one, and then parse?
>         The reason I ask is that I am reading in CodonBiasTables, which often
> have END, TER or STP as the stop/terminal codon. I don't mind representing
> all of these as the same symbol, because in my case they are, but I wanted
> to know if there was a better way to do this, such as editing or creating an
> alphabet. I was also thinking of possibly creating a translation
> alphabet, essentially something that could set up all the mappings for a
> java.util.Map.
> -Robin
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l