[Biojava-l] Translations

Matthew Pocock mrp@sanger.ac.uk
Wed, 17 Oct 2001 10:49:06 +0100


Hi Armin,

Armin Groll wrote:

> Hi out there,
> 
> I have some issues regarding interface org.biojava.symbol.Translation.
> I am missing something. Exactly that:
> - we have TranslationTable, that is a one way function.
> - we have ReversibleTranslationTable, that is a two way function.
> - I'd need some ambiguity possibility. What I mean is, let there exist
> translations from one symbol of alphabet A to a choice of symbols of
> alphabet B. And reverse.
> Something like this:
> 
> package org.biojava.bio.symbol;
> 
> interface AmbiguousTranslationTable{
>   public Alphabet getSource();
>   public Alphabet getTarget();
>   /** @return a set of Symbols, each of them one possible translation of
>    *  'toTranslate' */
>   public java.util.Set getTranslation(Symbol toTranslate);
> }


This is a case that I hadn't thought of. The short answer is 'yes': 
this would be a legal translation table, but it would not be a legal 
ReversibleTranslationTable. Injective/surjective stop making sense once 
single items map to sets; what you have then is a relation rather than 
a map. The translation method doesn't need to return a Set, though, as 
we can already return a single Symbol that represents a set of 
indivisible symbols (this is how ambiguities like N are handled).
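
To make that concrete, here is a rough sketch of the idea. It is not the 
BioJava API: Sym, SimpleTable and AmbiguityDemo are hypothetical stand-ins, 
and the W -> {a,t} entry is just a toy mapping. The point it illustrates is 
that translate() returns one symbol, and that symbol may itself stand for 
several indivisible ones.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a symbol: it is atomic if it matches exactly
// one indivisible symbol, and ambiguous if it matches several.
final class Sym {
    private final Set<String> matches;

    Sym(String... atoms) {
        this.matches = Collections.unmodifiableSet(new TreeSet<>(Arrays.asList(atoms)));
    }

    Set<String> matches() { return matches; }

    boolean isAtomic() { return matches.size() == 1; }

    @Override public String toString() { return matches.toString(); }
}

// Hypothetical one-way table: every source name maps to exactly one Sym,
// so no Set return type is needed even when the result is ambiguous.
final class SimpleTable {
    private final Map<String, Sym> map = new HashMap<>();

    void put(String source, Sym target) { map.put(source, target); }

    Sym translate(String source) {
        Sym target = map.get(source);
        if (target == null) {
            throw new IllegalArgumentException("No translation for " + source);
        }
        return target;
    }
}

final class AmbiguityDemo {
    // Toy table, assumed for illustration only.
    static SimpleTable toyTable() {
        SimpleTable table = new SimpleTable();
        table.put("A", new Sym("a"));      // atomic target
        table.put("W", new Sym("a", "t")); // ambiguous target
        return table;
    }

    public static void main(String[] args) {
        SimpleTable table = toyTable();
        System.out.println(table.translate("A") + " atomic? " + table.translate("A").isAtomic());
        System.out.println(table.translate("W") + " atomic? " + table.translate("W").isAtomic());
    }
}
```

A table like this is still one-way only: nothing here says which source 
symbol an ambiguous target should map back to.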

> 
> Hold on, I need something more liberal:
> 
> interface AmbiguousSequenceTranslationTable{
>   public Alphabet getSource();
>   public Alphabet getTarget();
>   /**
>   * @return a set of org.biojava.bio.symbol.SymbolLists, each SymbolList a
> possible translation for the symbol
>   * of source into the target alphabet.
>   */
>   public java.util.Set getTranslation(Symbol toTranslate);
> }
> 
> Yes, and then, one can get all the wobbled codons of an amino acid through
> the amino acid's Symbol-instance.
> This sounds poor, but please get exotic: For some organisms, we have
> different translations there.
> And, as I am working on cytogenetics, I can translate different
> loci-alphabets into each other.
> 
> Would this be possible (and feasible)?
> 
> Armin
> 


There is a little confusion here about SymbolList, Alphabet and Symbol. 
An Alphabet is a set of Symbols. Symbols can be ambiguous or indivisible 
(atomic, unambiguous). Therefore, an Alphabet is really defined by the 
set of atomic symbols that it contains. A SymbolList is a list of 
symbols from an Alphabet. For example, a DNA sequence will have the DNA 
alphabet and each symbol will either be one of {a,g,c,t}, gap, or an 
ambiguity over these like {a,t}.

You can build more complex alphabets like all DNA triplets from simpler 
ones. DNA triplets will contain atomic symbols like [a,a,a], [a,a,t] ... 
[g,g,g]. These are not SymbolLists, but symbols. You can construct a 
SymbolList of triplets if you want and call this codons. You could also 
construct a SymbolList of overlapping triplets and call this a 3rd order 
view. You could also build a SymbolList of triplets and call it an 
alignment (where the 2nd sequence in the alignment can be read out by 
looking at the 2nd component of each symbol).
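
As a small illustration of the triplet case, here is a sketch that uses 
plain strings in place of real symbols. The Triplets class and its methods 
are made up for this example and are not BioJava classes. It enumerates the 
atomic symbols of the triplet alphabet and then reads one sequence both as 
codons and as overlapping triplets.

```java
import java.util.ArrayList;
import java.util.List;

final class Triplets {
    static final char[] DNA = {'a', 'c', 'g', 't'};

    // Every atomic symbol of the triplet alphabet: DNA x DNA x DNA.
    // Each entry stands for ONE symbol, not a list of three symbols.
    static List<String> alphabet() {
        List<String> out = new ArrayList<>();
        for (char x : DNA) {
            for (char y : DNA) {
                for (char z : DNA) {
                    out.add("" + x + y + z);
                }
            }
        }
        return out;
    }

    // Read a DNA string as a list of triplet symbols.
    // step = 3 gives codons; step = 1 gives the overlapping view.
    static List<String> view(String seq, int step) {
        if (step < 1) {
            throw new IllegalArgumentException("step must be positive");
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 3 <= seq.length(); i += step) {
            out.add(seq.substring(i, i + 3));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(alphabet().size());   // prints 64
        System.out.println(view("atggcc", 3));   // prints [atg, gcc]
        System.out.println(view("atggcc", 1));   // prints [atg, tgg, ggc, gcc]
    }
}
```

Both views are lists over the same triplet alphabet; only the way the 
underlying DNA is cut up differs.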

Does that make anything any clearer? Bother me if not.

Matthew

