[Biojava-l] Translations
Matthew Pocock
mrp@sanger.ac.uk
Wed, 17 Oct 2001 10:49:06 +0100
Hi Armin,
Armin Groll wrote:
> Hi out there,
>
> I have some issues regarding interface org.biojava.bio.symbol.TranslationTable.
> I am missing something. Exactly that:
> - we have TranslationTable, that is a one way function.
> - we have ReversibleTranslationTable, that is a two way function.
> - I'd need some ambiguity possibility. What I mean is, let there exist
> translations from one symbol of alphabet A to a choice of symbols of
> alphabet B. And reverse.
> Something like this:
>
> package org.biojava.bio.symbol;
>
> interface AmbiguousTranslationTable{
> public Alphabet getSource();
> public Alphabet getTarget();
> /** @return a set of Symbols, each of them one possible translation of
> 'toTranslate' */
> public java.util.Set getTranslation(Symbol toTranslate);
> }
This is a case that I hadn't thought of. The short answer is 'yes' -
this would be a legal translation table, but it would not be a legal
ReversibleTranslationTable (the injective/surjective requirements can't
hold if single items map to sets - it's a relation rather than a map).
The translation method doesn't need to return a Set, as we can already
return a single Symbol that represents a set of indivisible symbols
(this is how ambiguities like N are handled).
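
To make that concrete, here is a rough sketch in plain Java. It does not use the real BioJava classes: Sym, TABLE, translate and the purine/pyrimidine source names are all made up for illustration. The point is only that a one-way table can hand back a single symbol that stands for a set of atomic symbols, so no java.util.Set return type is needed.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class AmbiguousTranslationSketch {

    /** Hypothetical symbol: a name plus the atomic symbols it stands for. */
    static final class Sym {
        final String name;
        final Set<String> atoms;

        private Sym(String name, Set<String> atoms) {
            this.name = name;
            this.atoms = Collections.unmodifiableSet(new TreeSet<String>(atoms));
        }

        /** An indivisible symbol matches only itself. */
        static Sym atomic(String name) {
            return new Sym(name, Collections.singleton(name));
        }

        /** An ambiguity symbol matches the union of its choices. */
        static Sym ambiguity(Sym... choices) {
            Set<String> all = new TreeSet<String>();
            for (Sym s : choices) {
                all.addAll(s.atoms);
            }
            return new Sym(all.toString(), all);
        }

        boolean isAtomic() {
            return atoms.size() == 1;
        }
    }

    static final Sym A = Sym.atomic("a");
    static final Sym G = Sym.atomic("g");
    static final Sym C = Sym.atomic("c");
    static final Sym T = Sym.atomic("t");

    /** Hypothetical one-way table: each source symbol maps to ONE target symbol. */
    static final Map<String, Sym> TABLE = new HashMap<String, Sym>();
    static {
        TABLE.put("purine", Sym.ambiguity(A, G));
        TABLE.put("pyrimidine", Sym.ambiguity(C, T));
    }

    /** Returns a single, possibly ambiguous, symbol instead of a Set. */
    static Sym translate(String source) {
        Sym target = TABLE.get(source);
        if (target == null) {
            throw new IllegalArgumentException("not in source alphabet: " + source);
        }
        return target;
    }

    public static void main(String[] args) {
        Sym r = translate("purine");
        System.out.println(r.name + " atomic=" + r.isAtomic());
    }
}
```

Callers that really want the individual choices can still read them from the returned symbol (the atoms field in this sketch).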
>
> Hold on, I need something more liberal:
>
> interface AmbiguousSequenceTranslationTable{
> public Alphabet getSource();
> public Alphabet getTarget();
> /**
> * @return a set of org.biojava.bio.symbol.SymbolLists, each SymbolList a
> possible translation for the symbol
> * of source into the target alphabet.
> */
> public java.util.Set getTranslation(Symbol toTranslate);
> }
>
> Yes, and then, one can get all the wobbled codons of an amino acid through
> the amino acid's Symbol-instance.
> This sounds poor, but please get exotic: For some organisms, we have
> different translations there.
> And, as I am working on cytogenetics, I can translate different
> loci-alphabets into each other.
>
> Would this be possible (and feasible)?
>
> Armin
>
There is a little confusion here about SymbolList, Alphabet and Symbol.
An Alphabet is a set of Symbols. Symbols can be ambiguous or indivisible
(atomic, unambiguous). Therefore, an Alphabet is really defined by the
set of atomic symbols that it contains. A SymbolList is a list of
symbols from an Alphabet. For example, a DNA sequence will have the DNA
alphabet and each symbol will either be one of {a,g,c,t}, gap, or an
ambiguity over these like {a,t}.
You can build more complex alphabets like all DNA triplets from simpler
ones. DNA triplets will contain atomic symbols like [a,a,a], [a,a,t] ...
[g,g,g]. These are not SymbolLists, but symbols. You can construct a
SymbolList of triplets if you want and call this codons. You could also
construct a SymbolList of overlapping triplets and call this a 3rd order
view. You could also build a SymbolList of triplets and call it an
alignment (where the 2nd sequence in the alignment can be read out by
looking at the 2nd component of each symbol).
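
The three readings of a list of triplets can be sketched in plain Java as follows. Again this is not the BioJava API: tripletAlphabet, windows and component are hypothetical helpers, and a triplet symbol is modelled simply as a three-element list of strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TripletAlphabetSketch {

    /** The four atomic DNA symbols. */
    static final List<String> DNA = Arrays.asList("a", "g", "c", "t");

    /** All atomic symbols of the triplet alphabet (DNA x DNA x DNA). */
    static List<List<String>> tripletAlphabet() {
        List<List<String>> out = new ArrayList<List<String>>();
        for (String x : DNA) {
            for (String y : DNA) {
                for (String z : DNA) {
                    out.add(Arrays.asList(x, y, z));
                }
            }
        }
        return out;
    }

    /**
     * Views a sequence as a list of compound symbols.
     * step == width gives codons; step == 1 gives overlapping windows.
     */
    static List<List<String>> windows(List<String> seq, int width, int step) {
        List<List<String>> out = new ArrayList<List<String>>();
        for (int i = 0; i + width <= seq.size(); i += step) {
            out.add(new ArrayList<String>(seq.subList(i, i + width)));
        }
        return out;
    }

    /** Reads component k (0-based) of every compound symbol: one row of an alignment. */
    static List<String> component(List<List<String>> symbols, int k) {
        List<String> out = new ArrayList<String>();
        for (List<String> sym : symbols) {
            out.add(sym.get(k));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> seq = Arrays.asList("a", "t", "g", "g", "c", "t");
        System.out.println("alphabet size: " + tripletAlphabet().size());
        System.out.println("codons:        " + windows(seq, 3, 3));
        System.out.println("overlapping:   " + windows(seq, 3, 1));
        System.out.println("2nd component: " + component(windows(seq, 3, 3), 1));
    }
}
```

In every case the elements of the list are single symbols from the triplet alphabet; only the way the list was built, and the way you choose to read it, differs.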
Does that make anything any clearer? Bother me if not.
Matthew