[Biojava-l] single letter code from protein ambiguities?

community at struck.lu community at struck.lu
Mon May 19 09:35:54 UTC 2008


Thank you for pointing me in the right direction:

##################################
SymbolList symL = DNATools.createDNA("atnatggnnatg");
SymbolList symL2 = DNATools.toProtein(symL);
System.out.println(symL2.seqString());

for (Iterator i = symL2.iterator(); i.hasNext();) {
    Symbol sym = (Symbol) i.next();
    System.out.println(sym.getName() + "#");
}

SymbolTokenization toke = symL2.getAlphabet().getTokenization("token");
for (Iterator i = symL2.iterator(); i.hasNext();) {
    Symbol sym = (Symbol) i.next();
    Alphabet arg = sym.getMatches();
    for (Iterator i2 = ((FiniteAlphabet) arg).iterator(); i2.hasNext();) {
        Symbol sym2 = (Symbol) i2.next();
        System.out.println("toke: " + toke.tokenizeSymbol(sym2));
        System.out.println("name: " + sym2.getName());
    }
    System.out.println("\n");
}
##################################

This will print out the single letter code:

System.out.println("toke: " + toke.tokenizeSymbol(sym2));

This will print out the three letter code:

System.out.println("name: " + sym2.getName());

Do you think it is worthwhile to put this sample code in the wiki?

Thanks,
Daniel

"Mark Schreiber" <markjschreiber at gmail.com> wrote:

> Hi -
>
> Yes, this is absolutely possible. If BioJava can create an unambiguous
> amino acid from an ambiguous codon it will. If the possible amino acids
> are a choice of 2 or more, an ambiguity symbol (BasisSymbol) is created
> that contains those amino acids.
>
> Note that if you turn any ambiguous amino acid into a String then you
> will just get an X, so you need to decompose it into its underlying
> AtomicSymbols.
>
> See http://biojava.org/wiki/BioJava:Cookbook:Alphabets:Ambiguous for
> some idea (except in your case you need to do the reverse).
>
> This would make another nice example for the cookbook, so when you get
> some demo code working it would be good if you could put it up on the
> wiki.
>
> - Mark
>
> On Wed, May 7, 2008 at 6:31 PM, community at struck.lu
> <community at struck.lu> wrote:
> > Hi,
> >
> > I am just beginning to use BioJava and I have a question concerning
> > the parsing of protein sequences containing ambiguities:
> >
> > Is it possible to get all the possible amino acids at each position
> > of the protein sequence with a single letter code instead of the
> > three letter code?
> >
> > Suppose I would translate a DNA sequence containing an "N", so the
> > protein translation would also contain ambiguities:
> >
> > SymbolList symL = DNATools.createDNA("atnatg");
> > SymbolList symL2 = DNATools.toProtein(symL);
> > Iterator symIt = symL2.iterator();
> > System.out.println(symL2.seqString());
> >
> > OUTPUT:
> > XM
> >
> > Symbol hm;
> > while (symIt.hasNext()) {
> >     hm = (Symbol) symIt.next();
> >     System.out.println(hm.getName());
> > }
> >
> > OUTPUT:
> > [MET ILE]
> > MET
> >
> > Would it be possible to output:
> > MI
> > M
> >
> > Regards,
> > Daniel Struck
> >
> > _________________________________________________________
> > Mail sent using root eSolutions Webmailer - www.root.lu
> > _______________________________________________
> > Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-l
_________________________________________________________
Mail sent using root eSolutions Webmailer - www.root.lu
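
For readers who want to see the idea behind the ambiguity symbols without the BioJava classes, here is a rough plain-Java sketch of the decomposition discussed in this thread. It is not BioJava code and does not claim to mirror how BioJava implements translation. The class name AmbiguousTranslationSketch and its methods are made up for illustration, it assumes the standard genetic code (NCBI table 1), and it only understands a, c, g, t and n, not the other IUPAC ambiguity letters. Each codon is expanded to every concrete codon it could stand for, and the single letter codes of the resulting amino acids are collected.

```java
import java.util.Set;
import java.util.TreeSet;

// Plain-Java illustration only; hypothetical names, not part of BioJava.
class AmbiguousTranslationSketch {
    // Standard genetic code (NCBI table 1), codons enumerated with bases in T, C, A, G order.
    private static final String BASES = "TCAG";
    private static final String AMINO_ACIDS =
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

    // Concrete bases a DNA letter may stand for. Assumption: only a, c, g, t, n are supported.
    private static String expand(char base) {
        char b = Character.toUpperCase(base);
        if (b == 'N') {
            return BASES;
        }
        if (BASES.indexOf(b) < 0) {
            throw new IllegalArgumentException("Unsupported base: " + base);
        }
        return String.valueOf(b);
    }

    // Sorted single letter codes of every amino acid a (possibly ambiguous) codon can encode.
    static Set<Character> possibleResidues(String codon) {
        if (codon.length() != 3) {
            throw new IllegalArgumentException("A codon needs exactly 3 bases: " + codon);
        }
        Set<Character> residues = new TreeSet<>();
        for (char b1 : expand(codon.charAt(0)).toCharArray()) {
            for (char b2 : expand(codon.charAt(1)).toCharArray()) {
                for (char b3 : expand(codon.charAt(2)).toCharArray()) {
                    int index = 16 * BASES.indexOf(b1) + 4 * BASES.indexOf(b2) + BASES.indexOf(b3);
                    residues.add(AMINO_ACIDS.charAt(index));
                }
            }
        }
        return residues;
    }

    // Translates reading frame 1. An unambiguous position is written as one letter,
    // an ambiguous one as its alternatives in square brackets.
    static String translate(String dna) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i + 3 <= dna.length(); i += 3) {
            Set<Character> residues = possibleResidues(dna.substring(i, i + 3));
            boolean ambiguous = residues.size() > 1;
            if (ambiguous) {
                out.append('[');
            }
            for (char residue : residues) {
                out.append(residue);
            }
            if (ambiguous) {
                out.append(']');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("atnatg"));       // [IM]M
        System.out.println(translate("atnatggnnatg")); // [IM]M[ADEGV]M
    }
}
```

The bracket notation is just one possible way to show the alternatives on a single line. With the BioJava code above, the same letters come from calling tokenizeSymbol on each symbol returned by getMatches().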


