[Biojava-l] amino acid to nucleic acid alignment

David Huen smh1008 at cam.ac.uk
Wed Jan 4 15:52:19 EST 2006


On Jan 4 2006, Alex Golubev wrote:

>Hi,
>
> I'm trying to align amino acids to nucleic acids. I'm using gapped
> sequences both for the protein and for the DNA. I have several problems
> and I would very much appreciate it if someone could help.
>
> 1. How can I parse DNA nucleic acids and get codons? I would like to
> start with DNA that looks like this "ATGTAT" and get a protein that
> looks like this "MY". I'm using
> "Alphabet alpha = DNATools.getCodonAlphabet();" but I can't find a
> tokenization to parse the DNA string (does this make any sense?).

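You can convert a SymbolList in the DNA alphabet into the equivalent symbol
list in the codon alphabet (DNAxDNAxDNA) by using the views in
SymbolListViews. Note that orderNSymbolList(..., 3) gives overlapping
triplets, whereas windowedSymbolList(..., 3) gives the non-overlapping
codons you would want for translation.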
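
As a rough illustration of what a codon view amounts to for an ungapped
sequence, here is a plain-Java sketch that does not use BioJava at all. The
class CodonSketch and its method names are made up for this example, and it
assumes the standard genetic code and unambiguous upper- or lower-case
A/C/G/T input with a length that is a multiple of three.

```java
// Plain-Java sketch; CodonSketch and its methods are hypothetical names,
// not part of the BioJava API.
final class CodonSketch {
    // Base order used to index the genetic code table below.
    private static final String BASES = "TCAG";

    // Standard genetic code, one letter per codon, with codons enumerated
    // in T, C, A, G order at each of the three positions ('*' = stop).
    private static final String AMINO =
        "FFLLSSSSYY**CC*W" + "LLLLPPPPHHQQRRRR"
      + "IIIMTTTTNNKKSSRR" + "VVVVAAAADDEEGGGG";

    // Split an ungapped DNA string into non-overlapping triplets.
    static String[] codons(String dna) {
        if (dna.length() % 3 != 0) {
            throw new IllegalArgumentException("Length is not a multiple of 3");
        }
        String[] out = new String[dna.length() / 3];
        for (int i = 0; i < out.length; i++) {
            out[i] = dna.substring(3 * i, 3 * i + 3);
        }
        return out;
    }

    // Look up the amino acid for a single three-base codon.
    static char translateCodon(String codon) {
        int index = 0;
        for (int i = 0; i < 3; i++) {
            int b = BASES.indexOf(Character.toUpperCase(codon.charAt(i)));
            if (b < 0) {
                throw new IllegalArgumentException(
                    "Not an unambiguous DNA base: " + codon.charAt(i));
            }
            index = index * 4 + b;
        }
        return AMINO.charAt(index);
    }

    // Translate a whole ungapped DNA string, codon by codon.
    static String translate(String dna) {
        StringBuilder protein = new StringBuilder();
        for (String codon : codons(dna)) {
            protein.append(translateCodon(codon));
        }
        return protein.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("ATGTAT")); // prints MY
    }
}
```

This sketch deliberately rejects gap characters, so a string such as
"AT-G-TAT" would need its gaps handled before it could be split this way.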



> 2. My other problem is that there are frame shifts, and my gapped DNA
> actually looks like this "AT-G-TAT". Is there any way to get/translate
> locations from the codon symbol list to/from the DNA symbol list?
>
Ouch.  What do you really want to do here?

Regards,
David Huen

