[Biojava-l] Short names for Amino acid symbols

Peter Robinson peter.robinson at t-online.de
Sun Jul 27 15:57:28 UTC 2008


Hi,

thanks to all on the list who helped me get started with Biojava, and by 
the way, the online documents are quite helpful!

I am trying to develop some code to look for signs of positive selection 
in human sequences by building multiple alignments of protein sequences, 
mapping the nucleotide sequences onto each alignment, and comparing 
synonymous and nonsynonymous nucleotide substitutions across several 
species (etc.).

A few small questions:
1) I have written a class to encapsulate all I need from a given Genbank 
mRNA sequence; the entire mRNA, the CDS and the corresponding protein 
sequence. I have some methods such as the following:

    private void setCDSSequence() {
        Feature CDS = getCDSFeature(this.completeSequence);
        Location loc = CDS.getLocation();
        // -3 to remove the stop codon
        SymbolList symL = this.completeSequence.subList(loc.getMin(), loc.getMax() - 3);
        this.CDS = symL;
    }
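
The -3 arithmetic can be checked in isolation on plain strings. The 
sketch below is not BioJava code: the class and method names are 
hypothetical, and it assumes 1-based inclusive coordinates (as in the 
subList call above), a contiguous forward-strand CDS, and a CDS 
location that includes the stop codon.

```java
public class CdsCoordinates {

    /**
     * Hypothetical helper: returns the coding region without its stop codon.
     * min and max are 1-based, inclusive positions of the CDS on the mRNA,
     * and max is assumed to be the last base of the stop codon.
     */
    static String codingRegionWithoutStop(String mrna, int min, int max) {
        if (min < 1 || max > mrna.length() || max - min + 1 < 3) {
            throw new IllegalArgumentException("CDS location out of range");
        }
        // 1-based inclusive [min, max - 3] is 0-based half-open [min - 1, max - 3)
        return mrna.substring(min - 1, max - 3);
    }

    public static void main(String[] args) {
        // Toy mRNA: 2-base 5' UTR, then ATG GCT TAA, then 2-base 3' UTR
        String mrna = "GGATGGCTTAACC";
        System.out.println(codingRegionWithoutStop(mrna, 3, 11)); // prints ATGGCT
    }
}
```

If the CDS location did not include the stop codon, the same call would 
silently drop the last real codon, so the assumption is worth checking 
against the Genbank record.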

Question: Why is there (seemingly) no way in Biojava to create a 
Sequence object instead of a SymbolList object? Or did I miss something?

2) I would then like to print out the protein alignment to check it for 
correctness, and it seems there is no way of getting from a symbol to 
the one-letter amino acid code. That is,

proteinAlignment.get(j).symbolAt(k).getName()

will return "Ala" instead of "A" etc. Is there a good way of getting the 
short symbols?
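
As a stopgap, the conversion can be done with a plain lookup table. The 
sketch below is ordinary Java, not a BioJava API: the class and method 
names are hypothetical, it covers only the 20 standard amino acids, it 
assumes names of the form "Ala" (matched case-insensitively), and it 
returns 'X' for anything it does not recognize.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class AminoAcidCodes {

    // Three-letter name (upper case) -> one-letter code, standard amino acids only
    private static final Map<String, Character> ONE_LETTER = new HashMap<>();

    static {
        String[][] pairs = {
            {"ALA", "A"}, {"ARG", "R"}, {"ASN", "N"}, {"ASP", "D"}, {"CYS", "C"},
            {"GLN", "Q"}, {"GLU", "E"}, {"GLY", "G"}, {"HIS", "H"}, {"ILE", "I"},
            {"LEU", "L"}, {"LYS", "K"}, {"MET", "M"}, {"PHE", "F"}, {"PRO", "P"},
            {"SER", "S"}, {"THR", "T"}, {"TRP", "W"}, {"TYR", "Y"}, {"VAL", "V"}
        };
        for (String[] pair : pairs) {
            ONE_LETTER.put(pair[0], pair[1].charAt(0));
        }
    }

    /** Maps a three-letter name such as "Ala" to 'A'; unknown names give 'X'. */
    static char toOneLetter(String threeLetterName) {
        Character code = ONE_LETTER.get(threeLetterName.toUpperCase(Locale.ROOT));
        return code == null ? 'X' : code;
    }

    /** Converts a list of three-letter names into a one-letter string. */
    static String toOneLetterString(List<String> names) {
        StringBuilder sb = new StringBuilder(names.size());
        for (String name : names) {
            sb.append(toOneLetter(name));
        }
        return sb.toString();
    }
}
```

With this in place, the value returned by getName() for each symbol in 
the alignment could be passed to toOneLetter before printing. Gaps and 
ambiguity symbols would all come out as 'X', which may or may not be 
acceptable for a visual check.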

Thanks, Peter


