[EMBOSS] backtranseq

Peter Rice pmr at ebi.ac.uk
Thu Jul 21 14:00:30 UTC 2005


Nadeem Faruque wrote:

> While backtranseq is very clever in predicting the cDNA sequence based on peptide sequence by choosing codons according 
> to usage, would it not be very useful to have the option for it to return an answer in degenerate bases?
> 
> eg in human, the 'peptide' is simply 'M'
> backtranseq returns the most likely codon used, ie 'ATG'
> but since it could be TTG, CTG or ATG, it may be more useful for some people to return 'HTG'

Ummmm .... depends on the genetic code. In human I would expect ATG; in 
bacteria GTG is second choice and NTG would be the possible result - but only 
for a start codon of course (just one of the complexities of backtranslating - 
I think we must avoid inventing a start codon if the protein doesn't start 
with 'M' because the numbering gets complicated).

As this would need a different input (a genetic code, rather than a codon 
usage file) I would make this a different program - not difficult to write.
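
A rough sketch of what such a program might do, in Python rather than as an
EMBOSS application. Everything here is an assumption for illustration: the
function names are made up, the standard genetic code (NCBI table 1) is
hard-wired, and alternative start codons are ignored, so 'M' always gives ATG.
Each codon position is collapsed independently to an IUPAC ambiguity code, so
residues with split codon families (L, R, S) and the stop come out more
degenerate than the true codon set (Leu gives YTN, which also covers the Phe
codons TTT and TTC).

```python
from itertools import product

# IUPAC ambiguity code for each possible set of bases.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

# Standard genetic code, codons enumerated with bases in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def degenerate_codon(residue):
    """Collapse all codons for one residue into a single IUPAC triplet."""
    codons = [c for c, aa in CODON_TABLE.items() if aa == residue]
    if not codons:
        raise ValueError("no codons for residue %r" % residue)
    # Collect the bases seen at each of the three positions, then look up
    # the ambiguity code that stands for that set.
    return "".join(IUPAC[frozenset(c[i] for c in codons)] for i in range(3))


def backtranslate_degenerate(peptide):
    """Back-translate a peptide into a degenerate nucleotide sequence."""
    return "".join(degenerate_codon(r) for r in peptide.upper())


if __name__ == "__main__":
    print(backtranslate_degenerate("MKI"))
```

Reading a genetic code file instead of the hard-wired table, and deciding what
to do with the first codon, would be the real work.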

Any good suggestions for a program name?

> Returning a degenerate sequence would have the advantage (for some uses) of being usable by normal DNA-savvy 
> string-based search methods when finding the peptide coding location in nucleic acid sequences rather than having to use 
> similarity searches.  I could also see it being useful for designing PCR primers within coding regions.

... which leads on to whether EMBOSS should include such programs :-)
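
For illustration only, the string-based search described in the quoted text
could look something like the sketch below: each IUPAC symbol in the degenerate
sequence becomes a regular-expression character class. The names are
hypothetical, only the forward strand is searched, and this is not how any
existing EMBOSS program is implemented.

```python
import re

# Bases represented by each IUPAC nucleotide symbol.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degenerate_to_regex(pattern):
    """Turn a degenerate sequence such as 'ATGAAR' into a compiled regex."""
    parts = []
    for symbol in pattern.upper():
        bases = IUPAC_BASES[symbol]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    # The lookahead lets finditer report overlapping hits as well.
    return re.compile("(?=" + "".join(parts) + ")")


def find_matches(pattern, dna):
    """Return 0-based start positions where the pattern matches the DNA."""
    regex = degenerate_to_regex(pattern)
    return [m.start() for m in regex.finditer(dna.upper())]


if __name__ == "__main__":
    print(find_matches("ATGAAR", "CCATGAAGTTATGAAACC"))
```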

regards,

Peter Rice



