[Bioperl-l] codon tables, finding ORFs

Robert Buels rmb32 at cornell.edu
Fri May 21 17:44:26 UTC 2010


Hi all,

Right now, Bio::Tools::CodonTable uses the NCBI table as its 'standard' 
table, described at 
http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi#SG1.

This table recognizes three different start codons: the usual ATG, plus 
TTG and CTG (which I'd never heard of before looking there; they seem 
to be rare).

The issue is, if you use this codon scheme to find open reading frames 
in nucleotide sequences, you get some ORFs beginning at these two 
(rare?) start codons, which I think would surprise a lot of biologists.
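
To make the effect concrete, here is a minimal sketch in plain Python 
(not BioPerl, and not how Bio::Tools::CodonTable is implemented). The 
find_orfs function, the toy sequence, and the simplified ORF definition 
(forward strand only, first start codon through the next in-frame stop) 
are all assumptions made up for illustration.

```python
# Stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, start_codons):
    """Return sorted (start, end) pairs, 0-based and half-open, for
    forward-strand ORFs: the first start codon in a frame through the
    next in-frame stop codon (stop included)."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the currently open start codon, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in start_codons:
                start = i
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, i + 3))
                start = None
    return sorted(orfs)

# Hypothetical toy sequence: a TTG...TAG stretch in frame 0 and an
# ATG...TAA stretch in frame 2.
seq = "CCCTTGAAATAGCCATGGGGTAA"

atg_only = find_orfs(seq, {"ATG"})
ncbi_starts = find_orfs(seq, {"ATG", "TTG", "CTG"})
print(atg_only)
print(ncbi_starts)
```

With only ATG as a start, the sketch reports the single ORF at 
positions 14-23; with all three NCBI start codons it also reports the 
TTG-initiated one at positions 3-12.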

Seems to me, this might be a problem.  I mean, a naive user (which just 
about everyone is!) would expect the default codon table to only 
recognize the canonical ATG as a start, right?  And would be rather 
displeased if BioPerl said (by default) that something starting with one 
of these rare codons was an open reading frame?

So I guess my question is, do we think BioPerl (Bio::Tools::CodonTable) 
should really recognize these rare start codons by default?

Rob



