[Bioperl-l] Frame translation gets an extra aa?

Aaron Mackey amackey at virginia.edu
Mon Jan 17 14:10:37 UTC 2011


I did say that I was guessing ...

the fact that -frame ranges between 0 and 2 makes sense to a programmer, but
not so much to a biologist who has a numerical understanding of the
concept "reading frame"; I was imagining that the BioPerl API had used the
more "natural" frame range of 1..3.

sorry for muddying the waters,

-Aaron

On Sun, Jan 16, 2011 at 2:00 AM, Karger, Amir <akarger at cgr.harvard.edu> wrote:

> Wait, what? Aaron, I'm not a biologist, so please give me a couple more
> sentences here.
>
> Also, the docs (and code) don't seem to support your numbers. From
> http://www.bioperl.org/wiki/BioPerl_Tutorial:
>
>    You can also determine the frame of the translation. The default frame
> starts at the first nucleotide (frame 0). To get translation in the next
> frame we would write:
>    $prot_obj = $my_seq_object->translate(-frame => 1);
>
> From http://doc.bioperl.org/releases/bioperl-1.6.1/ PrimarySeqI
> documentation (and my 1.5 perldoc Bio::PrimarySeqI):
>    Args:...
>    -frame         - frame                           default is 0
>
> From the code linked to at the doc.bioperl link above:
>
>         ## use frame, error if frame is not 0, 1 or 2
>                 $self->throw("Valid values for frame are 0, 1, or 2, not $frame.")
>                        unless ($frame == 0 or $frame == 1 or $frame == 2);
>                 $seq = substr($seq,$frame);
>
> What am I missing here? All the docs I see seem to use frame as "the number
> of bp we move to the right before we start translating codons 3 bp at a
> time". But if that code is being run when I do a translate() I should really
> be getting the answer I expect, and not four aas. And yet the Deobfuscator
> tells me that Bio::Seq::translate is inheriting from PrimarySeqI. And I get
> the same four-aa result if I create a PrimarySeq instead of a Seq.
>
> Aha. Now I see that PrimarySeq::translate calls CodonTable::translate after
> taking the substr. CodonTable::translate() says:
>
>  if the codon is two nucleotides long and if by adding
>               an [sic] a third character 'N', it codes for a single amino
>               acid (with exceptions above), return that, otherwise
>               return empty string.
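The two-nucleotide rule quoted above can be illustrated outside BioPerl. What follows is a minimal plain-Python sketch, not BioPerl's implementation: it assumes the standard genetic code, ignores the "exceptions above" that the CodonTable documentation refers to, and the helper name complete_partial_codon is made up for this illustration.

```python
from itertools import product

# Standard genetic code (NCBI table 1), with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def complete_partial_codon(dinucleotide):
    """Hypothetical helper (not a BioPerl function).

    Return the amino acid if every possible third base gives the same
    residue, otherwise return an empty string. Expects two uppercase
    DNA letters from A, C, G, T.
    """
    if len(dinucleotide) != 2:
        raise ValueError("expected exactly two nucleotides")
    residues = {CODON_TABLE[dinucleotide + third] for third in BASES}
    return residues.pop() if len(residues) == 1 else ""


print(complete_partial_codon("GG"))        # G: GGT, GGC, GGA, GGG all code for glycine
print(repr(complete_partial_codon("AA")))  # '': AAT/AAC give N, AAA/AAG give K
```

Under this rule a trailing GG is unambiguous, which is consistent with the NPLG result reported in the thread.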
>
> Are you sure that's what every user of PrimarySeq::translate wants? If so,
> please put something in the docs about it. Also, is there an option that
> will let me say "translate 11 bp to only 3 aa"? From looking at the code, it
> looks like no. I guess I can do this on my own if frame is 1.
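Doing this "on my own" amounts to applying the frame offset and then dropping any trailing one or two bases before translating. The sketch below shows that idea in plain Python rather than BioPerl. It assumes the standard genetic code; the function name translate_whole_codons is hypothetical and does not correspond to any BioPerl option.

```python
from itertools import product

# Standard genetic code (NCBI table 1), with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_whole_codons(seq, frame=0):
    """Hypothetical helper: translate only complete codons.

    frame is the number of bases (0, 1 or 2) skipped from the left,
    mirroring the substr($seq, $frame) step quoted in the thread.
    Only A, C, G, T are handled; other characters raise KeyError.
    """
    if frame not in (0, 1, 2):
        raise ValueError(f"Valid values for frame are 0, 1, or 2, not {frame}.")
    seq = seq.upper()[frame:]
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


print(translate_whole_codons("AAACCCTTTGGG", frame=1))  # NPL
print(translate_whole_codons("AAACCCTTTGGG", frame=0))  # KPFG
```

With frame 1 the 12-base example leaves 11 bases, and the final GG is discarded instead of being completed to a glycine.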
>
> Slightly less confused,
>
> -Amir
>
> ________________________________________
> From: ajmackey at gmail.com [ajmackey at gmail.com] On Behalf Of Aaron Mackey [
> amackey at virginia.edu]
> Sent: Saturday, January 15, 2011 18:34
> To: Chris Fields
> Cc: Karger, Amir; bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Frame translation gets an extra aa?
>
> I'm guessing the confusion might be the differences in terminology between
> reading frame (taking a value of 1, 2 or 3) and leading intron phase (a
> value of 0, 1 or 2, which corresponds to a reading frame of 1, 3 or 2,
> respectively) ... ?
>
> -Aaron
>
> On Fri, Jan 14, 2011 at 1:25 PM, Chris Fields <cjfields at illinois.edu> wrote:
> Amir,
>
> Um, the sequence you have has 4 codons:
>
> AAA CCC TTT GGG
>
> Taking off the final 'G' gives the correct response:
>
> perl -l -MBio::Seq -e '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGG"); print $x->translate(-frame=>1)->seq'
> NPL
>
> chris
>
> On Jan 14, 2011, at 12:06 PM, Karger, Amir wrote:
>
> > Apologies if this question has been asked before, or if it's so stupid
> > that nobody was silly enough to ask it before.
> >
> > (Using Bioperl 1.6.1)
> >
> > perl -l -MBio::Seq -e '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGGG"); print $x->translate(-frame=>1)->seq'
> > NPLG
> >
> > Um, why is GG being translated to G? Shouldn't you not translate if you
> > only have 2 bp left? That is, even if you know that GGX translates to amino
> > acid G for X in (A,C,G,T) you don't actually have that third bp right now.
> > In real life, would an mRNA get translated even if it's missing the third
> > base pair?
>


