[Bioperl-l] Frame translation gets an extra aa?
Karger, Amir
akarger at CGR.Harvard.edu
Sun Jan 16 07:00:15 UTC 2011
Wait, what? Aaron, I'm not a biologist, so please give me a couple more sentences here.
Also, the docs (and code) don't seem to support your numbers. From http://www.bioperl.org/wiki/BioPerl_Tutorial:
You can also determine the frame of the translation. The default frame starts at the first nucleotide (frame 0). To get translation in the next frame we would write:
$prot_obj = $my_seq_object->translate(-frame => 1);
From http://doc.bioperl.org/releases/bioperl-1.6.1/ PrimarySeqI documentation (and my 1.5 perldoc Bio::PrimarySeqI):
Args:...
-frame - frame default is 0
From the code linked to at the doc.bioperl link above:
## use frame, error if frame is not 0, 1 or 2
$self->throw("Valid values for frame are 0, 1, or 2, not $frame.")
unless ($frame == 0 or $frame == 1 or $frame == 2);
$seq = substr($seq,$frame);
What am I missing here? All the docs I see seem to use frame as "the number of bp we move to the right before we start translating codons 3 bp at a time". But if that code is being run when I do a translate() I should really be getting the answer I expect, and not four aas. And yet the Deobfuscator tells me that Bio::Seq::translate is inheriting from PrimarySeqI. And I get the same four-aa result if I create a PrimarySeq instead of a Seq.
Aha. Now I see that PrimarySeq::translate calls CodonTable::translate after taking the substr. CodonTable::translate() says:
if the codon is two nucleotides long and if by adding
an [sic] a third character 'N', it codes for a single amino
acid (with exceptions above), return that, otherwise
return empty string.
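The rule quoted above can be sketched in core Perl. This is only an illustration of the described behavior, not BioPerl's actual implementation: the helper names translate_codon and translate_with_partial are hypothetical, the table is the standard genetic code, and ambiguity codes other than the implied trailing 'N' are not handled.

```perl
use strict;
use warnings;

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
my $AAS   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my @BASES = qw(T C A G);
my %CODON;
my $i = 0;
for my $b1 (@BASES) {
    for my $b2 (@BASES) {
        for my $b3 (@BASES) {
            $CODON{"$b1$b2$b3"} = substr($AAS, $i++, 1);
        }
    }
}

# Hypothetical helper: translate one codon of length 3 or 2.
sub translate_codon {
    my ($codon) = @_;
    $codon = uc $codon;
    if (length($codon) == 3) {
        my $aa = $CODON{$codon};
        return defined $aa ? $aa : 'X';    # 'X' for anything not in this small table
    }
    # Two bases: accept only if every possible third base gives the same amino acid.
    return '' unless $codon =~ /^[TCAG]{2}$/;
    my %seen;
    $seen{ $CODON{ $codon . $_ } } = 1 for @BASES;
    my @aas = keys %seen;
    return @aas == 1 ? $aas[0] : '';
}

# Hypothetical helper: skip $frame bases, then translate 3 bp at a time,
# passing a trailing partial codon to translate_codon as well.
sub translate_with_partial {
    my ($seq, $frame) = @_;
    $seq = substr($seq, $frame);
    my $protein = '';
    for (my $pos = 0; $pos < length($seq); $pos += 3) {
        $protein .= translate_codon(substr($seq, $pos, 3));
    }
    return $protein;
}

print translate_with_partial('AAACCCTTTGGG', 1), "\n";    # NPLG (trailing GG -> G)
print translate_with_partial('AAACCCTTTGG',  1), "\n";    # NPL  (trailing G is dropped)
```

Under these assumptions a trailing GG becomes G because GGT, GGC, GGA and GGG all encode glycine, while a trailing TT would yield nothing because TTT/TTC and TTA/TTG encode different amino acids.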
Are you sure that's what every user of PrimarySeq::translate wants? If so, please put something in the docs about it. Also, is there an option that will let me say "translate 11 bp to only 3 aa"? From looking at the code, it looks like no. I guess I can do this on my own if frame is 1.
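One way to do this by hand, as a minimal sketch: trim the sequence to whole codons before it ever reaches translate(). The helper name trim_to_whole_codons is hypothetical and not part of BioPerl; it only does the string arithmetic.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the first $frame bases, then drop any trailing
# bases that do not complete a codon.
sub trim_to_whole_codons {
    my ($seq, $frame) = @_;
    die "frame must be 0, 1, or 2\n"
        unless defined $frame && $frame =~ /^[012]$/;
    return '' if length($seq) <= $frame;
    my $in_frame = substr($seq, $frame);
    my $usable   = length($in_frame) - length($in_frame) % 3;
    return substr($in_frame, 0, $usable);
}

print trim_to_whole_codons('AAACCCTTTGGG', 1), "\n";    # AACCCTTTG
```

The trimmed string can then be handed to Bio::Seq->new(-seq => ...) and translated with the default frame of 0, as in the one-liners elsewhere in this thread. Since only whole codons remain, the two-nucleotide rule should never come into play.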
Slightly less confused,
-Amir
________________________________________
From: ajmackey at gmail.com [ajmackey at gmail.com] On Behalf Of Aaron Mackey [amackey at virginia.edu]
Sent: Saturday, January 15, 2011 18:34
To: Chris Fields
Cc: Karger, Amir; bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Frame translation gets an extra aa?
I'm guessing the confusion might be the differences in terminology between reading frame (taking a value of 1, 2 or 3) and leading intron phase (a value of 0, 1 or 2, which corresponds to a reading frame of 1, 3 or 2, respectively) ... ?
-Aaron
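The correspondence stated in this message (phase 0, 1, 2 maps to reading frame 1, 3, 2) can be written as a one-line formula. The sketch below only restates that mapping; the function name phase_to_reading_frame is hypothetical, and how either number relates to BioPerl's -frame argument is not something this snippet settles.

```perl
use strict;
use warnings;

# Hypothetical helper restating the mapping given in the message:
# phase 0 -> frame 1, phase 1 -> frame 3, phase 2 -> frame 2.
sub phase_to_reading_frame {
    my ($phase) = @_;
    die "phase must be 0, 1, or 2\n"
        unless defined $phase && $phase =~ /^[012]$/;
    my $frame = (3 - $phase) % 3 + 1;
    return $frame;
}

printf "phase %d -> reading frame %d\n", $_, phase_to_reading_frame($_) for 0 .. 2;
```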
On Fri, Jan 14, 2011 at 1:25 PM, Chris Fields <cjfields at illinois.edu> wrote:
Amir,
Um, the sequence you have has 4 codons:
AAA CCC TTT GGG
Taking off the final 'G' gives the correct response:
perl -l -MBio::Seq -e '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGG"); print $x->translate(-frame=>1)->seq'
NPL
chris
On Jan 14, 2011, at 12:06 PM, Karger, Amir wrote:
> Apologies if this question has been asked before, or if it's so stupid that nobody was silly enough to ask it before.
>
> (Using Bioperl 1.6.1)
>
> perl -l -MBio::Seq -e '$x=Bio::Seq->new(-display_id=>"foo",-seq=>"AAACCCTTTGGG"); print $x->translate(-frame=>1)->seq'
> NPLG
>
> Um, why is GG being translated to G? Shouldn't you not translate if you only have 2 bp left? That is, even if you know that GGX translates to amino acid G for X in (A,C,G,T) you don't actually have that third bp right now. In real life, would an mRNA get translated even if it's missing the third base pair?