[Bioperl-l] Translating alternate start codons

Tue Nov 21 16:16:01 UTC 2006

> From: Brian Osborne [mailto:bosborne11 at verizon.net]
>
> Amir,
> 
> The best documentation for translate() is in the online 
> Bioperl Tutorial,
> have you checked that?
> 
> Brian O.

Thanks for the quick response. The tutorial is quite informative.
It seems to me that the POD needs to document -complete more thoroughly,
though:

                  Or if you expect a complete coding sequence (CDS) translation,
                  with initiator at the beginning and terminator at the end:

                  $protein_seq_obj = $cds_seq_obj->translate(-complete => 1);

This doesn't really explain what it does.

I guess -complete was chosen as a compromise between having too many
options and having lots of functionality. In my case, I want to keep the
*, and I don't want warnings about terminators in the middle, because
I've got a bunch of pseudogenes. So I'll just translate the M myself.
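
A minimal sketch of what "translate the M myself" could look like. The helper
name translate_keep_stops is hypothetical (it is not part of BioPerl); it
relies only on Bio::Tools::CodonTable's translate() and is_start_codon(), and
assumes the sequence is already in frame. It keeps every '*' and changes only
the first residue:

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Bio::Tools::CodonTable;

# Hypothetical helper (not part of BioPerl): translate an in-frame DNA
# string, keep all '*' characters, and replace the first residue with M
# when the first codon is a start codon in the chosen codon table.
sub translate_keep_stops {
    my ($dna, $table_id) = @_;

    my $table   = Bio::Tools::CodonTable->new( -id => $table_id );
    my $protein = $table->translate($dna);

    # Only touch the first residue; internal and trailing stops are left alone.
    if ( length($dna) >= 3
        && $table->is_start_codon( substr($dna, 0, 3) ) ) {
        substr($protein, 0, 1, 'M');
    }
    return $protein;
}

print "Table 4: ", translate_keep_stops('ATATGATAA', 4), "\n";
print "Table 1: ", translate_keep_stops('ATATGATAA', 1), "\n";
```

Given the output quoted further down, this should turn the table 4 result from
IW* into MW*, and leave the table 1 result as I**, since ATA is not a start
codon in the standard table.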

I'm sure you've had many "the documentation is spread out in too many
places" discussions before, and I know keeping docs up to date is Hard.
Oh well.

-Amir

> 
> 
> On 11/21/06 10:21 AM, "Amir Karger" <akarger at CGR.Harvard.edu> wrote:
> 
> > I think this is more a Bio question than a Bioperl question.
> > 
> > I did this:
> > 
> > #########
> > #!/usr/local/bin/perl
> > 
> > use strict;
> > use warnings;
> > 
> > use Bio::Seq;
> > use Bio::Tools::CodonTable;
> > 
> > my $seqobj = Bio::PrimarySeq->new (
> >     -seq => 'ATATGATAA',
> >     -id  => 'GeneFragment-12',
> >     -accession_number => 'X78121',
> >     -alphabet => 'dna',
> > );
> > 
> > my $myCodonTable = Bio::Tools::CodonTable->new( -id => 4 );
> > my $is = $myCodonTable->is_start_codon('ATA') ? "is" : "is not";
> > print "ATA $is a valid start codon\n";
> > print "Table 4: ", $seqobj->translate("-codontable_id" => 4)->seq, "\n";
> > print "Table 1: ", $seqobj->translate("-codontable_id" => 1)->seq, "\n";
> > ###########
> > 
> > I got this:
> > ATA is a valid start codon
> > Table 4: IW*
> > Table 1: I**
> > 
> > But EMBL tells me that EMBLCDS:AAT64955 starts with an M:
> > 
> > http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-id+3b6PL1TmQt3+-e+[EMBLCDS:'AAT64955']+-qnum+1+-enum+3
> > 
> > So, does Bioperl purposely not translate start codons to M, while EMBL
> > does? Am I supposed to just change the I to M explicitly in my code? I
> > didn't see an obvious option to translate() to do it.
> > 
> > Thanks,
> > 
> > - Amir Karger
> > Research Computing
> > Life Sciences Division
> > Harvard University
> > 617-496-0626