[Bioperl-l] protal2dna and Bio::SimpleAlign

Mon Jan 24 11:08:50 EST 2005

cool - I assume you know you can change the translation table used when 
you call the 'translate' function in bioperl.

So if you start the whole thing from a set of CDS sequences, you 
shouldn't have to do much messing around.  Note that aa_to_dna_aln 
doesn't do any checking to ensure that each codon can actually be 
translated into the protein residue it is aligned to.  That might be a 
good sanity check to put in.
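
As a rough illustration of the kind of sanity check meant here, below is a minimal sketch in plain Python (not BioPerl, and not how aa_to_dna_aln is implemented). It projects one gapped protein row onto its CDS and reports residues whose codon does not translate to them. The function name project_row, the ALT_STARTS set (taken from the start codons mentioned in this thread), and the choice to accept M for an alternative initiator at the first codon are all assumptions made for illustration.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: initiator codons that may be read as Met at the first position.
ALT_STARTS = frozenset({"ATG", "GTG", "CTG", "TTG"})


def project_row(aligned_protein, cds, start_codons=ALT_STARTS, gap="-"):
    """Map protein gaps onto the CDS and collect codon/residue mismatches.

    Returns (aligned_cds, mismatches), where each mismatch is a tuple of
    (alignment column, expected residue, offending codon).
    """
    cds = cds.upper()
    aligned_cds = []
    mismatches = []
    pos = 0  # offset of the next unread codon in the CDS
    for column, residue in enumerate(aligned_protein.upper()):
        if residue == gap:
            # A protein gap becomes three gap characters in the CDS row.
            aligned_cds.append(gap * 3)
            continue
        codon = cds[pos:pos + 3]
        acceptable = {CODON_TABLE.get(codon, "X")}
        if pos == 0 and codon in start_codons:
            acceptable.add("M")  # alternative initiator read as Met
        if residue not in acceptable:
            mismatches.append((column, residue, codon))
        aligned_cds.append(codon)
        pos += 3
    return "".join(aligned_cds), mismatches


if __name__ == "__main__":
    print(project_row("M-K", "GTGAAA"))
    print(project_row("M-K", "GTGGGG"))
```

With these inputs the first call returns ('GTG---AAA', []) and the second returns ('GTG---GGG', [(2, 'K', 'GGG')]), flagging the codon that cannot encode the aligned residue.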

  Title   : translate
  Usage   : $protein_seq_obj = $dna_seq_obj->translate
            #if full CDS expected:
            $protein_seq_obj = $cds_seq_obj->translate(undef,undef,undef,undef,1);
  Function:

            Provides the translation of the DNA sequence using full
            IUPAC ambiguities in DNA/RNA and amino acid codes.

            The full CDS translation is identical to EMBL/TREMBL
            database translation. Note that the trailing terminator
            character is removed before returning the translation
            object.

            Note: if you set $dna_seq_obj->verbose(1) you will get a
            warning if the first codon is not a valid initiator.

            Added way of translating using a custom codon table.  This
            has to be the final addition to this overloaded interface!

  Returns : A Bio::PrimarySeqI implementing object
  Args    : character for terminator (optional) defaults to '*'
            character for unknown amino acid (optional) defaults to 'X'
            frame (optional) valid values 0, 1, 2, defaults to 0
            codon table id (optional) defaults to 1
            complete coding sequence expected, defaults to 0 (false)
            boolean, throw exception if not complete CDS (true) or
            defaults to warning (false)
            codontable, a custom Bio::Tools::CodonTable object, optional
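
To make the "complete coding sequence" option a bit more concrete, here is a minimal plain-Python sketch (not BioPerl code). It assumes, for illustration only, that in complete-CDS mode the first codon is read as M when it is one of the initiators named in this thread, and that a trailing terminator is dropped as the documentation above describes. The function translate_cds and its parameters are hypothetical, and the standard table is used for every codon.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: start codons taken from this thread (gtg, ctg, ttg) plus atg.
ALT_STARTS = frozenset({"ATG", "GTG", "CTG", "TTG"})


def translate_cds(dna, complete_cds=False, start_codons=ALT_STARTS,
                  terminator="*", unknown="X"):
    """Translate DNA in frame 0; optionally treat it as a complete CDS."""
    dna = dna.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    protein = [CODON_TABLE.get(codon, unknown) for codon in codons]
    protein = [terminator if aa == "*" else aa for aa in protein]
    if complete_cds and codons:
        if codons[0] in start_codons:
            protein[0] = "M"  # initiator codon read as Met
        if protein[-1] == terminator:
            protein.pop()  # drop the trailing stop
    return "".join(protein)


if __name__ == "__main__":
    print(translate_cds("GTGAAATAA"))
    print(translate_cds("GTGAAATAA", complete_cds=True))
```

The first call prints VK* and the second prints MK, which is the difference that matters for the GTG/CTG/TTG starts discussed below.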
-jason

On Jan 24, 2005, at 11:01 AM, Maureen L Coleman wrote:

> Thanks for the responses.  The problem (with both protal2dna and 
> tranalign), as Catherine recognized, is that even when I specify 
> Bacterial translation, it doesn't recognize my alternative start 
> codons (gtg,ctg,ttg can all be Met).
>
> As the quickest route, I went through and changed all my alternative 
> start codons in the alignments to their "normal" translation.  Then 
> protal2dna and tranalign seem to work fine.  aa_to_dna_aln should work 
> for me too, since I already have the coding DNA sequences pulled out.
>
> thanks again,
> maureen
>
> On Monday, January 24, 2005, at 10:41  AM, Jason Stajich wrote:
>
>>
>> On Jan 24, 2005, at 10:28 AM, Catherine Letondal wrote:
>>
>>>
>>> On Jan 23, 2005, at 3:19 PM, Jason Stajich wrote:
>>>
>>>> I'm not familiar with the script.
>>>
>>> Web:
>>> http://bioweb.pasteur.fr/seqanal/interfaces/protal2dna.html
>>> Man:
>>> http://bioweb.pasteur.fr/docs/man/man/protal2dna.1.html
>>> Ftp:
>>> ftp://ftp.pasteur.fr/pub/GenSoft/unix/alignment/protal2dna
>>>
>>>>
>>>> Bio::Align::Utilities does protein to DNA mapping for an alignment 
>>>> with the aa_to_dna_aln function.
>>>
>>> The problem with this function aa_to_dna_aln is that it is restricted 
>>> to frame 1 and to the standard genetic code, right?
>>>        aa_to_dna_aln
>>>
>> This is an alignment mapper routine, not an alignment routine itself. 
>> So I think I was just being stupid and not looking at what 
>> protal2dna really was doing.
>>
>> You provide it the protein multiple sequence alignment and 
>> the coding sequences which gave rise to it.  It maps the gaps back in 
>> so you have a CDS alignment.  Very basic iterating through the 
>> alignment.
>>
>> So it all has to be in-frame and already spliced; it should have been 
>> called aa_to_cds_aln.
>>
>> The method is intended for getting ready to do Ka/Ks type stuff, so 
>> that you have aligned the sequences on codon boundaries and with 
>> knowledge about conservative aa replacements.
>>
>> apologies for inciting confusion...
>> -j
>>
>>>         Title   : aa_to_dna_aln
>>>         Usage   : my $dnaaln = aa_to_dna_aln($aa_aln, \%seqs);
>>>         Function: Will convert an AA alignment to DNA space given the
>>>                   corresponding DNA sequences.  Note that this method
>>>                   expects the DNA sequences to be in frame +1 (GFF
>>>                   frame 0) as it will start to project into
>>>                   coordinates starting at the first base of the DNA
>>>                   sequence, if this alignment represents a different
>>>                   frame for the cDNA you will need to edit the DNA
>>>                   sequences to remove the 1st or 2nd bases (and
>>>                   revcom if things should be).
>>>         Returns : Bio::Align::AlignI object
>>>         Args    : 2 arguments, the alignment and a hashref.
>>>                   Alignment is a Bio::Align::AlignI of amino acid
>>>                   sequences.
>>>                   The hash reference should have keys which are
>>>                   the display_ids for the aa
>>>                   sequences in the alignment and the values are a
>>>                   Bio::PrimarySeqI object for the corresponding
>>>                   spliced cDNA sequence.
>>>
>>>
>>> The other problem when using tools offering several genetic codes 
>>> (these sequences need a bacterial genetic code) is that the start 
>>> codon of this code is not the right one. These sequences need GTG=M 
>>> (and not V).
>>>
>>>>
>>>> -jason
>>>> On Jan 22, 2005, at 4:07 PM, Maureen L Coleman wrote:
>>>>
>>>>> Hi.
>>>>> I'm trying to use the protal2dna script (downloaded from Pasteur 
>>>>> site) to convert protein alignments back to DNA alignments. It 
>>>>> works in some cases but not in others.  In the cases where it 
>>>>> doesn't work, it pulls out the same sequence twice instead of 
>>>>> pulling out seq1 and seq2 from my protein alignment.  Then when it 
>>>>> tries to match it up with the corresponding DNA sequence, it 
>>>>> doesn't work - it matches prot1 with dna1 (correctly) and prot1 
>>>>> with dna2 (incorrectly).
>>>>>
>>>>> I suspect this might be related to the name,start,end (nse) method 
>>>>> in Bio::SimpleAlign.  Any suggestions?
>>>>>
>>>>> Thanks,
>>>>> Maureen
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at portal.open-bio.org
>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>> --
>>>> Jason Stajich
>>>> jason.stajich at duke.edu
>>>> http://www.duke.edu/~jes12/
>>>>
>>>
>>>
>>
>
>
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/