[Bioperl-l] protal2dna and Bio::SimpleAlign
Jason Stajich
jason.stajich at duke.edu
Mon Jan 24 11:08:50 EST 2005
Cool. I assume you know you can change the translation table used when
you call the 'translate' function in BioPerl (documentation below).
So if you start the whole thing from a set of CDS sequences, you
shouldn't have to do much messing around. Note that aa_to_dna_aln doesn't
do any checking to ensure that each codon can actually translate
into the amino acid you specified. That might be a good sanity check to
put in.
 Title   : translate
 Usage   : $protein_seq_obj = $dna_seq_obj->translate
           #if full CDS expected:
           $protein_seq_obj =
               $cds_seq_obj->translate(undef,undef,undef,undef,1);
 Function: Provides the translation of the DNA sequence using full
           IUPAC ambiguities in DNA/RNA and amino acid codes.

           The full CDS translation is identical to EMBL/TREMBL
           database translation. Note that the trailing terminator
           character is removed before returning the translation
           object.

           Note: if you set $dna_seq_obj->verbose(1) you will get a
           warning if the first codon is not a valid initiator.

           Added way of translating using a custom codon table. This
           has to be the final addition to this overloaded interface!
 Returns : A Bio::PrimarySeqI implementing object
 Args    : character for terminator (optional) defaults to '*'
           character for unknown amino acid (optional) defaults to 'X'
           frame (optional) valid values 0, 1, 2, defaults to 0
           codon table id (optional) defaults to 1
           complete coding sequence expected, defaults to 0 (false)
           boolean, throw exception if not complete CDS (true) or
               defaults to warning (false)
           codontable, a custom Bio::Tools::CodonTable object, optional
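
The sanity check suggested above could look roughly like the following.
This is only a sketch in plain Perl, not code from BioPerl: the helper
name check_cds_against_protein is made up for illustration, the table is
the standard genetic code, and it assumes that GTG, CTG and TTG may stand
for Met at the first codon only (the alternative starts mentioned in this
thread).

```perl
use strict;
use warnings;

# Standard genetic code (NCBI table 1); codons are enumerated in T, C, A, G order.
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon2aa;
my $n = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon2aa{"$b1$b2$b3"} = substr($aas, $n++, 1);
        }
    }
}

# Alternative initiators named in this thread (assumption: each may encode
# Met, but only at the first codon).
my %alt_start = map { $_ => 1 } qw(GTG CTG TTG);

# Hypothetical helper: compare one (possibly gapped) protein row against its
# in-frame CDS and return a list of mismatch messages (empty if consistent).
sub check_cds_against_protein {
    my ($protein, $cds) = @_;
    (my $residues = uc $protein) =~ s/[-.]//g;    # drop gap characters
    $cds = uc $cds;
    $cds =~ tr/U/T/;                              # accept RNA input as well
    my @problems;
    for my $pos (0 .. length($residues) - 1) {
        my $aa = substr($residues, $pos, 1);
        if (3 * $pos + 3 > length($cds)) {
            push @problems,
                sprintf('residue %d (%s): CDS too short', $pos + 1, $aa);
            last;
        }
        my $codon      = substr($cds, 3 * $pos, 3);
        my $translated = exists $codon2aa{$codon} ? $codon2aa{$codon} : 'X';
        next if $aa eq $translated;
        # An alternative initiator at the first codon is accepted as Met.
        next if $pos == 0 && $aa eq 'M' && $alt_start{$codon};
        push @problems,
            sprintf('residue %d: %s in alignment, but %s translates to %s',
                    $pos + 1, $aa, $codon, $translated);
    }
    return @problems;
}

# Example: the third residue (L) does not match its codon (GGG).
my @bad = check_cds_against_protein('MK-L', 'GTGAAAGGGTAA');
print "$_\n" for @bad;
```

Any trailing stop codon in the CDS is simply ignored here, since only as
many codons as there are residues get compared.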
-jason
On Jan 24, 2005, at 11:01 AM, Maureen L Coleman wrote:
> Thanks for the responses. The problem (with both protal2dna and
> tranalign), as Catherine recognized, is that even when I specify
> Bacterial translation, it doesn't recognize my alternative start
> codons (gtg,ctg,ttg can all be Met).
>
> As the quickest route, I went through and changed all my alternative
> start codons in the alignments to their "normal" translation. Then
> protal2dna and tranalign seem to work fine. aa_to_dna_aln should work
> for me too, since I already have the coding DNA sequences pulled out.
>
> thanks again,
> maureen
>
> On Monday, January 24, 2005, at 10:41 AM, Jason Stajich wrote:
>
>>
>> On Jan 24, 2005, at 10:28 AM, Catherine Letondal wrote:
>>
>>>
>>> On Jan 23, 2005, at 3:19 PM, Jason Stajich wrote:
>>>
>>>> I'm not familiar with the script.
>>>
>>> Web:
>>> http://bioweb.pasteur.fr/seqanal/interfaces/protal2dna.html
>>> Man:
>>> http://bioweb.pasteur.fr/docs/man/man/protal2dna.1.html
>>> Ftp:
>>> ftp://ftp.pasteur.fr/pub/GenSoft/unix/alignment/protal2dna
>>>
>>>>
>>>> Bio::Align::Utilities does protein to DNA mapping for an alignment
>>>> with the aa_to_dna_aln function.
>>>
>>> The problem with this function aa_to_dna_aln is that it is restricted
>>> to frame 1 and to the standard genetic code, right?
>>> aa_to_dna_aln
>>>
>> This is an alignment mapper routine, not an alignment routine itself.
>> So I think I was just being stupid and not looking at what
>> protal2dna really was doing.
>>
>> You provide it the protein multiple sequence alignment and
>> the coding sequences which gave rise to it. It maps the gaps back in
>> so you have a CDS alignment. Very basic iterating through the
>> alignment.
>>
>> So it all has to be in-frame and already spliced. It should have been
>> called aa_to_cds_aln.
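
The gap mapping described above can be sketched in a few lines of plain
Perl. This is an illustration of the idea only, not the code in
Bio::Align::Utilities: the helper names are made up, alignments are
plain hashes of id => string rather than Bio::Align::AlignI objects,
and every CDS is assumed to be in frame and already spliced.

```perl
use strict;
use warnings;

# Hypothetical helper: thread the gaps of one aligned protein row back into
# its ungapped, in-frame CDS. Each gap column becomes '---' and each residue
# consumes the next codon. Bases left over at the end (for example a stop
# codon) are ignored.
sub protein_row_to_cds_row {
    my ($aligned_protein, $cds) = @_;
    my ($out, $offset) = ('', 0);
    for my $col (split //, $aligned_protein) {
        if ($col eq '-' || $col eq '.') {
            $out .= '---';
        }
        else {
            die "CDS too short for protein row\n"
                if $offset + 3 > length($cds);
            $out .= substr($cds, $offset, 3);
            $offset += 3;
        }
    }
    return $out;
}

# Hypothetical wrapper: apply the row mapping to every sequence, matching
# protein rows to coding sequences by id.
sub protein_aln_to_cds_aln {
    my ($aa_aln, $cds_for) = @_;
    my %dna_aln;
    for my $id (keys %$aa_aln) {
        die "no CDS for $id\n" unless exists $cds_for->{$id};
        $dna_aln{$id} = protein_row_to_cds_row($aa_aln->{$id}, $cds_for->{$id});
    }
    return \%dna_aln;
}

# Example with two short, made-up sequences.
my %aa  = (seq1 => 'MK-L',         seq2 => 'M-RL');
my %cds = (seq1 => 'GTGAAACTGTAA', seq2 => 'ATGCGTTTA');
my $dna = protein_aln_to_cds_aln(\%aa, \%cds);
print "$_\t$dna->{$_}\n" for sort keys %$dna;
```

Note that nothing here translates the codons, so an alternative start
codon such as GTG passes through untouched; the mapping depends only on
the gap positions and on each id pointing at the right CDS.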
>>
>> The method is intended for getting ready to do Ka/Ks type stuff so
>> that you have aligned the sequences on codon boundaries and with
>> knowledge about conservative aa replacements.
>>
>> apologies for inciting confusion...
>> -j
>>
>>>  Title   : aa_to_dna_aln
>>>  Usage   : my $dnaaln = aa_to_dna_aln($aa_aln, \%seqs);
>>>  Function: Will convert an AA alignment to DNA space given the
>>>            corresponding DNA sequences. Note that this method expects
>>>            the DNA sequences to be in frame +1 (GFF frame 0) as it will
>>>            start to project into coordinates starting at the first base
>>>            of the DNA sequence, if this alignment represents a different
>>>            frame for the cDNA you will need to edit the DNA sequences
>>>            to remove the 1st or 2nd bases (and revcom if things should be).
>>>  Returns : Bio::Align::AlignI object
>>>  Args    : 2 arguments, the alignment and a hashref.
>>>            Alignment is a Bio::Align::AlignI of amino acid sequences.
>>>            The hash reference should have keys which are the display_ids
>>>            for the aa sequences in the alignment and the values are a
>>>            Bio::PrimarySeqI object for the corresponding spliced cDNA
>>>            sequence.
>>>
>>>
>>> The other problem when using tools offering several genetic codes
>>> (these sequences need a bacterial genetic code) is that the start
>>> codon of this code is not the right one. These sequences need: GTG=M
>>> (and not V).
>>>
>>>>
>>>> -jason
>>>> On Jan 22, 2005, at 4:07 PM, Maureen L Coleman wrote:
>>>>
>>>>> Hi.
>>>>> I'm trying to use the protal2dna script (downloaded from Pasteur
>>>>> site) to convert protein alignments back to DNA alignments. It
>>>>> works in some cases but not in others. In the cases where it
>>>>> doesn't work, it pulls out the same sequence twice instead of
>>>>> pulling out seq1 and seq2 from my protein alignment. Then when it
>>>>> tries to match it up with the corresponding DNA sequence, it
>>>>> doesn't work - it matches prot1 with dna1 (correctly) and prot1
>>>>> with dna2 (incorrectly).
>>>>>
>>>>> I suspect this might be related to the name,start,end (nse) method
>>>>> in Bio::SimpleAlign. Any suggestions?
>>>>>
>>>>> Thanks,
>>>>> Maureen
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at portal.open-bio.org
>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>> --
>>>> Jason Stajich
>>>> jason.stajich at duke.edu
>>>> http://www.duke.edu/~jes12/
>>>>
>>>
>>>
>>
>
>
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/