[Bioperl-l] automation of translation based on alignment

Ross KK Leung ross at cuhk.edu.hk
Mon Mar 22 01:22:47 UTC 2010


Dear Florent,

Sorry for clicking "reply" instead of "reply-all". Here are the details of my
problem:

Input:

1000 DNA sequences in a multiple alignment.
One of them has a GenBank file:
http://www.ncbi.nlm.nih.gov/nuccore/DQ089804.1?ordinalpos=1

The remaining 999 have only genomic sequences.

Objective: to derive the corresponding aligned protein sequences (four sets,
as there are four overlapping genes).

Difficulties: 
1) circular genome
2) there may be in-dels

I hope the problem is now clear. Ross
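
A minimal sketch of one way to approach the objective above, written in plain
Python rather than BioPerl purely for illustration. It assumes the annotated
sequence is one row of the alignment and that each CDS location is given as
1-based inclusive segments on that reference, in translation order (like a
GenBank join(), which is also how a gene spanning the origin of a circular
genome is written). All function names, sequences and coordinates below are
hypothetical toy values, not taken from DQ089804. In-dels whose length is not
a multiple of three shift the reading frame; this sketch does not detect or
repair that.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def columns_by_position(ref_row):
    """Map each 1-based ungapped reference position to its 0-based alignment column."""
    columns = {}
    position = 0
    for column, char in enumerate(ref_row):
        if char != "-":
            position += 1
            columns[position] = column
    return columns


def project_cds(ref_row, target_row, segments):
    """Cut the reference CDS segments out of another aligned row and drop its gaps.

    segments: list of (start, end) pairs, 1-based inclusive reference coordinates.
    Columns where the reference has a gap inside a segment are kept, so
    insertions in the target are carried into the result.
    """
    columns = columns_by_position(ref_row)
    pieces = []
    for start, end in segments:
        pieces.append(target_row[columns[start]:columns[end] + 1])
    return "".join(pieces).replace("-", "")


def translate(dna):
    """Translate in frame 1; unknown or ambiguous codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


# Toy alignment: the target has a 3-base insertion inside the CDS and a
# 1-base deletion outside it.
ref_row = "GCC---AAATAACCCATG"
target_row = "GCCGGTAAATAACC-ATG"
# Hypothetical CDS wrapping the origin of a 15-base circular genome,
# i.e. join(13..15,1..9) in GenBank notation.
segments = [(13, 15), (1, 9)]

print(translate(project_cds(ref_row, ref_row, segments)))     # MAK*
print(translate(project_cds(ref_row, target_row, segments)))  # MAGK*
```

Running the same projection once per annotated gene would give the four sets
of protein sequences; overlapping genes need no special handling because each
gene is cut out independently.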

-----Original Message-----
From: Florent Angly [mailto:florent.angly at gmail.com] 
Sent: Monday, March 22, 2010 9:14 AM
To: Ross KK Leung; bioperl-l List
Subject: Re: [Bioperl-l] automation of translation based on alignment

Hi Ross,

Please keep replies on the BioPerl mailing list so that everyone benefits.

You should give detailed explanations of what you are trying to achieve,
e.g.:
     * What type of input file do you have?
     * Do you already know the location of the ORFs?
     * What is the multiple alignment you are talking about?
...

Florent


On 22/03/10 11:07, Ross KK Leung wrote:
> Dear Florent,
>
> Thanks for your response. While the one with a GenBank file can be
> extracted, those without have to rely on the alignment. Scripts can certainly
> be written to move forward and backward along the multiple alignment, but it
> is an error-prone process, and that is why I raised this question.
>
> Rgds, Ross
>
>
>
> -----Original Message-----
> From: Florent Angly [mailto:florent.angly at gmail.com]
> Sent: Monday, March 22, 2010 8:44 AM
> To: Ross KK Leung
> Cc: Bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] automation of translation based on alignment
>
> Hi Ross,
> It seems like your answer is in the link you posted. On that page, all the
> coding sequences are already identified and their amino acid sequences are
> provided. You simply need to parse all the GenBank entries to extract
> this information. You may use EUtilities to achieve this online:
> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
> Florent
>
> On 21/03/10 09:55, Ross KK Leung wrote:
>    
>> Dear bioperl users,
>>
>>
>>
>> I am working on virus sequences, and one of the GenBank files is here:
>>
>>
>>
>> http://www.ncbi.nlm.nih.gov/nuccore/DQ089804.1?ordinalpos=1
>>
>>
>>
>> With 1000 such nucleotide sequences, I'd like to translate the corresponding
>> protein-coding sequences. The difficulties lie in:
>>
>>
>>
>> 1)      The genome sequence is circular
>>
>> 2)      The genes are overlapping
>>
>>
>>
>> I don't have GenBank files for all 1000 sequences, but I plan to use the one
>> above as a guide to direct the automation process. Has BioPerl implemented
>> specialized functions to handle this kind of problem?
>>
>>
>>
>> Thanks a lot for your advice, Ross
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>      
>
>
>    





