[EMBOSS] using sixpack

Matthias Dodt matthias.dodt at mdc-berlin.de
Thu Dec 3 09:39:43 UTC 2009


Hello Peter!

Thank you very much, getorf is sufficient for me.

greetings

mat

Peter Rice wrote:
> On 12/01/2009 04:17 PM, Matthias Dodt wrote:
>> Hi there!
>>
>> I have some problems using sixpack for 6-frame translation. I want to
>> convert a fasta file of contigs with sixpack. The command is:
>>
>> sixpack contigs.fa -outseq protein_sequence
>>
>> The problem is that sixpack only converts the first sequence in the
>> fasta file. How can I force it to process the whole file?
>
> Two options:
>
> One is to change the EMBOSS code to loop over each sequence.
>
> The other is to write a script that extracts each sequence in turn and
> launches sixpack.
>
> We can consider this for the next EMBOSS release. It applies to other
> applications too. In general, would users (and developers of web and
> other interfaces) be happy if more applications could read every
> sequence in a fasta file?
>
> This raises questions of how to mark up the output so that it is clear
> where each result comes from. There will always be applications where
> it is more sensible to process only a single sequence.
>
> A third option (there is so often another way):
>
> getorf will find and report open reading frames in all input sequences
>
> getorf contigs.fa -outseq protein_sequence
>
> There will be differences in the output: by default, getorf only reports
> ORFs of at least 30 nucleotides. You get the same effect in sixpack with
> -orfmin 10 (oops, sixpack counts amino acids - we will try to make them
> consistent in the next release!)
>
> You can also add -minsize 3 to the getorf command line to report all
> ORFs like sixpack does.
>
> Hope this helps,
>
> Peter Rice
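
For anyone who does need sixpack output for every contig, the "script that
extracts each sequence in turn" option could look roughly like the sketch
below. It is only an illustration, not something from the EMBOSS
distribution: the helper names (read_fasta, sixpack_command, run_all), the
output directory and the file naming are all made up here, and the sixpack
qualifiers used (-sequence, -outfile, -outseq, -auto) should be checked
against "sixpack -help" for your EMBOSS version.

```python
import subprocess
import sys
from pathlib import Path


def read_fasta(text):
    """Yield (header, sequence) pairs from multi-FASTA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def sixpack_command(fasta_path, prefix):
    """Build the sixpack argument list for one single-sequence file.

    Assumption: -auto suppresses interactive prompts; output names
    (<prefix>.sixpack, <prefix>.pep) are hypothetical choices.
    """
    return [
        "sixpack",
        "-sequence", str(fasta_path),
        "-outfile", f"{prefix}.sixpack",
        "-outseq", f"{prefix}.pep",
        "-auto",
    ]


def run_all(multi_fasta, workdir="sixpack_out"):
    """Write each record to its own file and launch sixpack on it."""
    out = Path(workdir)
    out.mkdir(exist_ok=True)
    records = read_fasta(Path(multi_fasta).read_text())
    for index, (header, seq) in enumerate(records, start=1):
        prefix = out / f"seq{index:05d}"
        single = out / f"seq{index:05d}.fa"
        single.write_text(f">{header}\n{seq}\n")
        subprocess.run(sixpack_command(single, prefix), check=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    run_all(sys.argv[1])
```

Saved under a name of your choice (say split_sixpack.py) and called with
contigs.fa as its only argument, this would leave one .fa, one .sixpack and
one .pep file per contig in sixpack_out/. Numbering the files instead of
using the FASTA headers avoids problems with headers that contain spaces or
slashes.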

-- 
------------------------------------------------
Matthias Dodt

Scientific Programmer at Bioinformatics platform AG Dieterich

Berlin Institute for Medical Systems Biology at the
Max-Delbrueck-Center for Molecular Medicine
Robert-Roessle-Strasse 10, 13125 Berlin, Germany

fon: +49 30 9406 4261
email: matthias.dodt at mdc-berlin.de



