[Biopython] Piping in and out of clustal

Ivan Gregoretti ivangreg at gmail.com
Thu Mar 28 16:13:57 UTC 2013


Indeed, from the command line I pipe in and out of clustal regularly.
I do it like this:

zcat input_big.fa.gz | clustalo -i /dev/stdin -o /dev/stdout | pigz -9 > output_big.fa.gz
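
For comparison, the same pipeline might be driven from Python roughly as follows. This is only a sketch under assumptions: the function name pipe_gzipped is made up for illustration, the default command relies on Clustal Omega reading stdin when given '-' as the input file and writing to stdout when no output file is named, and the whole input is held in memory, so it suits small inputs rather than truly big ones.

```python
import gzip
import subprocess


def pipe_gzipped(in_path, out_path, cmd=("clustalo", "-i", "-")):
    """Decompress in_path, pipe it through cmd, and gzip stdout to out_path.

    pipe_gzipped is a hypothetical helper name. The default cmd assumes a
    clustalo binary on PATH that reads stdin when given '-' as input.
    """
    # Read the decompressed input into memory (small inputs only).
    with gzip.open(in_path, "rb") as src:
        data = src.read()

    # Feed the bytes to the child's stdin and collect its stdout.
    result = subprocess.run(
        list(cmd),
        input=data,
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )

    # Recompress at the highest level, mirroring "pigz -9" (single-threaded here).
    with gzip.open(out_path, "wb", compresslevel=9) as dst:
        dst.write(result.stdout)
```

A call such as pipe_gzipped("input_big.fa.gz", "output_big.fa.gz") would then mirror the shell one-liner, minus the parallel compression that pigz provides.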

This suggestion of yours "stdout,stderr = clustalo_cline(stdin)" is
what I am looking for.

Would some good-hearted Biopythonian around the world share a code
snippet using this strategy? I'm curious to see how people convert
their SeqRecords, one by one, into properly formatted strings in
memory. I'm guessing that it is a very common task.

Anyone?

Thank you,

Ivan



Ivan Gregoretti, PhD



On Thu, Mar 28, 2013 at 11:42 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Thu, Mar 28, 2013 at 3:26 PM, Ivan Gregoretti <ivangreg at gmail.com> wrote:
>> Hello Biopythonians,
>>
>> Both Biopython's documentation and common sense agree that we
>> should not try to code in Python algorithms that have already been
>> developed and tested outside Python. That is why Biopython offers very
>> convenient interfaces to many bioinformatics programmes, Clustal
>> Omega among them.
>>
>> In general, this is how I set Clustal up
>>
>> clustalo_cline = ClustalOmegaCommandline(infile='input.fa', outfile='output.fa')
>>
>> and this is how I execute it
>>
>> stdout,stderr = clustalo_cline()
>>
>> That is fine. Here comes the problem:
>>
>> I need to run very brief clustal executions a million times (and on
>> multiple cores).
>> Is there a way to pass/get SeqRecords to/from Clustal without the
>> creation of input and output files?
>>
>> Or, simply put,
>>
>> Can I pipe in and out of Clustal from within Python?
>>
>> Thank you,
>>
>> Ivan
>
> Yes, according to the Clustal Omega documentation, by default it
> writes its results (alignments) to stdout, and if you use '-' as the
> input filename it will read from stdin.
> http://www.clustal.org/omega/README
>
> If the data is small, you can probably use strings in memory, e.g.
> stdout,stderr = clustalo_cline(stdin)
>
> Peter
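
The in-memory approach discussed in this thread might be sketched as below, using only the standard library so that it does not depend on a particular Biopython version. All helper names (to_fasta, parse_fasta, align, align_many) are hypothetical, records are plain (identifier, sequence) pairs rather than SeqRecords, and the command line assumes that clustalo is on PATH, reads stdin when given '-', and writes FASTA to stdout. With Biopython, record.format("fasta") and AlignIO.read(StringIO(text), "fasta") could presumably replace the two hand-rolled FASTA helpers.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Assumed command line: '-i -' asks Clustal Omega to read stdin; with no
# output file given, the alignment is expected on stdout.
CLUSTALO_CMD = ("clustalo", "-i", "-")


def to_fasta(records):
    """Format (identifier, sequence) pairs as a single FASTA string."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


def parse_fasta(text):
    """Parse FASTA text into a list of (identifier, sequence) pairs."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the identifier.
            name = (line[1:].split() or [""])[0]
            records.append([name, ""])
        elif records:
            # Sequence lines may be wrapped, so concatenate them.
            records[-1][1] += line
    return [(name, seq) for name, seq in records]


def align(records, cmd=CLUSTALO_CMD):
    """Pipe records through cmd via stdin/stdout, with no temporary files."""
    result = subprocess.run(
        list(cmd),
        input=to_fasta(records),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return parse_fasta(result.stdout)


def align_many(batches, cmd=CLUSTALO_CMD, workers=4):
    """Run many small alignments concurrently; results keep input order."""
    # Threads suffice here because the heavy work happens in child processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda batch: align(batch, cmd), batches))
```

Whether threads or separate worker processes are the better fit for a million short runs probably depends on how much of the time goes into process start-up, so the worker count above is only a placeholder.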


