[Biopython] Piping in and out of clustal
Ivan Gregoretti
ivangreg at gmail.com
Thu Mar 28 16:13:57 UTC 2013
Indeed, from the command line I pipe in and out of clustal regularly.
I do it like this:
zcat input_big.fa.gz | clustalo -i /dev/stdin -o /dev/stdout | pigz -9 > output_big.fa.gz
This suggestion of yours "stdout,stderr = clustalo_cline(stdin)" is
what I am looking for.
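For reference, here is a minimal sketch of that in-memory strategy using only the standard library's subprocess module, not Biopython's wrapper. The helper name pipe_through is hypothetical, and the clustalo invocation assumes that '-i -' reads from stdin and that output goes to stdout by default, as described in the README quoted below. A stand-in echo command is used so that the sketch runs even where clustalo is not installed.

```python
import subprocess
import sys


def pipe_through(cmd, text):
    """Send `text` to the stdin of `cmd`; return (stdout, stderr) as strings.

    Hypothetical helper: no temporary files are created, the data
    travels through pipes only.
    """
    result = subprocess.run(
        cmd,
        input=text,           # written to the child's stdin
        capture_output=True,  # collect stdout and stderr in memory
        text=True,            # work with str rather than bytes
        check=True,           # raise CalledProcessError on non-zero exit
    )
    return result.stdout, result.stderr


# Assumed Clustal Omega invocation: '-' as the input filename means stdin,
# and the alignment is written to stdout by default.
CLUSTALO_CMD = ["clustalo", "-i", "-"]

# Stand-in filter that copies stdin to stdout, so the sketch can be run
# without clustalo being installed.
ECHO_CMD = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]

fasta_text = ">a\nACGT\n>b\nACGA\n"
stdout, stderr = pipe_through(ECHO_CMD, fasta_text)
```

Swapping ECHO_CMD for CLUSTALO_CMD should give the alignment in stdout, provided clustalo is on the PATH. Since each call is independent, the same helper could be handed to a worker pool for the multi-core case.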
Would some good-hearted Biopythonian around the world share a code
snippet using this strategy? I'm curious to see how people convert
their SeqRecords, one by one, into properly formatted strings in
memory. I'm guessing that it is a very common task.
Anyone?
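One possible way to do the SeqRecord-to-string step is sketched here, assuming FASTA is the format wanted and using SeqIO with an in-memory StringIO handle in place of a file. The function names records_to_fasta and fasta_to_records are hypothetical, and the two short records are made-up example data.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord


def records_to_fasta(records):
    """Format SeqRecords as one FASTA string held in memory."""
    handle = StringIO()
    SeqIO.write(records, handle, "fasta")
    return handle.getvalue()


def fasta_to_records(text):
    """Parse a FASTA string back into a list of SeqRecords."""
    return list(SeqIO.parse(StringIO(text), "fasta"))


# Made-up example records; an empty description keeps the FASTA
# header down to the bare id.
records = [
    SeqRecord(Seq("ACGT"), id="a", description=""),
    SeqRecord(Seq("ACGA"), id="b", description=""),
]

fasta_text = records_to_fasta(records)      # string suitable for a child's stdin
round_trip = fasta_to_records(fasta_text)   # same approach works on captured stdout
```

The string in fasta_text is what would be passed as stdin, and the text captured from stdout can be parsed the same way. For aligned output, AlignIO.read(StringIO(stdout), "fasta") should return an alignment object instead of a plain list.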
Thank you,
Ivan
Ivan Gregoretti, PhD
On Thu, Mar 28, 2013 at 11:42 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Thu, Mar 28, 2013 at 3:26 PM, Ivan Gregoretti <ivangreg at gmail.com> wrote:
>> Hello Biopythonians,
>>
>> Both Biopython's documentation and common sense agree that we
>> should not try to code in Python algorithms that have already been
>> developed and tested outside Python. That is why Biopython offers very
>> convenient interfaces to many bioinformatics programmes, Clustal
>> Omega among them.
>>
>> In general, this is how I set Clustal up
>>
>> clustalo_cline = ClustalOmegaCommandline(infile='input.fa', outfile='output.fa')
>>
>> and this is how I execute it
>>
>> stdout,stderr = clustalo_cline()
>>
>> That is fine. Here comes the problem:
>>
>> I need to run a million very brief clustal executions (and on
>> multiple cores).
>> Is there a way to pass/get SeqRecords to/from Clustal without the
>> creation of input and output files?
>>
>> Or, simply put,
>>
>> Can I pipe in and out of Clustal from within Python?
>>
>> Thank you,
>>
>> Ivan
>
> Yes, according to the Clustal Omega documentation, by default it
> writes its results (alignments) to stdout, and if you use '-' as the
> input filename it will read from stdin.
> http://www.clustal.org/omega/README
>
> If the data is small, you can probably use strings in memory, e.g.
> stdout,stderr = clustalo_cline(stdin)
>
> Peter