<div dir="ltr"><div dir="ltr">Dear Biopythoneers,<br><br>Biopython has a lot of command line tool wrappers, based around the objects in Bio/Application/__init__.py, for building a command line string and running it. Some time ago I started to think that we might actually be better off dropping our in-house command line wrappers, and recommending a standard or third party library approach for defining and executing command line strings instead.<br><br>Take an example from our tutorial: running the blastx tool from NCBI BLAST+. Currently Biopython provides a specific object for the blastx command, which knows all the expected command arguments, can do some validation, and even has some basic help text included for each of them:<br><br><br>>>> from Bio.Blast.Applications import NcbiblastxCommandline<br>>>> help(NcbiblastxCommandline)<br>...<br>>>> blastx_cline = NcbiblastxCommandline(query="opuntia.fasta", db="nr", evalue=0.001, outfmt=5, out="opuntia.xml")<br>>>> blastx_cline<br>NcbiblastxCommandline(cmd='blastx', out='opuntia.xml', outfmt=5, query='opuntia.fasta',<br>db='nr', evalue=0.001)<br>>>> print(blastx_cline)<br>blastx -out opuntia.xml -outfmt 5 -query opuntia.fasta -db nr -evalue 0.001<br>>>> stdout, stderr = blastx_cline()<br><br><br>This works quite nicely, but writing a unique class for each command line tool we wish to support is a lot of quite tedious work, especially if including minimal documentation for the arguments or argument validation. This is also an ongoing maintenance problem - one of the issues I think we should fix before the next Biopython release is updating the NCBI BLAST+ wrappers as new arguments have been added.<br><br>Some tools have a rather cryptic command line API, and in those cases perhaps our efforts are sensible. 
However, with tools like NCBI BLAST+ where there is a clear command line API, I don't see that our efforts actually add a great deal over constructing the string in code and calling subprocess:<br><br><br>>>> import subprocess<br>>>> cmd = "blastx -query opuntia.fasta -db nr -out opuntia.xml -evalue 0.001 -outfmt 5"<br>>>> subprocess.check_call(cmd, shell=True)<br><br><br>Perhaps there are third party libraries which might be easier? For example, the sh library supports our current style with keyword arguments:<br><br><br>>>> from sh import blastx<br>>>> blastx(query="opuntia.fasta", db="nr", out="opuntia.xml", evalue="0.001", outfmt="5", _long_prefix="-")<br><br><br>You can avoid repeating the extra argument (needed because the NCBI tools do not follow the minus-minus prefix convention), e.g.:<br><br><br>>>> import sh<br>>>> blastx = sh.blastx.bake(_long_prefix="-")<br>>>> blastx(query="opuntia.fasta", db="nr", out="opuntia.xml", evalue="0.001", outfmt="5")<br><br><br>See <a href="https://github.com/amoffat/sh" target="_blank">https://github.com/amoffat/sh</a><br><br>This is close to the same usability our wrapper offers, but with no ongoing maintenance burden. It would need more investigation (especially commands where the order is critical, often seen on macOS but not Linux), but Windows support aside it seems attractive.<br><br>If there was a cross-platform system which offered this Python-like syntax for specifying the command line arguments, that would be a tempting alternative. I don't think plumbum (Latin for lead, as used for pipes in the past) does, and I find this form heavy:<br><br>>>> from plumbum import local<br>>>> cmd = local["blastx"]["-query", "opuntia.fasta", "-db", "nr", "-out", "opuntia.xml", "-evalue", "0.001", "-outfmt", "5"]<br>>>> cmd()<br>''<br><br>See <a href="https://github.com/tomerfiliba/plumbum" target="_blank">https://github.com/tomerfiliba/plumbum</a><br><br>What do people think? 
Do you have a favourite third party library for this kind of thing?<font color="#888888"><br><br>Peter</font><br></div></div>