[Biopython-dev] pypaml

Peter Cock p.j.a.cock at googlemail.com
Thu Aug 11 13:49:39 UTC 2011


On Thu, Aug 11, 2011 at 12:51 PM, Brandon Invergo <b.invergo at gmail.com> wrote:
> On Thu, 2011-08-11 at 12:36 +0100, Peter Cock wrote:
>> It's a shame you don't still have access to the Windows 7 box.
>>
>> I've just grabbed the current PAML 4.4 pre-compiled for Windows
>> and put it on my Windows machine which runs as a buildslave,
>> and put the binaries on the PATH:
>>
>> http://abacus.gene.ucl.ac.uk/software/paml.html
>> http://abacus.gene.ucl.ac.uk/software/paml4.4e.tar.gz
>>
>> None of the current unit tests actually use the binaries, do they?
>> Could you add a basic test (in a separate file which raises the
>> missing dependency exception to skip the test if the binary is
>> not on the path) for calling the tools?
>>
>> Peter
>
> No, I didn't include any tests that use the binaries because I wasn't
> sure if they would be on the main test machine. Also, generating the
> output which is used in other tests can take a lot of time in some
> cases. Instead, I've generated the output files myself and then accessed
> those from the tests. The one problem I have with this approach is that
> it's not very reproducible; if someone else wishes to add data files
> from later versions of PAML, they won't know how I generated them.

Next time there is a PAML release, you'll have to make some more
test files ;)

> Again
> the goal is to make sure that we're parsing each new version correctly,
> since the output format has been known to change between versions. I
> could create a readme file which contains the info and put it in the
> paml Tests subfolder. Sound reasonable?

Yes.

> I can create a Tests/test_PAML.py file to contain the proposed test. In
> it, I can try to run codeml, baseml and yn00 directly using subprocess,
> each on some bogus input. If the binaries are there, they'll throw an
> error which the test will catch. If they aren't, subprocess itself will
> throw an error. I can't do this check using Bio.Phylo.PAML because we,
> of course, aim to prevent bogus input from ever even reaching the
> binary. How does that sound? Is that what you had in mind?

I believe we're thinking along the same lines here - have a look at
test_Muscle_tool.py or test_Emboss.py and others like it. There is
some header code which tries to locate the binaries, and perhaps
check their version.

Some tools have a switch like -v or --help or similar which makes
them immediately exit, sometimes with a version number. This
is less trouble than trying to run them with a dummy input file.
Having had a quick play with ds.exe, it seems to insist on being
given an input file, so you may have to go that route. But see if
this is useful (on Unix machines you'd probably need /dev/null
instead of nul):

C:\repositories\biopython\Tests>ds nul
results go into out.txt

(1) collecting min, max, and mean       0:00
(2) variance-covariance matrix      0:00
(3) median, percentiles & serial correlation       0:00
(4) Histograms and 1-D densities
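
The same probe could be scripted so it works on both Windows and Unix. This is only a rough sketch, assuming a modern Python 3: probe_binary is a made-up helper name, and os.devnull supplies "nul" or "/dev/null" as appropriate. Note that a tool like ds may still write out.txt into the current directory.

```python
import os
import shutil
import subprocess


def probe_binary(name, timeout=30):
    """Run `name <null device>` and return its combined output.

    Returns None if the binary is not on the PATH. This helper is
    hypothetical; it is not part of Biopython or PAML.
    """
    path = shutil.which(name)
    if path is None:
        return None
    result = subprocess.run(
        [path, os.devnull],          # "nul" on Windows, "/dev/null" on Unix
        stdin=subprocess.DEVNULL,    # avoid hanging if the tool prompts for input
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,    # fold error messages into the same stream
        text=True,
        timeout=timeout,             # raises subprocess.TimeoutExpired if exceeded
    )
    return result.stdout
```

Something like probe_binary("ds") would then return the text shown above if the binary is installed, or None if it is not.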


If the binaries are missing or the wrong version, we raise
MissingExternalDependencyError and the test gets skipped.

If the binaries are present (and the right version), use the normal
unittest framework. Try to make the examples quick to run (aim
for well under a minute for the whole test), so use smaller data
files than might be typical.
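
As a rough illustration of that header code, something along these lines might do. It is a sketch, not the code in test_Emboss.py: require_binaries and PAML_TOOLS are made-up names, the fallback exception class exists only so the snippet runs without Biopython installed, and there is no version check because I have not looked at how the PAML tools report their version.

```python
import shutil

try:
    from Bio import MissingExternalDependencyError
except ImportError:
    # Stand-in so this sketch runs without Biopython installed.
    class MissingExternalDependencyError(Exception):
        pass

# Binary names as given in this thread; on Windows, shutil.which
# also matches the .exe variants via PATHEXT.
PAML_TOOLS = ("codeml", "baseml", "yn00")


def require_binaries(names=PAML_TOOLS):
    """Return {name: path}, or raise if any binary is not on the PATH."""
    found = {name: shutil.which(name) for name in names}
    missing = sorted(name for name, path in found.items() if path is None)
    if missing:
        raise MissingExternalDependencyError(
            "Install PAML (%s) if you want to run these tests."
            % ", ".join(missing)
        )
    return found
```

In the test file this would be called once at import time, before any unittest.TestCase classes are defined, so that the whole file is skipped when the tools are absent.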

Peter
