[Biopython-dev] pypaml

Brad Chapman chapmanb at 50mail.com
Sat Jun 11 15:59:00 UTC 2011


Brandon;

> It's been quite a while since I've updated you with my PAML progress.
> My side projects had to take a back seat to my PhD research for a
> while, so I couldn't work on it. Anyway, I finally got back to it and
> implemented some much-needed restructuring as suggested. 

Thanks very much for taking this on. The restructuring looks
fantastic.

> I've taken the suggestion to split the parsing task into
> several functions so I hope it's all a bit more readable now. I
> certainly think it is; I was hesitant at first but now that it's done
> I see how much better it is. 

Really glad the comments were helpful. Reworking a bunch of working
code purely for the sake of refactoring it is one of the hardest things
in programming, and you've done excellent work.

I only have one more small suggestion. A number of the functions
take a results dictionary and then modify it directly, taking
advantage of the fact that it's the same object. For instance,
'parse_parameters' in _parse_baseml.py looks like:

results["parameters"] = {}
parse_parameter_list(lines, results, num_params)
parse_kappas(lines, results)
parse_rates(lines, results)
parse_freqs(lines, results)

A nice way to do this is to pass in and return the modified
dictionary, so it is clear what is happening in the function.
Ideally, this would look like:

parameters = {}
parameters = parse_parameter_list(lines, parameters, num_params)
parameters = parse_kappas(lines, parameters)
parameters = parse_rates(lines, parameters)
parameters = parse_freqs(lines, parameters)
results["parameters"] = parameters

For someone reading the code this makes it more explicit that each
of those functions modifies the 'parameters' dictionary. Otherwise
the side effects that change the results or parameters dictionary
could be missed.
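
As a rough sketch of that pattern, here is a self-contained toy version.
The helper bodies and the input line formats are made up purely for
illustration and are not the actual code in _parse_baseml.py; only the
function names and the pass-in-and-return shape come from the example
above.

```python
import re

# Hypothetical stand-ins for the helpers in _parse_baseml.py. The line
# formats ("kappa = ...", "rates: ...") are invented for this sketch.


def parse_kappas(lines, parameters):
    """Return 'parameters' with a 'kappa' entry added if one is found."""
    for line in lines:
        match = re.search(r"kappa\s*=\s*(\d+\.\d+)", line)
        if match:
            parameters["kappa"] = float(match.group(1))
    return parameters


def parse_rates(lines, parameters):
    """Return 'parameters' with a 'rates' list added if one is found."""
    for line in lines:
        if line.startswith("rates:"):
            parameters["rates"] = [float(x) for x in line.split()[1:]]
    return parameters


def parse_parameters(lines, results):
    """Build the parameters dictionary step by step and attach it."""
    parameters = {}
    parameters = parse_kappas(lines, parameters)
    parameters = parse_rates(lines, parameters)
    results["parameters"] = parameters
    return results


lines = ["kappa = 2.5", "rates: 0.1 0.9"]
results = parse_parameters(lines, {})
```

Note that the dictionary is still modified in place, since Python passes
the same object around. Returning it does not change the behavior; it
only makes the data flow visible at each call site.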

For the Chi2 question, I agree 100% with Peter and Eric. The pure
Python version could be useful, but there's no sense rewriting a C
version if one already exists externally in SciPy. PyCogent also has
some functionality here:

http://pycogent.sourceforge.net/cookbook/standard_statistical_analyses.html#chi-square

Thanks again for all your work,
Brad


