[Biopython-dev] pypaml

Eric Talevich eric.talevich at gmail.com
Sat Jan 15 05:35:48 UTC 2011


Hi Brandon,

Thanks for volunteering! I think this will be a nice addition to Biopython
and particularly Bio.Phylo.

Some thoughts on organization:

On Fri, Jan 14, 2011 at 10:40 AM, Brad Chapman <chapmanb at 50mail.com> wrote:

>
> The functionality here looks great. My stylistic suggestion would be
> to separate the code for running the commandline from that used to
> parse the output file. Ideally these would be two separate classes
> that could live under the Bio.Phylo namespace:
>
> https://github.com/biopython/biopython/tree/master/Bio/Phylo
>

I agree.


> For the commandline code, it would be nice to have a
> Bio.Phylo.Applications that is organized similarly to
> Bio.Align.Applications:
>
> https://github.com/biopython/biopython/tree/master/Bio/Align/Applications
>
> This will give you some flexibility as you want to expand out to
> support other programs, and provide a framework for additional
> phylogenetic commandline utilities.
>

Since it sounds like you might eventually write wrappers for other programs
in the PAML suite, a layout like this might work:

Bio/Phylo/Applications/_codeml.py
 -- just the wrapper for running the command-line program, perhaps based on
the Bio.Application classes. The API for calling the wrapper goes through
__init__.py; the user doesn't import this module directly. (See
Bio.Align.Applications)


Bio/Phylo/PAML/codeml.py
 -- all the code for parsing the output of the command-line program, and
working with that dictionary/class. Any other modules this depends on would
also go here, as would the other code for working with the input/output of
other PAML programs.
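
To make the split concrete, here is a rough sketch of what the wrapper side
could look like. It is only an illustration: it uses the standard library's
subprocess module rather than the Bio.Application classes, the function names
are hypothetical, and it assumes codeml takes the path to its control file as
its single argument. Note that it does no parsing at all.

```python
import subprocess


def build_codeml_command(ctl_file="codeml.ctl", executable="codeml"):
    """Return the argument list for one codeml run.

    Assumption: the program is invoked as `codeml <control file>`.
    """
    return [executable, ctl_file]


def run_codeml(ctl_file="codeml.ctl", executable="codeml", working_dir=None):
    """Run codeml and return its exit code.

    The output file named in the control file is left on disk for a
    separate parsing module to read; nothing is parsed here.
    """
    cmd = build_codeml_command(ctl_file, executable)
    completed = subprocess.run(cmd, cwd=working_dir, check=False)
    return completed.returncode
```

Keeping the command construction in its own function means it can be checked
without having codeml installed.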


> Separating parsing from commandline generation can also let you move
> the _results dictionary from being a class member to a return value for
> a parse function. This is a more straightforward workflow than
> assigning an internal class attribute as a side effect.
>

Yes. Also, the user might have saved the output from a codeml run previously
(maybe from a shell script/pipeline), and want to parse it without
re-running codeml through a Python wrapper. Right? (Sorry if I misunderstood
your code.)
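
Something along these lines is what I have in mind for the parsing side: a
plain function that takes the lines of a saved output file and returns a
dictionary. This is a sketch under assumptions, not a real codeml parser. The
function names are hypothetical, and the "name (note) = number" line pattern
is a simplification chosen for illustration; the actual output has many more
sections.

```python
import re

# Simplified, illustrative pattern: "name (optional note) = number".
# Real codeml output is richer than this; treat the pattern as an assumption.
_VALUE_LINE = re.compile(
    r"^\s*(\w+)\s*(?:\([^)]*\))?\s*=\s*(-?\d+(?:\.\d+)?)\s*$"
)


def parse_results(lines):
    """Return a dict of parameter names to float values.

    `lines` is any iterable of strings, such as an open file handle.
    Lines that do not match the pattern are skipped.
    """
    results = {}
    for line in lines:
        match = _VALUE_LINE.match(line)
        if match:
            results[match.group(1)] = float(match.group(2))
    return results


def read_results(path):
    """Parse a previously saved output file without running codeml."""
    with open(path) as handle:
        return parse_results(handle)
```

Because the results come back as a return value, the same function works on
output produced by a shell pipeline long before any Python code was involved.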

I look forward to seeing your branch on GitHub. Please let us know if you
have any problems along the way.

All the best,
Eric


