[Biopython-dev] [Wg-phyloinformatics] BioGeography update

Peter biopython at maubp.freeserve.co.uk
Thu Jul 9 21:53:42 UTC 2009


On Thu, Jul 9, 2009 at 8:46 PM, Eric Talevich<eric.talevich at gmail.com> wrote:
> The proposal is to extract the Tree class hierarchy so that other modules
> can share it, and Biopython users can do I/O with trees as easily as they
> currently can with sequences ("from Bio import TreeIO; for tree in
> TreeIO.parse('example.xml', 'phyloxml'): ...").
> ...

Yes :)

> In the above case, TreeIO.py is a new file containing wrappers for the read
> and parse functions in my PhyloXML module, and also Nexus and Newick,
> pending integration. ...
>
> Alternatively, the individual modules that implement each format for I/O can
> be collected under a new TreeIO directory, with __init__ implementing the
> wrappers: ...

Either idea sounds reasonable. However, for future extensibility, and
also consistency with Bio.SeqIO and Bio.AlignIO, I would suggest we
have Bio/TreeIO/__init__.py (i.e. a folder containing as many
wrappers or parsers as needed) rather than just using Bio/TreeIO.py
(a single file).
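
As a rough illustration of the package layout suggested above, the
__init__.py could simply map format names to per-format iterator
functions. This is only a sketch under assumptions: the registry name,
the stand-in Newick splitter and the read/parse signatures are
hypothetical, not existing Biopython code.

```python
# Hypothetical sketch of Bio/TreeIO/__init__.py; every name here is illustrative.
from io import StringIO


def _newick_iterator(handle):
    """Stand-in parser: yield one raw Newick string per ';'-terminated tree."""
    for chunk in handle.read().split(";"):
        chunk = chunk.strip()
        if chunk:
            yield chunk + ";"


# Format name -> iterator function; a real package would register one entry
# per submodule (for example a PhyloXML or Nexus wrapper, names assumed).
_FORMAT_TO_ITERATOR = {
    "newick": _newick_iterator,
}


def parse(handle, format):
    """Iterate over the trees in handle, dispatching on the format name."""
    try:
        iterator = _FORMAT_TO_ITERATOR[format.lower()]
    except KeyError:
        raise ValueError("Unknown format %r" % format)
    return iterator(handle)


def read(handle, format):
    """Return the single tree in handle; raise ValueError otherwise."""
    trees = list(parse(handle, format))
    if len(trees) != 1:
        raise ValueError("Expected exactly one tree, found %d" % len(trees))
    return trees[0]


if __name__ == "__main__":
    for tree in parse(StringIO("(A,B);\n(C,(D,E));\n"), "newick"):
        print(tree)
```

With a folder rather than a single file, supporting another format
means adding one module and one registry entry, and the public
parse/read functions stay unchanged.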

Note that the Nexus parser is much more than just a tree parser.
NEXUS files can contain trees, but much more besides (including a
multiple sequence alignment, and instructions to phylogenetic
tools). In the short term for TreeIO and Nexus, I would just have
Bio/TreeIO/NexusIO.py as a thin wrapper that calls Bio.Nexus and
converts its trees into the standard trees (i.e. we don't have to
make any changes to Bio.Nexus immediately). In the longer term,
it would make sense for Bio.Nexus to start using the new tree
objects - but we also have backwards compatibility to think about.
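
The thin-wrapper idea described above might look roughly like the
following. To keep the sketch self-contained it does not import
Bio.Nexus: LegacyTree is a hypothetical stand-in for a parser-specific
tree and Clade a hypothetical stand-in for the shared tree object, so
the attribute names are assumptions rather than the real API.

```python
# Hypothetical sketch of a thin wrapper such as Bio/TreeIO/NexusIO.py.
# LegacyTree and Clade are illustrative stand-ins, not Biopython classes.


class LegacyTree:
    """Parser-specific tree: nodes keyed by integer id, children listed by id."""

    def __init__(self, root, nodes):
        self.root = root    # id of the root node
        self.nodes = nodes  # id -> (label, branch_length, [child ids])


class Clade:
    """Shared tree node: nested objects instead of an id lookup table."""

    def __init__(self, name=None, branch_length=None, clades=None):
        self.name = name
        self.branch_length = branch_length
        self.clades = clades or []


def convert(legacy, node_id=None):
    """Recursively rebuild a LegacyTree as nested Clade objects."""
    if node_id is None:
        node_id = legacy.root
    label, length, children = legacy.nodes[node_id]
    return Clade(label, length, [convert(legacy, child) for child in children])


def parse(legacy_trees):
    """Wrapper entry point: yield one converted tree per legacy tree."""
    for legacy in legacy_trees:
        yield convert(legacy)
```

Because the conversion happens entirely inside the wrapper, the
existing parser and its data structures are left untouched, which is
what keeps backwards compatibility intact in the short term.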

Ideally we can get Frank and/or Cymon to look at this, as Bio.Nexus
is their code (rather than Nick or Eric, who have more than enough
work to do for their projects).

[There are parallels here to how I did Bio.SeqIO (and AlignIO),
often wrapping existing parsers by turning their format specific
data structures into the common SeqRecord (or Alignment)
objects. For example, to read/write alignments in NEXUS format
Bio.AlignIO just calls Bio.Nexus internally.]

Peter
