[Biopython-dev] Fwd: [biopython] Newick parser (#156)

Peter Cock p.j.a.cock at googlemail.com
Fri Feb 8 15:21:46 UTC 2013


Eric,

Could you take a look at this please?

Thanks,

Peter

---------- Forwarded message ----------
From: Ben Morris <notifications at github.com>
Date: Fri, Feb 8, 2013 at 3:12 PM
Subject: [biopython] Newick parser (#156)
To: biopython/biopython <biopython at noreply.github.com>


In light of three issues with the Newick parser:

https://redmine.open-bio.org/issues/3409
https://redmine.open-bio.org/issues/3386
https://redmine.open-bio.org/issues/3407

this is a rewrite of the parser from scratch. It supports quoted node
labels and can handle support values either as they were previously handled
or from square-bracketed comments, as requested by Arlin. Additionally,
it's consistently quite fast:

[image: newick_parse_times]<https://f.cloud.github.com/assets/544977/139616/fac0df38-71fe-11e2-91a8-a95ba7c6340b.png>

The unit tests still pass with these changes, and I'm now able to parse
trees that previously raised exceptions.
------------------------------
You can merge this Pull Request by running

  git pull https://github.com/bendmorris/biopython newick

Or view, comment on, or merge it at:

  https://github.com/biopython/biopython/pull/156

Commit Summary

   - A more efficient implementation of a Newick parser (linear time vs.
   quadratic) that makes only a single pass over the text and handles quoted
   labels correctly.
   - Implementing support values and fixing an issue when external parentheses
   are missing.
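
For readers who want a feel for the single-pass approach described in the commit summary, here is a minimal sketch. It is not the code from Bio/Phylo/NewickIO.py in this pull request: the token pattern, the `tokenize` and `parse` names, and the dict-based node layout are all assumptions made for illustration. It reads quoted labels, treats a numeric square-bracketed comment as a support value, and visits each token once. Unlike the pull request, it does not handle trees whose external parentheses are missing.

```python
import re

# Hypothetical token pattern for illustration; not taken from NewickIO.py.
TOKEN_RE = re.compile(r"""
    \s*(
      '(?:[^']|'')*'        # quoted label; a doubled quote is an escaped quote
    | \[[^\]]*\]            # square-bracketed comment
    | [(),:;]               # structural characters
    | [^\s(),:;\[\]']+      # unquoted label or number
    )""", re.VERBOSE)


def tokenize(text):
    """Yield tokens in one left-to-right pass over the text."""
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError("unexpected character at position %d" % pos)
        yield match.group(1)
        pos = match.end()


def new_node():
    # Assumed node layout: a plain dict rather than a Bio.Phylo clade object.
    return {"name": None, "length": None, "support": None, "children": []}


def parse(text):
    """Build a tree of nested dicts from a Newick string."""
    root = new_node()
    current = root
    stack = []              # ancestors of the node being filled in
    expect_length = False   # True right after a ':' token
    for tok in tokenize(text):
        if tok == "(":
            child = new_node()
            current["children"].append(child)
            stack.append(current)
            current = child
        elif tok == ",":
            if not stack:
                raise ValueError("comma outside parentheses")
            current = new_node()
            stack[-1]["children"].append(current)
        elif tok == ")":
            if not stack:
                raise ValueError("unbalanced parentheses")
            current = stack.pop()
        elif tok == ":":
            expect_length = True
        elif tok == ";":
            break
        elif tok.startswith("["):
            # Assumption: a numeric comment is a support value; others are skipped.
            try:
                current["support"] = float(tok[1:-1])
            except ValueError:
                pass
        elif expect_length:
            current["length"] = float(tok)
            expect_length = False
        elif tok.startswith("'"):
            current["name"] = tok[1:-1].replace("''", "'")
        else:
            current["name"] = tok
    if stack:
        raise ValueError("unbalanced parentheses")
    return root


tree = parse("(A:0.1,'B c':0.2[95],(C,D)0.9:0.3);")
```

Because the stack holds only the ancestors of the current node and each token is handled once, the work grows linearly with the length of the input. In this sketch an internal label such as `0.9` is kept as a plain name; deciding whether it is a support value is left to the caller.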

File Changes

   - *M* Bio/Phylo/NewickIO.py <https://github.com/biopython/biopython/pull/156/files#diff-0> (198)

Patch Links:

   - https://github.com/biopython/biopython/pull/156.patch
   - https://github.com/biopython/biopython/pull/156.diff
