[Biopython-dev] New Newick parser in Bio.Phylo

Eric Talevich eric.talevich at gmail.com
Mon Feb 11 02:11:54 UTC 2013


Hi Ben,

I've noticed a couple of new characteristics of the Newick parser that I
have questions about.

1. There is no longer a way to tell the parser to treat internal node
labels as confidence values. Lots of files in the wild do record the
support values here, including those generated by RAxML, PhyML, FastTree
and MrBayes, so I'd like to restore this option, and perhaps make it the
default. I think the condition is:

    if not (self.values_are_confidence or self.comments_are_confidence
            or current_clade.is_terminal()):
        # parse confidence from node label

Is there an easy way to add this option to the parser? I'm trying to get
this to work in the "else" clause in parse_tree, where unquoted node labels
are handled.
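
A minimal sketch of that condition, written as a standalone helper rather
than the actual parse_tree code. The function name split_label and its
arguments are hypothetical, chosen only for illustration; the two flags
mirror the attribute names in the condition above. Rejecting "nan" and
"inf" is an extra assumption, so that clades with those names are not
mistaken for support values.

```python
import math


def split_label(label, is_terminal, values_are_confidence=False,
                comments_are_confidence=False):
    """Return (name, confidence) for an unquoted node label.

    Hypothetical helper: an internal node label is read as a confidence
    value only when no other source of confidence has been requested.
    """
    if not (values_are_confidence or comments_are_confidence or is_terminal):
        try:
            value = float(label)
        except (TypeError, ValueError):
            value = None  # not numeric, so keep it as the clade name
        # Assumption: "nan" and "inf" are treated as names, not numbers.
        if value is not None and math.isfinite(value):
            return None, value
    return label, None
```

With this, an internal label such as "95" becomes a confidence of 95.0,
while the same text on a terminal node stays a name.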


2. Confidence values are required to be between 0.0 and 1.0. Also, support
values recorded as integers are treated as percentages and divided by 100
automatically. The phyloXML spec doesn't have this range requirement. RAxML
scales bootstraps to 100, but PhyML records the raw number of supporting
bootstrap runs (e.g. supports out of 1000 if there were 1000 bootstrap
replicates). So, I'd prefer to leave the confidence values as they are,
requiring only that they be numeric. Thoughts?
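
To illustrate the "numeric only" rule, here is a small sketch that converts
a support string to a number without any range check or division by 100.
The name parse_confidence is hypothetical, and keeping integers as int
(rather than converting everything to float) is an assumption made here,
not something the parser is known to do.

```python
def parse_confidence(text):
    """Convert a support string to a number, leaving its scale untouched.

    Raises ValueError if the text is not numeric. No range check and no
    rescaling is applied, so 95, 873 and 0.95 all pass through as written.
    """
    try:
        return int(text)    # e.g. PhyML-style raw counts or 0-100 values
    except ValueError:
        return float(text)  # e.g. proportions; raises if non-numeric
```

Under this rule a RAxML-style "100", a PhyML-style "873" out of 1000
replicates, and a proportion such as "0.95" are all accepted unchanged.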


Thanks,
Eric


