[Biopython] Bug in Phylo.write('phyloxml')?

Eric Talevich eric.talevich at gmail.com
Mon Dec 26 20:34:28 UTC 2011


Here's the fix:
https://github.com/biopython/biopython/commit/85c3f5d35a0ae349cc35ea94754037f5c329b9b3

The 'name' attribute for clades already defaulted to None instead of the
empty string (""), so I think it was just an oversight that the Tree 'name'
attribute defaulted to "" before this patch.

That should fix your problem, Jon. I'm not so concerned about the empty
<name /> tag crashing other tree viewers because it's valid XML, it fits
the phyloXML spec, and Archaeopteryx accepts it. This patch should keep
that troublesome attribute value from arising inadvertently, though.
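
For anyone wondering why "" and None end up serialized differently, here is a rough sketch of the general pattern. It is not Biopython's actual writer code: the function name phylogeny_to_xml is made up for illustration, namespaces are omitted, and it uses only the standard library's ElementTree. It assumes a serializer that skips an attribute only when it is None, so an empty string still produces an (empty) element.

```python
import xml.etree.ElementTree as ET


def phylogeny_to_xml(name):
    """Toy serializer (hypothetical; not Biopython's PhyloXML writer)."""
    phylogeny = ET.Element("phylogeny", rooted="false")
    # Only None is treated as "no name". An empty string is still a value,
    # so it produces a <name> element with no text in it.
    if name is not None:
        ET.SubElement(phylogeny, "name").text = name
    ET.SubElement(phylogeny, "clade")
    return ET.tostring(phylogeny, encoding="unicode")


# With "" the output contains an empty name element; with None it has none.
print(phylogeny_to_xml(""))
print(phylogeny_to_xml(None))
```

Under that assumption, defaulting the tree name to None rather than "" is enough to keep the empty element out of the output.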


Cheers & best wishes,
Eric


On Mon, Dec 19, 2011 at 7:44 AM, Jon Sanders <jsanders at oeb.harvard.edu> wrote:

> Yup, that seems to work just fine, and it's way easier and safer than
> writing a script to open the tree file and kill the tag, which is what I
> was going to do. Thanks again!
>
> -j
>
>
> On Mon, Dec 19, 2011 at 12:31 AM, Eric Talevich <eric.talevich at gmail.com> wrote:
>
>> Alrighty,
>>
>> I think the main problem in your case is that the Newick parser creates
>> trees with the 'name' attribute set to the empty string "" instead of None.
>> When converting to PhyloXML, that value stays in place and gets serialized
>> as an empty element. The author of the phyloXML spec is the author of
>> Archaeopteryx, so all of that makes sense.
>>
>> For your case, Jon, and others having this problem: before writing a tree
>> as phyloXML, set the tree name to None if it's not already named.
>>
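>> from Bio import Phylo
>>
>> # An empty-string name is serialized as an empty <name /> element.
>> if tree.name == "":
>>     tree.name = None
>> Phylo.write(tree, 'example.xml', 'phyloxml')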
>>
>>
>> For the future, I guess the best approach is to change the Newick parser
>> to set the tree name to None instead of "" by default. Any issues with that
>> solution?
>>
>> -Eric
>>
>>
>>
>> On Sat, Dec 17, 2011 at 9:08 PM, Jon Sanders <jsanders at oeb.harvard.edu> wrote:
>>
>>> Archaeopteryx is the one program that read the trees fine. I also tried
>>> HyperTree, TreeGraph2, and Treevolution, which failed.
>>>
>>> -j
>>>
>>>
>>> On Sun, Dec 18, 2011 at 12:03 AM, Eric Talevich <eric.talevich at gmail.com> wrote:
>>>
>>>> On Fri, Dec 16, 2011 at 1:44 PM, Jon Sanders <jsanders at oeb.harvard.edu> wrote:
>>>>
>>>>> The phyloXML trees I export from Biopython (with confidence values,
>>>>> thanks Eric!) don't open in most phyloXML tree viewers.
>>>>>
>>>>> The problem seems to be a spurious <name /> tag at the beginning of the
>>>>> tree.
>>>>>
>>>>> <phy:phyloxml xmlns:phy="http://www.phyloxml.org">
>>>>>  <phy:phylogeny rooted="false">
>>>>>    <phy:name />
>>>>>    <phy:clade>
>>>>>
>>>>> If I delete this tag they open fine.
>>>>>
>>>>>
>>>> Hi Jon,
>>>>
>>>> Thanks for reporting this. I'll check the spec to see if an empty
>>>> 'name' tag is even valid. Can you give me the name of one or two programs
>>>> that are supposed to handle phyloXML, but don't like this input? Is
>>>> Archaeopteryx one of them?
>>>>
>>>> -Eric
>>>>
>>>
>>>
>>>
>>> --
>>> "If you hold a cat by the tail you learn things you cannot learn any
>>> other way."
>>>                          --Mark Twain
>>>
>>>
>>
>
>
> --
> "If you hold a cat by the tail you learn things you cannot learn any other
> way."
>                          --Mark Twain
>
>


