[Biopython] Nexus parsing

Mon Feb 9 16:31:50 UTC 2015

On Mon, Feb 9, 2015 at 2:54 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Mon, Feb 9, 2015 at 1:37 PM, Tiago Antao <tra at popgen.net> wrote:
>> Hi,
>>
>> I am trying to parse a (heavily annotated) nexus file with Bio.Phylo.
>> The file is from a paper in Science
>> http://www.sciencemag.org/content/345/6202/1369/suppl/DC1 available here
>> http://www.sciencemag.org/content/suppl/2014/08/27/science.1259657.DC1/1259657_file_s2.zip
>> and called
>> trees/ebola.raxml.tree
>>
>> I am able to parse this with DendroPy just fine, but not with Bio.Phylo
>>
>> The error that I get is:
>>
>> from Bio import Phylo
>> hdl = Phylo.read('trees/ebola.raxml.tree', 'nexus')
>>
>> /home/tra/Dropbox/soft/biopython/Bio/Nexus/Trees.pyc in _get_values(self, text)
>>     161             if nc_end == -1:
>>     162                 raise TreeError('Error in tree description: Found %s without matching %s'
>> --> 163                                 % (NODECOMMENT_START, NODECOMMENT_END))
>>     164             nodecomment = text[nc_start:nc_end + 1]
>>     165             text = text[:nc_start] + text[nc_end + 1:]
>>
>> TreeError: Error in tree description: Found [& without matching ]
>>
>>
>> Any ideas would be most appreciated, thanks.
>> Tiago
>
> That sounds like a nice reproducible test case. Can you find the
> mismatched tags in the raw data? My guess without checking is
> this is due to unexpected line wrapping.
>
> Maybe file this on GitHub?
>
> Peter

See https://github.com/biopython/biopython/issues/471
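
In case it helps with tracking down the mismatched tags, here is a
minimal sketch for checking the line-wrapping guess. It is not part of
Biopython: the helper name find_unclosed_comments is hypothetical, and
it assumes that a node comment opens with "[&" and should close with
"]" on the same line. It only reports where that assumption fails; it
does not say whether the file or the parser is at fault.

```python
def find_unclosed_comments(lines):
    """Return (line_number, column) for each '[&' not closed on its own line.

    Line numbers are 1-based and columns are 0-based. This is a plain
    text scan, not a Nexus parser, so treat the hits as hints only.
    """
    hits = []
    for lineno, line in enumerate(lines, start=1):
        pos = line.find("[&")
        while pos != -1:
            end = line.find("]", pos + 2)
            if end == -1:
                # Comment opened here but no closing bracket on this line.
                hits.append((lineno, pos))
                break
            pos = line.find("[&", end + 1)
    return hits


# Example use on the file from this thread (path as given above):
# with open("trees/ebola.raxml.tree") as handle:
#     for lineno, col in find_unclosed_comments(handle):
#         print("line %d, column %d" % (lineno, col))
```

An empty result would suggest that line wrapping is not the cause and
that the problem lies elsewhere in the comment text.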

Regards,

Peter