[Biopython-dev] Python 3.4 - UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3

Peter Cock p.j.a.cock at googlemail.com
Wed Apr 30 10:02:52 UTC 2014


Hi all,

One of the 64-bit Linux buildslaves is showing a Unicode problem
on Python 3.4 in test_SeqIO_SeqXML.py and test_Phylo.py -
perhaps locale-related?

e.g. http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.4/builds/9/steps/shell/logs/stdio

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
797: ordinal not in range(128)

Tiago, this is on your new docker3 buildslave - can you reproduce
this error 'by hand' and look at the locale settings please?
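
One way to inspect the locale settings from inside Python is sketched below. The helper name describe_locale is hypothetical, not part of Biopython or the test suite. It only reports what the interpreter sees and makes no assumption about how the buildslave is configured.

```python
import locale
import os
import sys


def describe_locale():
    """Collect the settings that influence Python 3's default text encoding.

    Hypothetical helper for illustration only.
    """
    return {
        # Encoding used by open() in text mode when encoding= is omitted.
        "preferred_encoding": locale.getpreferredencoding(False),
        "filesystem_encoding": sys.getfilesystemencoding(),
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        # Environment variables that select the locale on POSIX systems.
        "LC_ALL": os.environ.get("LC_ALL"),
        "LC_CTYPE": os.environ.get("LC_CTYPE"),
        "LANG": os.environ.get("LANG"),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_locale().items()):
        print("%s = %r" % (key, value))
```

If the preferred encoding comes back as an ASCII alias (on Linux this is often reported as ANSI_X3.4-1968 when no locale is set), that would be consistent with the 'ascii' codec error above, though it would not prove the locale is the cause.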

These seem to be the "problem" files:

$ hexdump -C PhyloXML/phyloxml_examples.xml | grep " c3 "
00003320  20 5a c3 bc 72 69 63 68  3c 2f 64 65 73 63 3e 0d  | Z..rich</desc>.|

$ hexdump -C SeqXML/rna_example.xml | grep " c3 "
00000310  3c 64 65 73 63 72 69 70  74 69 6f 6e 3e c3 a5 c3  |<description>...|
00000320  85 c3 bc c3 b6 c3 96 c3  9f c3 b8 c3 a4 c2 a2 c2  |................|
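
As a cross-check on the grep, which only matches the byte c3 when it has a space on both sides, here is a small sketch that lists every non-ASCII byte in a buffer. The function name non_ascii_offsets is hypothetical, and the sample bytes are copied from the first hexdump row above rather than read from the file.

```python
def non_ascii_offsets(data):
    """Return (offset, byte value) pairs for every byte >= 0x80 in data."""
    return [(i, b) for i, b in enumerate(bytearray(data)) if b >= 0x80]


# The 16 bytes shown at offset 0x00003320 of PhyloXML/phyloxml_examples.xml.
sample = bytes.fromhex("205ac3bc726963683c2f646573633e0d")
base = 0x3320

for offset, value in non_ascii_offsets(sample):
    print("0x%08x  0x%02x" % (base + offset, value))
```

To scan a whole file, the same function could be applied to the result of open(path, "rb").read(), with base set to 0.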

These are deliberate tests of accented characters (and other non-ASCII text)
in a Unicode description. Both XML files say they are using UTF-8 as the
encoding.
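
To illustrate why these bytes matter, the sketch below decodes the " Z..rich</desc>" bytes from the hexdump in two ways. It assumes nothing about where in Biopython the decoding actually happens. It only shows that the same bytes are valid UTF-8 but are rejected by the ASCII codec, and that passing encoding= explicitly to open() avoids depending on the locale default.

```python
import os
import tempfile

# Bytes copied from the hexdump of PhyloXML/phyloxml_examples.xml.
raw = b" Z\xc3\xbcrich</desc>"

# Decoding as UTF-8, the encoding the XML file declares, succeeds.
text = raw.decode("utf-8")

# Decoding as ASCII fails on the first byte >= 0x80.
try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as err:
    ascii_error = str(err)

# In text mode, open() falls back to the locale's preferred encoding
# unless encoding= is given, so an explicit value avoids that dependence.
fd, path = tempfile.mkstemp(suffix=".xml")
try:
    with os.fdopen(fd, "wb") as handle:
        handle.write(raw)
    with open(path, encoding="utf-8") as handle:
        from_file = handle.read()
finally:
    os.remove(path)

print(repr(text))
print(ascii_error)
```

The reported position differs from the 797 in the buildslave log only because this sample is a short excerpt rather than the full input.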

Thanks,

Peter


