[Biopython-dev] Unicode and encoding experts? Schäffer in BLAST XML

Shyam Saladi saladi at caltech.edu
Mon Jun 20 17:17:01 UTC 2016


I'm not sure this is the cause, but could you try specifying the
encoding explicitly? Setting LANG=C has no effect on my machine, so I
can't reproduce the error or test this myself.

LANG=C python3.5 -c "h = open('xml_2218_blastp_002.xml', encoding='utf-8');
print(h.read(400))"
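
For anyone who also can't trigger it via LANG=C, here is a minimal
sketch of the same idea that does not depend on the shell locale. It
assumes the failure comes from open() picking up an ASCII default
encoding; the sample line and its <reference> tag are made up for
illustration and are not taken from xml_2218_blastp_002.xml.

```python
import locale
import os
import tempfile

# Hypothetical stand-in for the BLAST XML file: one line of UTF-8 text
# containing the non-ASCII name (the tag name is invented for this sketch).
sample = '<reference>Alejandro A. Schäffer</reference>\n'

with tempfile.NamedTemporaryFile('wb', suffix='.xml', delete=False) as tmp:
    tmp.write(sample.encode('utf-8'))
    path = tmp.name

try:
    # The encoding open() falls back to when none is passed explicitly.
    print('locale default:', locale.getpreferredencoding(False))

    # Forcing the 'ascii' codec (the one named in the traceback) should
    # fail on the first byte of the UTF-8 encoded 'ä'.
    ascii_error = None
    try:
        with open(path, encoding='ascii') as h:
            h.read()
    except UnicodeDecodeError as err:
        ascii_error = err
        print('ascii failed:', err)

    # Passing the encoding explicitly reads the same bytes without error.
    with open(path, encoding='utf-8') as h:
        text = h.read()
    print(text)
finally:
    os.remove(path)
```

If the explicit encoding='utf-8' read works where the default one
fails, that would point at the locale default rather than the file.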

Thanks,
Shyam

On Mon, Jun 20, 2016 at 6:51 AM, Peter Cock <p.j.a.cock at googlemail.com>
wrote:

> Hello all,
>
> Do any of you have first-hand experience of Unicode encoding
> issues with XML parsing? I'm hoping for some help with this issue:
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
> 313: ordinal not in range(128)
>
> https://github.com/biopython/biopython/issues/855
>
> I believe this problem stems from how Alejandro Schäffer is
> *sometimes* written in the BLAST XML output:
>
> https://github.com/biopython/biopython/issues/855#issuecomment-226276235
>
> Thanks,
>
> Peter