[Biopython-dev] [Bug 2938] New: Bio.Entrez.read() returns empty string for HTML (not an error)

If given HTML instead of XML, Bio.Entrez.read() returns an empty string. I
would have expected a helpful error message.


>>> from Bio import Entrez
>>> handle = Entrez.efetch(db="pubmed", id="17206916")
>>> handle.readline()
'<html><head><title>PmFetch response</title></head><body>\n'

Try parsing this HTML as if it were XML ...

>>> handle = Entrez.efetch(db="pubmed", id="17206916")
>>> "" == Entrez.read(handle)
True

i.e. Entrez.read is returning an empty string.
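
For illustration, here is a rough sketch of the kind of up-front check that could produce a helpful error instead. It is only a sketch under stated assumptions: check_not_html is a hypothetical helper, not part of Bio.Entrez, and it assumes it is acceptable to read the whole response into memory before parsing. It uses in-memory handles so it can be run without network access.

```python
import io


def check_not_html(handle):
    """Hypothetical helper (not part of Bio.Entrez).

    Reads all data from the handle and raises ValueError if it looks
    like HTML. Otherwise returns a new in-memory handle holding the
    same data, so it can still be passed on to a parser.
    """
    data = handle.read()
    # Handles may yield bytes or text; compare on a text copy only.
    if isinstance(data, bytes):
        text = data.decode("utf-8", errors="replace")
    else:
        text = data
    start = text.lstrip().lower()
    if start.startswith(("<html", "<!doctype html")):
        raise ValueError("Expected XML but the data looks like HTML")
    if isinstance(data, bytes):
        return io.BytesIO(data)
    return io.StringIO(data)


# Example using the first line of the response shown above.
html = io.StringIO(
    "<html><head><title>PmFetch response</title></head><body>\n"
)
try:
    check_not_html(html)
except ValueError as err:
    print("Error:", err)
```

With something like this in place, the result of check_not_html(handle) would be passed to the parser, and HTML input would fail loudly rather than giving an empty string.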

Problem spotted based on a mailing list query, see this thread:

