[Biopython] [eFetch] doesn't work with NLMcatalog
c.buhtz at posteo.jp
Mon Dec 7 18:14:14 UTC 2015
There is a problem parsing the XML that eFetch returns for the
nlmcatalog database. I don't know why it happens, or how I could
diagnose it further (e.g. by inspecting the handle).
The URL is shown at the end of the session below. Opening it directly
in a browser gives a nice result.
>>> h = Entrez.efetch(db='nlmcatalog', id='7508686', retmode='xml')
>>> r = Entrez.read(h)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/dist-packages/Bio/Entrez/__init__.py", line 421, in read
    record = handler.read(handle)
  File "/usr/local/lib/python3.4/dist-packages/Bio/Entrez/Parser.py", line 215, in read
    self.parser.ParseFile(handle)
  File "../Modules/pyexpat.c", line 405, in StartElement
  File "/usr/local/lib/python3.4/dist-packages/Bio/Entrez/Parser.py", line 350, in startElementHandler
    raise ValidationError(name)
Bio.Entrez.Parser.ValidationError: Failed to find tag
'NLMCatalogRecordSet' in the DTD. To skip all tags that are not
represented in the DTD, please call Bio.Entrez.read or Bio.Entrez.parse
with validate=False.
>>> r = Entrez.read(h, validate=False)
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/Bio/Entrez/Parser.py", line 215, in read
    self.parser.ParseFile(handle)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 1,
column 5

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/dist-packages/Bio/Entrez/__init__.py", line 421, in read
    record = handler.read(handle)
  File "/usr/local/lib/python3.4/dist-packages/Bio/Entrez/Parser.py", line 225, in read
    raise NotXMLError(e)
Bio.Entrez.Parser.NotXMLError: Failed to parse the XML data (not
well-formed (invalid token): line 1, column 5). Please make sure that
the input data are in XML format.
>>> h.url
'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?email=x&tool=biopython&id=7508686&db=nlmcatalog&retmode=xml'
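For further diagnosis, the raw response behind that URL could be
inspected without going through the Entrez parser. What follows is only
a minimal sketch, assuming the standard library's urllib and
xml.etree.ElementTree are acceptable here; the helper names fetch_raw
and summarize_xml are hypothetical and not part of Biopython. Note also
that the second Entrez.read call above reuses the handle h, which the
first call may already have partly consumed; that alone might explain
the "invalid token" error at line 1, column 5.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.request import urlopen


def fetch_raw(url):
    """Download the response body as bytes (needs network access)."""
    with urlopen(url) as response:
        return response.read()


def summarize_xml(raw):
    """Return the root tag and a count of the root's direct child tags.

    Raises xml.etree.ElementTree.ParseError if `raw` is not
    well-formed XML, which is itself a useful diagnostic.
    """
    root = ET.fromstring(raw)
    return root.tag, Counter(child.tag for child in root)


# Possible usage, with the URL reported by h.url above:
#
#   url = ("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
#          "?email=x&tool=biopython&id=7508686&db=nlmcatalog&retmode=xml")
#   raw = fetch_raw(url)
#   print(raw[:200])           # look at the first bytes of the response
#   print(summarize_xml(raw))  # root tag and its child tags
```

If the response parses cleanly this way, the server output is
well-formed and the problem is more likely on the parser or DTD side.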
If this is a bug, or something that would take a while to fix, is
there a quick workaround?
--
GnuPGP-Key ID 0751A8EC