[Biopython] Bio.Entrez/Medline DTD problems - missing DTD nlmmedlinecitationset_100301.dtd
Guy Eakin
guyeakin at gmail.com
Thu Jul 8 11:28:17 UTC 2010
Peter,
Many thanks.
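This is the query that generated the nlmmedlinecitationset_100301.dtd
error:

    from Bio import Entrez

    # Search PubMed (last 7 days by Entrez date) and keep results on the history server.
    search_handle = Entrez.esearch(db="pubmed", term="glaucom*",
                                   retmax=2, usehistory="y",
                                   reldate=7, datetype="edat")
    # WebEnv and QueryKey come from the parsed search result.
    search_results = Entrez.read(search_handle)
    webenv = search_results["WebEnv"]
    query_key = search_results["QueryKey"]
    fetch_handle = Entrez.efetch(db="pubmed", retmode="xml", rettype="medline",
                                 webenv=webenv, query_key=query_key)
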
You will also want to add pubmed_100301.dtd to your repository. I do not
have the query that generated the XML that depends on it, but I got a
separate error related to its absence yesterday. Oddly, I was able to
download the "hidden" pubmed_100301.dtd, but could not replicate the error.
All subsequent errors pointed to the nlmmedlinecitationset_100301.dtd file,
which I could not locate until this morning. Perhaps it was only recently
posted to the site. Either way, thanks for the confirmation that I was on
the right track.
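
For anyone else who hits this before the next release, here is a rough sketch of fetching a missing DTD by hand. It only assumes the URL Peter gave in his reply quoted below; the helper names (dtd_url, save_dtd), the fetch hook, and the idea that pubmed_100301.dtd lives in the same folder are my own assumptions, and the target directory is whatever folder your Biopython install keeps its Entrez DTDs in.

```python
import os
import urllib.request

# Base URL taken from Peter's reply; it may not stay valid over time.
DTD_BASE_URL = "http://eutils.ncbi.nlm.nih.gov/corehtml/query/DTD/"


def dtd_url(name, base=DTD_BASE_URL):
    """Build the full URL for a DTD file name such as 'pubmed_100301.dtd'."""
    return base.rstrip("/") + "/" + name


def save_dtd(name, target_dir, fetch=None):
    """Download one DTD into target_dir and return the path written.

    `fetch` is a hypothetical hook (URL -> bytes) so the function can be
    tried without network access; by default it uses urllib.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url) as response:
                return response.read()
    os.makedirs(target_dir, exist_ok=True)
    path = os.path.join(target_dir, name)
    with open(path, "wb") as handle:
        handle.write(fetch(dtd_url(name)))
    return path
```

Calling save_dtd("nlmmedlinecitationset_100301.dtd", some_dir) with the directory of your choice would then write the file there; whether Biopython picks it up depends on that directory being the one the parser searches.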
regards,
guy
On Thu, Jul 8, 2010 at 3:42 AM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> On Thu, Jul 8, 2010 at 1:52 AM, Guy Eakin <guyeakin at gmail.com> wrote:
> > I am learning Biopython and seem to be having trouble parsing
> > efetch-generated XML.
> >
> > Maybe I am confused here, but I can't for the life of me get my XML to
> > parse correctly, and it seems to be coming up with a missing DTD error
> > using both Medline.parse and Entrez.parse (traceback for Medline below).
> >
> > nlmmedlinecitationset_100301.dtd and pubmed_100301.dtd seem to be missing
> > from my Biopython installation, and unavailable from the following NCBI
> > sites:
> >
> > http://www.ncbi.nlm.nih.gov/dtd/ or
> > http://eutils.ncbi.nlm.nih.gov/entrez/query/DTD/
> >
> > My apologies if this is user error; I do not see reference to this DTD
> > issue in the archives, so am posting the incident. Is this just bad luck
> > during my learning curve, or am I missing something conceptual here?
>
> The problem is with the NCBI "hiding" the file by not showing the raw
> contents of that folder, but just an HTML page with a partial list. You
> need this file:
>
>
> http://eutils.ncbi.nlm.nih.gov/corehtml/query/DTD/nlmmedlinecitationset_100301.dtd
>
> I've added this to our repository so the next version of Biopython will
> include it. Please let us know if anything else is missing - what was
> the Entrez request you used to get the XML using this DTD file?
>
> Regards,
>
> Peter
>