<div dir="ltr"><div><div><div>Hi Peter,<br><br></div>It seems that it was indeed a temporary error. Thanks for your help!<br><br></div>Best,<br></div>Lev<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Sep 9, 2015 at 4:50 AM, Peter Cock <span dir="ltr">&lt;<a href="mailto:p.j.a.cock@googlemail.com" target="_blank">p.j.a.cock@googlemail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Lev,<br>

<br>

Which version of Biopython do you have, and which GI number(s) fail?<br>

<br>

The very fact the problem tag was &quot;Error&quot; suggests it was actually<br>

an error message, not a sequence record - perhaps a temporary error?<br>

<br>
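If the failure is transient, one option is to retry the request a few times before giving up. This is a minimal sketch, assuming the temporary error surfaces as an exception; fetch_with_retry and its retries/delay defaults are hypothetical names and values, not part of Biopython:<br>

```python
import time


def fetch_with_retry(fetch, gi, retries=3, delay=1.0, sleep=time.sleep,
                     transient=(Exception,)):
    """Call fetch(gi), retrying with exponential backoff on failure.

    fetch is any callable taking an id, for example a small wrapper
    around Entrez.efetch followed by Entrez.read (an assumption here).
    """
    attempts = max(1, retries)
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(gi)
        except transient as err:
            last_error = err
            # Wait before the next attempt, but not after the final one.
            if attempt + 1 != attempts:
                sleep(delay * (2 ** attempt))
    # Every attempt failed: re-raise the most recent error.
    raise last_error
```

Narrowing transient to the specific exception types actually seen would avoid retrying on genuine programming errors.<br>

<br>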

This worked for me:<br>

<br>

from Bio import Entrez<br>

Entrez.email = &quot;...&quot;<br>

handle = Entrez.efetch(db=&quot;protein&quot;, id=&quot;12345678&quot;, retmode=&quot;xml&quot;)<br>

record = Entrez.read(handle, validate=True)<br>

handle.close()<br>

print(record)<br>

<br>

Using some id values like &quot;1&quot; could give an &quot;empty&quot; XML record,<br>

which to me looks like an NCBI bug:<br>

<br>

&lt;?xml version=&quot;1.0&quot;?&gt;<br>

 &lt;!DOCTYPE GBSet PUBLIC &quot;-//NCBI//NCBI GBSeq/EN&quot;<br>

&quot;<a href="http://www.ncbi.nlm.nih.gov/dtd/NCBI_GBSeq.dtd" rel="noreferrer" target="_blank">http://www.ncbi.nlm.nih.gov/dtd/NCBI_GBSeq.dtd</a>&quot;&gt;<br>

 &lt;GBSet&gt;<br>

<br>

&lt;/GBSet&gt;<br>

<br>

This is parsed as [] which is reasonable (empty list).<br>

<br>

Other values like &quot;0&quot; and &quot;-1&quot; give an HTTP Error 400: Bad Request<br>

(which is good - a nice clear and explicit error).<br>

<br>
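The three outcomes described above (a parsed record, an empty list, and an HTTP 400) can be told apart in calling code. This is a minimal sketch, assuming the HTTP failure is raised as urllib's HTTPError; classify_fetch and entrez_fetch are hypothetical helper names, not Biopython functions:<br>

```python
from urllib.error import HTTPError


def entrez_fetch(gi):
    """Fetch and parse one protein record (needs Biopython and network)."""
    # Imported here so the rest of the sketch runs without Biopython.
    # Assumes Entrez.email has already been set by the caller.
    from Bio import Entrez
    handle = Entrez.efetch(db="protein", id=gi, retmode="xml")
    try:
        return Entrez.read(handle)
    finally:
        handle.close()


def classify_fetch(fetch, gi):
    """Return ("ok", records), ("empty", []) or ("bad_request", None)."""
    try:
        records = fetch(gi)
    except HTTPError as err:
        if err.code == 400:
            # Ids such as "0" or "-1" are rejected outright by the server.
            return ("bad_request", None)
        raise
    if not records:
        # An empty GBSet document parses to an empty list.
        return ("empty", [])
    return ("ok", records)
```

With network access this would be called as classify_fetch(entrez_fetch, gi).<br>

<br>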

See also:<br>

<br>

Peter<br>

<div class="HOEnZb"><div class="h5"><br>

<br>

On Fri, Sep 4, 2015 at 8:16 PM, Lev Tsypin &lt;<a href="mailto:ltsypin@uchicago.edu">ltsypin@uchicago.edu</a>&gt; wrote:<br>

&gt; Hi Peter,<br>

&gt;<br>

&gt; This is me trying to get protein sequences from the protein database. I have<br>

&gt; a gi code in the variable &#39;gi&#39; that I pass into the Entrez.efetch function.<br>

&gt; Specifically, I use:<br>

&gt;<br>

&gt;         handle = Entrez.efetch(db=&#39;protein&#39;, id=gi, retmode=&#39;xml&#39;)<br>

&gt;         record = Entrez.read(handle)<br>

&gt;<br>

&gt; Best,<br>

&gt; Lev<br>

&gt;<br>

&gt; On Fri, Sep 4, 2015 at 11:12 AM, Peter Cock &lt;<a href="mailto:p.j.a.cock@googlemail.com">p.j.a.cock@googlemail.com</a>&gt;<br>

&gt; wrote:<br>

&gt;&gt;<br>

&gt;&gt; Hi Lev,<br>

&gt;&gt;<br>

&gt;&gt; Which database was this with? Each has somewhat different XML behaviour.<br>

&gt;&gt;<br>

&gt;&gt; The NCBI have been quite good about versioning the DTD files -<br>

&gt;&gt; normally they add new files rather than edit an existing DTD file. So<br>

&gt;&gt; unless you&#39;ve had a warning from Biopython there should be no reason<br>

&gt;&gt; to download a new DTD file.<br>

&gt;&gt;<br>

&gt;&gt; Peter<br>

&gt;&gt;<br>

&gt;&gt; On Fri, Sep 4, 2015 at 3:44 PM, Lev Tsypin &lt;<a href="mailto:ltsypin@uchicago.edu">ltsypin@uchicago.edu</a>&gt; wrote:<br>

&gt;&gt; &gt; Hi all,<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; I am encountering this error when using Bio.Entrez:<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; Bio.Entrez.Parser.ValidationError: Failed to find tag &#39;Error&#39; in the<br>

&gt;&gt; &gt; DTD. To<br>

&gt;&gt; &gt; skip all tags that are not represented in the DTD, please call<br>

&gt;&gt; &gt; Bio.Entrez.read or Bio.Entrez.parse with validate=False.<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; I&#39;ve found a discussion of the same issue from about a year ago, so I<br>

&gt;&gt; &gt; figure<br>

&gt;&gt; &gt; that the NCBI updated their DTD file in a strange way. I found several<br>

&gt;&gt; &gt; solutions: would you recommend that I download the new DTD file into my<br>

&gt;&gt; &gt; local copy of Biopython or run Entrez.read with validate=False?<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; Best regards,<br>

&gt;&gt; &gt; Lev Tsypin<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; _______________________________________________<br>

&gt;&gt; &gt; Biopython-dev mailing list<br>

&gt;&gt; &gt; <a href="mailto:Biopython-dev@mailman.open-bio.org">Biopython-dev@mailman.open-bio.org</a><br>

&gt;&gt; &gt; <a href="http://mailman.open-bio.org/mailman/listinfo/biopython-dev" rel="noreferrer" target="_blank">http://mailman.open-bio.org/mailman/listinfo/biopython-dev</a><br>

&gt;<br>

&gt;<br>

</div></div></blockquote></div><br></div>