<div dir="ltr"><div><div><div>Hi Peter,<br><br></div>It seems that it was indeed a temporary error. Thanks for your help!<br><br></div>Best,<br></div>Lev<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Sep 9, 2015 at 4:50 AM, Peter Cock <span dir="ltr"><<a href="mailto:p.j.a.cock@googlemail.com" target="_blank">p.j.a.cock@googlemail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Lev,<br>
<br>
Which version of Biopython do you have, and which GI number(s) fail?<br>
<br>
The very fact that the problem tag was "Error" suggests it was actually<br>
an error message, not a sequence record. Perhaps it was a temporary error?<br>
<br>
This worked for me:<br>
<br>
from Bio import Entrez<br>
Entrez.email = "..."<br>
handle = Entrez.efetch(db="protein", id="12345678", retmode="xml")<br>
record = Entrez.read(handle, validate=True)<br>
handle.close()<br>
print(record)<br>
<br>
Using some id values like "1" could give an "empty" XML record,<br>
which to me looks like an NCBI bug:<br>
<br>
<?xml version="1.0"?><br>
<!DOCTYPE GBSet PUBLIC "-//NCBI//NCBI GBSeq/EN"<br>
"<a href="http://www.ncbi.nlm.nih.gov/dtd/NCBI_GBSeq.dtd" rel="noreferrer" target="_blank">http://www.ncbi.nlm.nih.gov/dtd/NCBI_GBSeq.dtd</a>"><br>
<GBSet><br>
<br>
</GBSet><br>
<br>
This is parsed as [] (an empty list), which is reasonable.<br>
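<br>
As a rough sketch of telling these cases apart before parsing, the raw XML<br>
can be inspected with the standard library. The function name<br>
classify_efetch_xml is made up for illustration, and the idea that an error<br>
reply carries an element literally named Error is an assumption based on the<br>
tag named in the ValidationError, not something confirmed from the NCBI<br>
documentation:<br>
<pre>
```python
import xml.etree.ElementTree as ET


def classify_efetch_xml(xml_text):
    """Return "error", "empty" or "records" for a raw efetch XML reply.

    Hypothetical helper, not part of Biopython.
    """
    root = ET.fromstring(xml_text)
    # Assumption: an error reply contains an element named Error,
    # either as the root or somewhere below it.
    if root.tag == "Error" or root.find(".//Error") is not None:
        return "error"
    # A root element with no children, like the bare GBSet shown above.
    if len(root) == 0:
        return "empty"
    return "records"
```
</pre>
Only a reply classified as "records" would then be worth passing on to<br>
Entrez.read.<br>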
<br>
Other values like "0" and "-1" give an HTTP Error 400: Bad Request<br>
(which is good - a nice clear and explicit error).<br>
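<br>
If the failure really is temporary, retrying after a short pause may be<br>
enough. The following is only a sketch under some assumptions: the name<br>
fetch_with_retry and its parameters are invented here and are not part of<br>
Biopython, it assumes Python 3 where HTTP failures surface as<br>
urllib.error.HTTPError, and it treats any 4xx reply (like the 400 above) as<br>
not worth retrying:<br>
<pre>
```python
import time
import urllib.error


def fetch_with_retry(fetch, attempts=3, delay=1.0):
    """Call fetch() and retry only on errors that might be temporary.

    Hypothetical helper: fetch is any zero-argument callable, for example
    a lambda wrapping an Entrez.efetch call.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except urllib.error.HTTPError as err:
            # A 4xx reply such as 400 Bad Request means the request itself
            # is wrong, so it is re-raised at once instead of retried.
            if err.code in range(400, 500) or attempt == attempts:
                raise
        except urllib.error.URLError:
            # Network-level failure: give up only on the last attempt.
            if attempt == attempts:
                raise
        # Wait a little longer after each failed attempt.
        time.sleep(delay * attempt)
```
</pre>
For example, fetch could be a lambda wrapping the Entrez.efetch call shown<br>
earlier, with the parsing done on the handle it returns.<br>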
<br>
See also:<br>
<br>
Peter<br>
<div class="HOEnZb"><div class="h5"><br>
<br>
On Fri, Sep 4, 2015 at 8:16 PM, Lev Tsypin <<a href="mailto:ltsypin@uchicago.edu">ltsypin@uchicago.edu</a>> wrote:<br>
> Hi Peter,<br>
><br>
> This is me trying to get protein sequences from the protein database. I have<br>
> a gi code in the variable 'gi' that I pass into the Entrez.efetch function.<br>
> Specifically, I use:<br>
><br>
> handle = Entrez.efetch(db='protein', id=gi, retmode='xml')<br>
> record = Entrez.read(handle)<br>
><br>
> Best,<br>
> Lev<br>
><br>
> On Fri, Sep 4, 2015 at 11:12 AM, Peter Cock <<a href="mailto:p.j.a.cock@googlemail.com">p.j.a.cock@googlemail.com</a>><br>
> wrote:<br>
>><br>
>> Hi Lev,<br>
>><br>
>> Which database was this with? Each has somewhat different XML behaviour.<br>
>><br>
>> The NCBI have been quite good about versioning the DTD files -<br>
>> normally they add new files rather than edit an existing DTD file. So<br>
>> unless you've had a warning from Biopython there should be no reason<br>
>> to download a new DTD file.<br>
>><br>
>> Peter<br>
>><br>
>> On Fri, Sep 4, 2015 at 3:44 PM, Lev Tsypin <<a href="mailto:ltsypin@uchicago.edu">ltsypin@uchicago.edu</a>> wrote:<br>
>> > Hi all,<br>
>> ><br>
>> > I am encountering this error when using Bio.Entrez:<br>
>> ><br>
>> > Bio.Entrez.Parser.ValidationError: Failed to find tag 'Error' in the<br>
>> > DTD. To skip all tags that are not represented in the DTD, please call<br>
>> > Bio.Entrez.read or Bio.Entrez.parse with validate=False.<br>
>> ><br>
>> > I've found a discussion of the same issue from about a year ago, so I<br>
>> > figure that the NCBI updated their DTD file in a strange way. I found several<br>
>> > solutions: would you recommend that I download the new DTD file into my<br>
>> > local copy of Biopython or run Entrez.read with validate=False?<br>
>> ><br>
>> > Best regards,<br>
>> > Lev Tsypin<br>
>> ><br>
>> > _______________________________________________<br>
>> > Biopython-dev mailing list<br>
>> > <a href="mailto:Biopython-dev@mailman.open-bio.org">Biopython-dev@mailman.open-bio.org</a><br>
>> > <a href="http://mailman.open-bio.org/mailman/listinfo/biopython-dev" rel="noreferrer" target="_blank">http://mailman.open-bio.org/mailman/listinfo/biopython-dev</a><br>
><br>
><br>
</div></div></blockquote></div><br></div>