[Biopython] Entrez.read return value is typed as a string??

Ben O'Loghlin bassbabyface at yahoo.com
Tue Oct 27 15:12:13 UTC 2009


Hi all,

I'm new to Biopython, having spent < 4 hours playing with it, and I'm mighty
impressed with what it can do for me once I get it working. Unfortunately
I've spent about 3.5 of those hours inanely grappling with Entrez.read, so I
turn to more experienced BioPythoneers for assistance.

I'm trying to use Entrez to extract and manipulate records from PubMed, and
I'm stumped. I was expecting the return value of Entrez.read to be a
structured object, and instead it seems to return a string which would
require further parsing to do anything useful with.

I'm not sure if this is the expected output and I have misunderstood, or if
PubMed is just returning results in unexpected formats which break the
parser in Entrez.read, or if Bio just doesn't work after midnight (2:06 am
Australian EST).

Is anyone able/willing to assist? The goal here is to have some way of
extracting individual fields from the returned records, e.g. print out the
Abstract for PMID 17206916.

I'm using Biopython 1.5.2 and Python 2.6.4 on Vista. Script and output
below...

Many thanks in advance,
Ben

#########################################################################
#  Biotest.py
#########################################################################
from Bio import Entrez

PMID = "17206916"
database = "pubmed"

# Fetch the full article details
handle1 = Entrez.efetch(db=database, id=PMID)
full = handle1.read()
print "\nProperties of full record object: "
print type(full)
print
print full[0:180]

# Fetch and print the summary details
handle2 = Entrez.esummary(db=database, id=PMID)
summary = handle2.read()
print "\nProperties of summary record object: "
print type(summary)
print
print summary[0:300]
#########################################################################


#########################################################################
#  Output from Biotest.py
#########################################################################

C:\Data\Personal\Dev\Python\PubMed>c:\Python26\python.exe biotest.py

Properties of full record object:
<type 'str'>

<html><head><title>PmFetch response</title></head><body>
<pre>
Pubmed-entry ::= {
  pmid 17206916,
  medent {
    em std {
      year 2007,
      month 1,
      day 8
    },
    ci

Properties of summary record object:
<type 'str'>

<?xml version="1.0"?>
<!DOCTYPE eSummaryResult PUBLIC "-//NLM//DTD eSummaryResult, 29 October 2004//EN"
 "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eSummary_041029.dtd">
<eSummaryResult>
<DocSum>
        <Id>17206916</Id>
        <Item Name="PubDate" Type="Date">2006</Item>
        <Item Name="EPubDate" Type="Date"></

#########################################################################
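
A possible reading of the output above, offered as a sketch rather than a
confirmed diagnosis: the script calls handle1.read() and handle2.read(), which
are the read method of the file-like handle and return the raw response text.
Bio.Entrez.read is a separate module-level function that takes the handle
itself as its argument. The example below first shows, using only the standard
library, how an esummary-shaped string becomes a structured object once it is
parsed, and then sketches the Biopython route. The names SAMPLE_SUMMARY_XML,
summary_to_dict and fetch_parsed_record, the placeholder e-mail address, and
the choice of retmode="xml" are assumptions made for illustration and are not
taken from the original script.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative sample shaped like the esummary output shown above.
# Only the Id and PubDate fields are copied from that output.
SAMPLE_SUMMARY_XML = """<eSummaryResult>
<DocSum>
        <Id>17206916</Id>
        <Item Name="PubDate" Type="Date">2006</Item>
</DocSum>
</eSummaryResult>"""


def summary_to_dict(xml_text):
    """Turn one esummary-style DocSum into a plain dict (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    docsum = root.find("DocSum")
    fields = {"Id": docsum.findtext("Id")}
    # Each <Item Name="..."> becomes one key/value pair.
    for item in docsum.findall("Item"):
        fields[item.get("Name")] = item.text
    return fields


def fetch_parsed_record(pmid, database="pubmed"):
    """Sketch of the Biopython route; needs Biopython and network access."""
    from Bio import Entrez  # imported here so the rest runs without Biopython

    # Assumption: a contact address is supplied; replace this placeholder.
    Entrez.email = "you@example.org"
    # Assumption: XML is requested explicitly so that Entrez.read can parse it.
    handle = Entrez.efetch(db=database, id=pmid, retmode="xml")
    try:
        # Pass the handle itself to Entrez.read, not the result of handle.read().
        return Entrez.read(handle)
    finally:
        handle.close()


# Offline demonstration: the raw string becomes a dict after parsing.
record = summary_to_dict(SAMPLE_SUMMARY_XML)
print(type(record))
print(record["PubDate"])
```

Only the offline part runs when the block is executed; fetch_parsed_record is
defined but not called, since it depends on Biopython and on the NCBI service
being reachable. The layout of the object it returns is not shown here because
it depends on the record and on the Biopython version in use.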







