[Bioperl-l] EUtilities, was Re: PDB Parser

Chris Fields cjfields at uiuc.edu
Tue Aug 21 04:06:26 UTC 2007


Bernd,

Just in case you weren't aware, I have changed several aspects of
EUtilities since the 1.5.2 release, so any code in the HOWTO cookbook
applies ONLY to the version found in CVS (there is a big note at the
top stating such).  This should be the finalized API, which I intend
to support from this point on.  I mention it because there are several
giveaways that you are using the older API from 1.5.2 (using
next_cookie, for instance).

The following modification of your script (using the API in
bioperl-live) works for me.  You should be able to do something similar
with the older API as well, but I haven't tried.  Note that PMC
full-text retrieval only works if the article is declared
'open-access'; not all journals allow that.  Also, full text is only
available as XML, which (I'm guessing here) is transformed to HTML for
PMC.

....
my $agent = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                     -db         => $db,
                                     -term       => $query,
                                     -usehistory => 'y');

my $ct = $agent->get_count;

print "Count = $ct\n";

my $history = $agent->next_History;

if ($fetch eq 'yes') {
    # fetch one record per request, stepping through the stored history
    my ($retmax, $retstart) = (1, 0);
    while ($retstart < $ct) {
        $agent->set_parameters(
            -eutil    => 'efetch',
            -history  => $history,
            -rettype  => 'xml',
            -retmax   => $retmax,
            -retstart => $retstart,
        );
        # one output file per record
        $agent->get_Response(-file => ">./papers/paper_$retstart.xml");
        $retstart += $retmax;
    }
}
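
Because records that are not open-access come back without the article text, the saved files can be sorted afterwards by looking for a <body> element. This is only a rough sketch: it assumes the files sit in ./papers as in the script above, and that PMC article XML keeps the full text inside <body> (an assumption about the format, not checked against every record). has_full_text is a hypothetical helper, not part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): returns 1 if the XML text
# appears to contain a <body> element, which is ASSUMED to hold the
# full text of the article; returns 0 otherwise.
sub has_full_text {
    my ($xml) = @_;
    return $xml =~ /<body[\s>]/ ? 1 : 0;
}

# Report each saved file as full text or abstract only.
for my $file (glob './papers/*.xml') {
    my $fh;
    unless (open $fh, '<', $file) {
        warn "Cannot read $file: $!\n";
        next;
    }
    my $xml = do { local $/; <$fh> } // '';   # slurp the whole file
    close $fh;
    printf "%s: %s\n", $file,
        has_full_text($xml) ? 'full text' : 'abstract only';
}
```

A plain pattern match is used here to avoid extra dependencies; a real XML parser would be the more robust choice.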

------------------------------

It may also be possible to grab the LinkOut for these and try to nab  
the PDF or use the DOI, but I haven't tried anything like that.
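
For the DOI route, an untested sketch of pulling the DOI out of one retrieved record follows. It assumes the DOI is recorded as <article-id pub-id-type="doi"> in the article metadata; extract_doi is a hypothetical helper and the sample record, including its DOI, is a made-up placeholder.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): returns the DOI found in the
# XML text, ASSUMING it is stored as
# <article-id pub-id-type="doi">...</article-id>; returns undef if absent.
sub extract_doi {
    my ($xml) = @_;
    return $xml =~ m{<article-id\s+pub-id-type="doi"\s*>\s*([^<\s]+)\s*</article-id>}
        ? $1
        : undef;
}

# Made-up sample record for illustration only.
my $sample = '<article-meta>'
           . '<article-id pub-id-type="pmid">123</article-id>'
           . '<article-id pub-id-type="doi">10.1000/example.doi</article-id>'
           . '</article-meta>';

my $doi = extract_doi($sample);
print "DOI: $doi\n" if defined $doi;
```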

chris

On Aug 20, 2007, at 2:03 PM, Bernd Mueller wrote:

> I attached my script.
>
> Actually I tried to download all articles to a certain search term
> with that script. The problem was that the retrieved documents were
> not free as mentioned in the documentation of EUtilities on the NCBI
> page. So many of the downloaded documents in xml-format were just
> dummies containing only the abstract but not the fulltext article.
>
> Bernd
>
> Chris Fields wrote:
>> Just curious, but what kind of query were you trying?  It might be  
>> worth trying to work through it to add as an example to the  
>> cookbook page.
>> chris



