[Bioperl-l] EUtilities overhaul started

Chris Fields cjfields at uiuc.edu
Sun Jun 3 23:52:39 UTC 2007


On Jun 3, 2007, at 9:44 AM, Siddhartha Basu wrote:

> ...
> Hi Chris,
> As a frequent user of EUtilities, I expect this API facelift
> and the upcoming HOWTO will be very helpful.
> One thing I noticed is that for each eutil call, such as
> efetch, epost, esearch, or esummary, a new 'Bio::DB::EUtilities' object has
> to be instantiated. After that, its parameters cannot be changed at
> runtime with a call such as $eutils->id('ids'). For example:
>
> my $eutils = Bio::DB::EUtilities->new ( -id => $id,
>                                        -eutil => 'esummary',
>                                        -db => 'protein',
>                                      );
> my $ct = $eutils->get_response->content();
>
> ## -- now i cannot do this...
> $eutils->id($newid);
> $ct = $eutils->get_response->content();

I'll have to look into that; changing id() should work with the old
API.  It won't matter with the new API (where this works fine), but
it is still troubling...

> Is the new API going to address something along these lines, or is
> there currently any way to reuse
> the object?
> Thanks again for this nice toolkit.
>
> -siddhartha

The old API was based upon the idea of creating discrete user agents
for each eutil to retrieve data.  The problem with the old interface
is that it attempts to do too much (take care of parameters, set up
requests, retrieve responses, parse data, etc.), and many tasks
required instantiating a new EUtilities object.  I was never really
satisfied with it.

The new interface is a composition of three classes: the web user
agent (LWP::UserAgent), a class encapsulating parameter handling, and
a parser class (all of which can be used independently if needed).
When parameters change, a new request is made 'lazily' (i.e. only when
needed).  Similarly, when data is requested after any parameter
change, a new parser instance is created and the new response is parsed.
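
The 'lazy' behaviour described above can be sketched in a few lines of
plain Perl.  This is only an illustration of the general pattern, not
the actual Bio::DB::EUtilities internals: the LazyClient class, its
'dirty' flag, and the builds() counter are all hypothetical names made
up for this example, and the "request" is just a query string rather
than a real HTTP call.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical class for illustration only; not part of BioPerl.
package LazyClient;

sub new {
    my ($class, %params) = @_;
    my $self = bless { params => {}, dirty => 1, builds => 0 }, $class;
    $self->set_parameters(%params);
    return $self;
}

# Change only the named parameters and mark the cached request as stale.
sub set_parameters {
    my ($self, %params) = @_;
    @{ $self->{params} }{ keys %params } = values %params;
    $self->{dirty} = 1;
    return $self;
}

# Build the request only if a parameter changed since the last build.
sub request {
    my $self = shift;
    if ($self->{dirty}) {
        my $p = $self->{params};
        $self->{request} = join '&', map { "$_=$p->{$_}" } sort keys %$p;
        $self->{builds}++;
        $self->{dirty} = 0;
    }
    return $self->{request};
}

# Number of times the request has actually been rebuilt.
sub builds { return $_[0]{builds} }

package main;

my $client = LazyClient->new(db => 'protein', term => 'BRCA1');
$client->request for 1 .. 3;               # built once, then reused
$client->set_parameters(term => 'BRCA2');  # 'db' is left untouched
print $client->request, "\n";              # rebuilt: a parameter changed
print "builds: ", $client->builds, "\n";
```

Repeated calls with unchanged parameters reuse the cached request, and
a rebuild happens only on the first call after set_parameters().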

With that in mind you can now do the following:
----------------------------------------
my @params = (-eutil  => 'esearch',
              -db     => 'protein',
              -term   => 'BRCA1',
              -retmax => 100);

my $eutil = Bio::DB::EUtilities->new(@params);

# no need to get response first; get_ids() calls that if needed

my @ids = $eutil->get_ids;

# below changes only those parameters, leaves all others set as before
$eutil->set_parameters(-eutil => 'efetch',
                        -id  => \@ids,
                        -retmode => 'text',
                        -rettype => 'fasta');

# sends streamed content directly to a file
$eutil->get_response(-content_file => 'seqs.fas');

# or to an LWP::UserAgent-supported request callback
$eutil->get_response(-content_cb => \&my_cb);

my @newparams = (-eutil  => 'esearch',
                 -db     => 'protein',
                 -term   => 'BRCA2',
                 -retmax => 100);

# Resets eutility to passed parameters (or undef)
$eutil->reset_parameters(@newparams);

# retrieve new IDs
my @new_ids = $eutil->get_ids;
----------------------------------------

Note that the same eutil object is used for all of the above, so to
answer your last question: yes, you should be able to create data
pipelines using the same object if necessary.
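
The my_cb callback passed to -content_cb above is not defined in the
example.  A minimal sketch of what it could look like follows, assuming
the callback is handed to LWP::UserAgent unchanged, in which case it is
called once per chunk as ($chunk, $response, $protocol).  The counters
and the simulated chunks are made up for illustration so that the
snippet runs without a network connection.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $seq_count = 0;   # FASTA header lines seen so far
my $bytes     = 0;   # total content length received

# LWP::UserAgent invokes content callbacks as ($chunk, $response, $protocol).
# Only the chunk is used in this sketch.
sub my_cb {
    my ($chunk, $response, $protocol) = @_;
    $bytes += length $chunk;
    # Naive count: a '>' at the start of a line is taken as a FASTA header.
    my @headers = $chunk =~ /^>/mg;
    $seq_count += scalar @headers;
    return;
}

# Simulated streamed chunks standing in for a real efetch response.
my_cb($_) for (">seq1\nMKV\n", "LLA\n>seq2\nMDL\n");

print "$seq_count sequences, $bytes bytes\n";
```

Because chunk boundaries are arbitrary, anything that needs whole
records (rather than simple counting) would have to buffer partial
lines between calls.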

chris



