[Bioperl-l] Fetching > 500 sequences

Brad Chapman chapmanb at uga.edu
Wed Mar 3 10:51:17 EST 2004


Hi Rolf, Martin;

> > It seems that I have problems with fetching more than 500 sequences from 
> > Genbank using Bioperl. It looks like the script (attached below) fetches all 
> > the 7000+ sequences, but only 500 make it to the output file. Is there any 
> > way to get all these 7000+ sequences written to the file - that is, is it 
> > possible to sidestep the 500 seq. limit?

I actually debugged and fixed this problem recently for Biopython --
it looks like a change in the way EUtils works. If you pass 'retmax'
in the EUtils URL, it will give you back at most 500 sequences, no
matter what value you pass for this parameter. The fix that worked
for me was to not pass 'retmax' at all.
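
To illustrate what the change amounts to on the request side, here is a
minimal Python sketch that only builds the two ESearch URLs and never
contacts NCBI. The endpoint URL, the 'db' value, the search term and the
helper name build_esearch_url are all assumptions made for illustration;
they are not taken from Biopython or BioPerl.

```python
from urllib.parse import urlencode

# Assumed ESearch endpoint; check the NCBI documentation for the current one.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="nucleotide", retmax=None):
    """Return an ESearch URL; 'retmax' is included only when it is given.

    Hypothetical helper for illustration, not part of any library.
    """
    params = [("db", db), ("term", term)]
    if retmax is not None:
        params.append(("retmax", str(retmax)))
    return ESEARCH_URL + "?" + urlencode(params)


# Hypothetical query term, used only to have something to encode.
term = "Oryza sativa[Organism]"

# Before the fix: 'retmax' is sent along with the query.
with_retmax = build_esearch_url(term, retmax=7000)

# After the fix: 'retmax' is left out of the request entirely.
without_retmax = build_esearch_url(term)

print(with_retmax)
print(without_retmax)
```

The only difference between the two URLs is the 'retmax' parameter, which
is the same thing the patch changes for BioPerl.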

The attached patch to Bio/DB/Query/GenBank.pm should fix the
problem, if similar symptoms equal similar fixes in this case. An
actual Perl/BioPerl person should look at this, though, as I'm not
to be trusted for coding Perl :-).

Hope this helps.
Brad
-------------- next part --------------
*** Bio/DB/Query/GenBank.pm.orig	Tue Sep  9 17:29:00 2003
--- Bio/DB/Query/GenBank.pm	Wed Mar  3 10:44:21 2004
***************
*** 197,203 ****
    $method = 'get';
    $base   = ESEARCH;
    push @params,('term'   => $self->query);
!   push @params,('retmax' => $self->{'_count'} || MAXENTRY);
    ($method,$base,@params);
  }
  
--- 197,204 ----
    $method = 'get';
    $base   = ESEARCH;
    push @params,('term'   => $self->query);
!   # Providing 'retmax' limits queries to 500 sequences
!   # push @params,('retmax' => $self->{'_count'} || MAXENTRY);
    ($method,$base,@params);
  }
  

