[Bioperl-l] Bio::DB::Query::GenBank

Jason Stajich jason.stajich at duke.edu
Mon Nov 29 09:43:28 EST 2004


Lincoln did some fixes this summer which I think implement this 2-step 
process for you in Bio::DB::GenBank (another reason we need to get 
1.5.0 out there for people to use).  Any chance you can try the RC1 or 
CVS live code as well to see if you are hitting the same problems?

-jason
On Nov 29, 2004, at 9:17 AM, Marc Logghe wrote:

> Hi,
> I think you will always bump into that limit; it is the limit NCBI is 
> using with efetch.
> I don't know how it is internally done by Bio::DB::Query::GenBank but 
> it should go via a 2 step process:
> 1) you perform a query and you get a webenv and query key back
> 2) you fetch your sequences by passing your webenv and query key and 
> explicitly requesting your record numbers in chunks of 500.
> I also never succeeded in fetching more than 500 sequences with 
> Bio::DB::Query::GenBank.
> I am currently using a non bioperl script based on 
> http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_example.pl.
> NCBI also asks to run these kinds of queries at night EST, on the 
> weekend and with a sleep of at least 5 sec between every fetch of 500 
> records.
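
For anyone who wants to see the two steps spelled out, here is a rough sketch in Python rather than Perl. It is not what Bio::DB::Query::GenBank does internally and it is not the eutils_example.pl script; the function names (batch_windows, esearch_url, efetch_url, fetch_all), the "nucleotide" database and the "gb" return type are my own assumptions for illustration. It relies only on the standard E-utilities parameters (usehistory, WebEnv, query_key, retstart, retmax) and on the 500-record chunk and 5 sec pause mentioned above.

```python
import time
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def batch_windows(total, size=500):
    """Yield (retstart, retmax) pairs that cover `total` records in chunks."""
    for start in range(0, total, size):
        yield start, min(size, total - start)


def esearch_url(term, db="nucleotide"):
    """Step 1: run the query and keep the result set on NCBI's history server."""
    params = {"db": db, "term": term, "usehistory": "y", "retmax": 0}
    return f"{EUTILS}/esearch.fcgi?{urllib.parse.urlencode(params)}"


def efetch_url(webenv, query_key, retstart, retmax, db="nucleotide"):
    """Step 2: fetch one explicit window of records from the stored result set."""
    params = {
        "db": db,
        "WebEnv": webenv,
        "query_key": query_key,
        "retstart": retstart,
        "retmax": retmax,
        "rettype": "gb",      # assumption: GenBank flat-file output
        "retmode": "text",
    }
    return f"{EUTILS}/efetch.fcgi?{urllib.parse.urlencode(params)}"


def fetch_all(term, out_path, pause=5.0):
    """Hypothetical driver: query once, then download in chunks of 500."""
    with urllib.request.urlopen(esearch_url(term)) as resp:
        root = ET.parse(resp).getroot()
    total = int(root.findtext("Count"))
    webenv = root.findtext("WebEnv")
    query_key = root.findtext("QueryKey")

    with open(out_path, "wb") as out:
        for start, size in batch_windows(total):
            url = efetch_url(webenv, query_key, start, size)
            with urllib.request.urlopen(url) as resp:
                out.write(resp.read())
            time.sleep(pause)  # be polite: wait between every fetch
```

Calling fetch_all("your query here", "hits.gb") would then write all records to one file; nothing in the sketch retries failed requests, so a real script would want error handling around each fetch.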
>
> HTH,
> Marc
>
>> -----Original Message-----
>> From: Aaron J. Mackey [mailto:amackey at pcbi.upenn.edu]
>> Sent: Monday, November 29, 2004 2:59 PM
>> To: Wuming Gong
>> Cc: Bioperl-l at portal.open-bio.org
>> Subject: Re: [Bioperl-l] Bio::DB::Query::GenBank
>>
>>
>>
>> If you try again late at night (meaning late at night EST), you may get
>> all 5000 hits; NCBI seems to have implemented a limit of 500 entries in
>> batch retrieval when network load is already high, but you may be
>> successful during non-peak hours ...
>>
>> -Aaron
>>
>> On Nov 29, 2004, at 4:26 AM, Wuming Gong wrote:
>>
>>> Hi Mona,
>>>
>>> I have met the same kind of problem. You can pull down the sequences
>>> in batches of fewer than 500 at a time, and it works.
>>>
>>> Wuming
>>>
>>>
>>> On Thu, 04 Nov 2004 21:12:40 -0700, Ligia Mateiu <lmateiu at ualberta.ca>
>>> wrote:
>>>> Hi all,
>>>> I used a query for which there are >5000 hits in GenBank, but my code
>>>> retrieved just the very first 500.
>>>>
>>>> Any idea why?
>>>>
>>>> Thanks a lot,
>>>> Mona
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at portal.open-bio.org
>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>> --
>> Aaron J. Mackey, Ph.D.
>> Dept. of Biology, Goddard 212
>> University of Pennsylvania       email:  amackey at pcbi.upenn.edu
>> 415 S. University Avenue         office: 215-898-1205
>> Philadelphia, PA  19104-6017     fax:    215-746-6697
>>
>
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/


