[Bioperl-l] Best method for downloading 100 sequences
Barry Moore
bmoore at genetics.utah.edu
Fri Sep 2 14:53:17 EDT 2005
Most of the BioPerl remote database modules have a batch method like
this one from Bio::DB::GenBank:
my $gb = Bio::DB::GenBank->new;
my $seqio = $gb->get_Stream_by_acc(['AC013798', 'AC021953']);
If you do loop over separate requests, you should put some sort of sleep
between them. I seem to recall that NCBI asks for 3 seconds, but I could
be wrong about that. 100 sequences shouldn't be a problem for them. For
1000+ you might want to think about downloading directly by FTP and
parsing locally.
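A rough sketch of how that could look for a file of IDs, assuming
Bio::DB::GenBank and Bio::SeqIO are installed. The helper names
(batch_ids, fetch_batches), the batch size of 50, the default output
file name, and the 3-second pause are all illustrative choices, not
anything NCBI or BioPerl prescribes, so check the current usage
guidelines before relying on them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a list of IDs into batches of at most $size IDs each.
# (Hypothetical helper, not part of BioPerl.)
sub batch_ids {
    my ($size, @ids) = @_;
    my @batches;
    push @batches, [ splice(@ids, 0, $size) ] while @ids;
    return @batches;
}

# Fetch each batch with one request and append the records to $out_file,
# sleeping $pause seconds between requests. (Hypothetical helper.)
sub fetch_batches {
    my ($out_file, $pause, @batches) = @_;
    require Bio::DB::GenBank;    # loaded here so batch_ids works without BioPerl
    require Bio::SeqIO;
    my $gb  = Bio::DB::GenBank->new;
    my $out = Bio::SeqIO->new(-file => ">$out_file", -format => 'genbank');
    for my $i (0 .. $#batches) {
        my $seqio = $gb->get_Stream_by_acc($batches[$i]);
        while (my $seq = $seqio->next_seq) {
            $out->write_seq($seq);
        }
        sleep $pause if $i < $#batches;    # no need to wait after the last batch
    }
}

# Runs only when an ID file is given: script.pl ids.txt [out.gb]
if (@ARGV) {
    my ($id_file, $out_file) = @ARGV;
    $out_file = 'sequences.gb' unless defined $out_file;    # assumed default
    open my $fh, '<', $id_file or die "Cannot open $id_file: $!";
    my @ids;
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;           # trim whitespace and newline
        push @ids, $line if length $line;  # skip blank lines
    }
    close $fh;
    fetch_batches($out_file, 3, batch_ids(50, @ids));
}
```

With 100 IDs and a batch size of 50 this makes two requests with a single
pause between them, rather than 100 separate calls.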
Barry
-----Original Message-----
From: bioperl-l-bounces at portal.open-bio.org
[mailto:bioperl-l-bounces at portal.open-bio.org] On Behalf Of Amir Karger
Sent: Friday, September 02, 2005 7:55 AM
To: 'bioperl-l at portal.open-bio.org'
Subject: [Bioperl-l] Best method for downloading 100 sequences
Hi.
I'm using Bioperl's nice get_sequence in my Scriptome toolbox, to fetch a
single sequence. What would be the best method for downloading 100
sequences? Do I write a loop to call get_sequence N times? Will the
various websites get angry at me for doing that? Would they be less angry
if I did a 1-second sleep after each download? I know NCBI has methods to
pull in N sequences, but I don't know whether Swiss et al. do too. I'm
happy to use other Bioperl code, rather than get_sequence. I just need to
have a script that people can cut and paste, where they just input the
filename with sequence IDs and the database to download from (sort of like
http://cgr.harvard.edu/cbg/scriptome/Tools/Fetch.html#fetch_a_sequence_from_a_popular_internet_database__fetch_sequence_web_)
Thanks,
-Amir Karger