[Bioperl-l] get_Stream_by_query Terminates Prematurely
bergeycm
cmb433 at nyu.edu
Mon May 10 02:22:52 UTC 2010
Hi all,
I'm attempting to query GenBank for the lengths of all sequences in a given
taxon. I'm using get_Stream_by_query(), but only to grab the species,
length, and accession. The genus of interest has almost 500,000 GenBank
entries, though, and my code stalls at odd points in the info-gathering
loop (often after only 300 or 400 iterations). The problem is that
$stream_obj->next_seq (of Bio::SeqIO::genbank) eventually comes back
undefined.
I've tried wrapping the next_seq portion of the code in an eval block, but
to no avail. Is there a way to split a query into a number of smaller
streams, so that no single request is too large? Or is there a way to
resume a dropped SeqIO stream? I suspect the connection is timing out and
the stream is being lost.
Any advice is greatly appreciated, as I'm fairly new to BioPerl.
- bergeycm
use Bio::DB::GenBank;
use Bio::DB::Query::GenBank;
# Get general things ready to go for querying GenBank
my %options;
$options{'-maxids'} = '500000'; # There are presently 460,184 sequences
$options{'-db'} = 'nucleotide';
$options{'-query'} = "Pongo [ORGN]"; # Orangutans
my $query_obj = Bio::DB::Query::GenBank->new(%options);
my $total = $query_obj->count;
my $gb_obj = Bio::DB::GenBank->new();
my $stream_obj = $gb_obj->get_Stream_by_query($query_obj);
# Restrict info to just what I'll be using. No sequence necessary.
my $builder = $stream_obj->sequence_builder();
$builder->want_none();
$builder->add_wanted_slot('species','length','accession');
my $c = 0;
for (1 .. $total) {
    eval {
        my $seq_obj = $stream_obj->next_seq;
        my $flavor = $seq_obj->species;
        print $c, "\t", $flavor->scientific_name, " (", $flavor->id, ")\t",
            $seq_obj->length, "\t", $seq_obj->accession, "\n";
    };
    if ($@) {
        # $@ holds the error caught by eval ($! is only the OS error string)
        print $@, "\n";
    }
    # Pause for a little over a third of a second
    select(undef, undef, undef, 0.35);
    $c++;
}
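
The idea of splitting the query into smaller streams, raised in the message
above, could be sketched roughly as follows. This is only an illustration in
plain Perl, not a tested fix: batches() and fetch_with_retry() are
hypothetical helper names, and the $fetch callback is an assumed stand-in for
whatever actually retrieves one batch (for instance, a closure that calls
get_Stream_by_id on the Bio::DB::GenBank object with one batch of the IDs
returned by $query_obj->ids and reads the resulting stream to the end).

```perl
use strict;
use warnings;

# Split a list of IDs into array refs holding at most $size IDs each.
sub batches {
    my ($size, @ids) = @_;
    my @out;
    push @out, [ splice(@ids, 0, $size) ] while @ids;
    return @out;
}

# Call $fetch->($batch) up to $tries times, pausing $delay seconds between
# attempts. Returns the first defined result, or undef if every attempt
# died or returned nothing. $fetch is a caller-supplied code ref
# (hypothetical here; it would wrap the real GenBank request).
sub fetch_with_retry {
    my ($fetch, $batch, $tries, $delay) = @_;
    for my $attempt (1 .. $tries) {
        my $result = eval { $fetch->($batch) };
        return $result if defined $result;
        my $why = $@ || "no data returned\n";
        warn "Attempt $attempt failed: $why";
        select(undef, undef, undef, $delay) if $delay && $attempt < $tries;
    }
    return;
}
```

With helpers like these, a dropped connection would cost at most one batch
rather than the whole stream, since each batch is requested (and, if needed,
re-requested) independently. The batch size and retry count would need to be
tuned by experiment.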