[Bioperl-l] get_Stream_by_query Terminates Prematurely

bergeycm cmb433 at nyu.edu
Mon May 10 02:22:52 UTC 2010


Hi all,

I'm attempting to query GenBank for the lengths of all sequences for a given
taxon. I'm using get_Stream_by_query(), but only to grab the species,
length, and accession. The genus of interest has almost 500,000 GenBank
entries, though, and my code stops at odd points in the info-gathering loop,
often after only 300 or 400 iterations. The problem is that
$stream_obj->next_seq (of Bio::SeqIO::genbank) eventually comes back
undefined.

I've tried wrapping the next_seq portion of the code in an eval block, but
to no avail. Is there a way to split a query into a bunch of small streams
that aren't too much to ask? Or is there a way to pick up a dropped SeqIO
stream? I think the connection is timing out and the stream is being lost.
Any advice is greatly appreciated, as I'm fairly new to BioPerl.

- bergeycm



use strict;
use warnings;

use Bio::DB::GenBank;
use Bio::DB::Query::GenBank;


# Get general things ready to go for querying GenBank
my %options;
$options{'-maxids'} = '500000';		# There are presently 460,184 sequences
$options{'-db'} = 'nucleotide';
$options{'-query'} = "Pongo [ORGN]";	# Orangutans


my $query_obj = Bio::DB::Query::GenBank->new(%options);	
my $total = $query_obj->count;

my $gb_obj = Bio::DB::GenBank->new();
my $stream_obj = $gb_obj->get_Stream_by_query($query_obj);

# Restrict info to just what I'll be using. No sequence necessary.
my $builder = $stream_obj->sequence_builder();
$builder->want_none();
$builder->add_wanted_slot('species','length','accession');

my $c = 0;

for (1 .. $total) {
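	eval {
		my $seq_obj = $stream_obj->next_seq;
		# next_seq returns undef once the stream ends or is dropped
		die "next_seq returned undef at iteration $c\n" unless defined $seq_obj;
		my $flavor = $seq_obj->species;
		print $c, "\t", $flavor->scientific_name, " (", $flavor->id, ")\t",
			$seq_obj->length, "\t", $seq_obj->accession, "\n";
	};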

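	if ($@) {
		# $@ holds the error from the eval block ($! is the OS errno);
		# it already ends in a newline
		print "Iteration $c failed: $@";
	}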
	
	# Pause for a little over a third of a second
	select(undef, undef, undef, 0.35);
	
	$c++;
}
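
A rough sketch of the "bunch of small streams" idea asked about above. It is
untested against GenBank and makes assumptions: batch_ids and
fetch_in_batches are hypothetical helper names, and the batch size and retry
count are arbitrary. The helpers themselves are plain Perl; the BioPerl calls
appear only in the trailing comment.

```perl
use strict;
use warnings;

# Split a list of IDs into batches of at most $size elements.
sub batch_ids {
    my ($ids, $size) = @_;
    die "batch size must be positive\n" unless $size > 0;
    my @copy = @$ids;              # leave the caller's array untouched
    my @batches;
    push @batches, [ splice(@copy, 0, $size) ] while @copy;
    return @batches;
}

# Call $fetch->($batch) for each batch, retrying a failed batch up to
# $max_tries times. Returns the batches that never succeeded.
sub fetch_in_batches {
    my ($ids, $size, $max_tries, $fetch) = @_;
    my @failed;
    BATCH: for my $batch (batch_ids($ids, $size)) {
        for my $try (1 .. $max_tries) {
            # eval yields 1 only if $fetch did not die
            next BATCH if eval { $fetch->($batch); 1 };
            warn "batch starting at $batch->[0] failed (try $try): $@";
        }
        push @failed, $batch;
    }
    return @failed;
}

# Hypothetical use with the objects from the script above:
#   my @ids    = $query_obj->ids;
#   my @failed = fetch_in_batches(\@ids, 200, 3, sub {
#       my $stream = $gb_obj->get_Stream_by_id($_[0]);
#       while (my $seq = $stream->next_seq) { ... }
#   });
```

Each batch opens its own short-lived stream, so a dropped connection costs
at most one batch rather than the whole run. A batch that fails partway
through is retried from its start, so records already printed for that batch
may appear twice unless the callback buffers its output.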





