[Bioperl-l] Fetching > 500 sequences
henrik nilsson
rnilsson at clarku.edu
Mon Mar 1 14:27:35 EST 2004
Hi,
It seems that I have problems with fetching more than 500 sequences from
Genbank using Bioperl. It looks like the script (attached below) fetches all
the 7000+ sequences, but only 500 make it to the output file. Is there any
way to get all these 7000+ sequences written to the file - that is, is it
possible to sidestep the 500 seq. limit?
Thanks for your time,
Rolf
Please find the script below. When I run it, I get
Writing accession number AJ406491
... etc ...
Writing accession number AJ406489
Writing accession number AJ406471
Writing accession number AJ406465
Writing accession number AJ406461
Total number of records found = 7053
but when I type
[rolf at localhost dir]$ cat data.gb | grep 'BASE COUNT' | wc -l
500
[rolf at localhost dir]$
It is clear that only 500 seq. were written to the file.
#!/usr/bin/perl -w
use strict;
use Bio::DB::GenBank;
use Bio::DB::Query::GenBank;
use IO::String;
use Bio::SeqIO;
use Bio::Seq::RichSeq;
my $query_string = 'Boletales';
my $query = Bio::DB::Query::GenBank->new(-db=>'nucleotide',
-query=>$query_string);
my $out = Bio::SeqIO->new(-file=>">data.gb", -format=>'genbank');
my $count = $query->count;
my $gb = new Bio::DB::GenBank();
my $stream = $gb->get_Stream_by_query($query);
while (my $seq = $stream->next_seq) {
print "Writing accession number ", $seq->accession_number,"\n";
$out->write_seq($seq);
}
print "Total number of records found = $count\n";
exit;
More information about the Bioperl-l
mailing list