[Bioperl-l] Fetching > 500 sequences

henrik nilsson rnilsson at clarku.edu
Mon Mar 1 14:27:35 EST 2004


Hi,

It seems that I have problems with fetching more than 500 sequences from 
Genbank using Bioperl. It looks like the script (attached below) fetches all 
the 7000+ sequences, but only 500 make it to the output file. Is there any 
way to get all these 7000+ sequences written to the file - that is, is it 
possible to sidestep the 500 seq. limit?

Thanks for your time,

Rolf




Please find the script below. When I run it, I get

Writing accession number AJ406491
... etc ...
Writing accession number AJ406489
Writing accession number AJ406471
Writing accession number AJ406465
Writing accession number AJ406461
Total number of records found = 7053

but when I type 

[rolf at localhost dir]$ cat data.gb | grep 'BASE COUNT' | wc -l
    500
[rolf at localhost dir]$

It is clear that only 500 seq. were written to the file.

#!/usr/bin/perl -w
use strict;
use Bio::DB::GenBank;
use Bio::DB::Query::GenBank;
use IO::String;
use Bio::SeqIO;
use Bio::Seq::RichSeq;


my $query_string = 'Boletales';

my $query = Bio::DB::Query::GenBank->new(-db=>'nucleotide',
                                         -query=>$query_string);
my $out = Bio::SeqIO->new(-file=>">data.gb", -format=>'genbank');

my $count = $query->count;

my $gb = new Bio::DB::GenBank();

my $stream = $gb->get_Stream_by_query($query);

while (my $seq = $stream->next_seq) {
        print "Writing accession number ", $seq->accession_number,"\n";
        $out->write_seq($seq);
        }

print "Total number of records found = $count\n";

exit;




More information about the Bioperl-l mailing list