[Bioperl-l] Fetching > 500 sequences
martin
9904982 at sms.ed.ac.uk
Wed Mar 3 08:39:12 EST 2004
Hi,
I've run into a similar problem with the 500-sequence limit and have
not found a way around it. My workaround has been to download the sequences
you want in GenBank format using SRS from the NCBI website, then open the
file like this:
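use Bio::SeqIO;
my $stream = Bio::SeqIO->new(-file => 'filename.genbank', -format => 'GenBank');
# Process each record in turn.
while (my $seq = $stream->next_seq()) {
    # Placeholder for your own processing; here we just print the accession.
    print $seq->accession_number, "\n";
}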
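If it helps, one quick way to check that the downloaded file really holds
every record is to count the LOCUS lines, since each GenBank flat-file record
starts with one. This is only a rough sketch: the sub name
count_genbank_records and the script name count_records.pl are made up for
illustration, and it uses plain Perl rather than anything from BioPerl.

```perl
use strict;
use warnings;

# Count GenBank flat-file records by counting LOCUS header lines.
# Takes an open filehandle and returns the number of records seen.
# (count_genbank_records is a hypothetical helper, not a BioPerl function.)
sub count_genbank_records {
    my ($fh) = @_;
    my $count = 0;
    while (my $line = <$fh>) {
        $count++ if $line =~ /^LOCUS\s/;
    }
    return $count;
}

# Usage: perl count_records.pl filename.genbank
if (@ARGV) {
    my $file = shift @ARGV;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    print "$file contains ", count_genbank_records($fh), " records\n";
    close $fh;
}
```

If the number printed matches the count reported by the query, the whole
set made it into the file.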
Hope this helps.
Martin
On Mon, 2004-03-01 at 19:27, henrik nilsson wrote:
> Hi,
>
> It seems that I have problems with fetching more than 500 sequences from
> Genbank using Bioperl. It looks like the script (attached below) fetches all
> the 7000+ sequences, but only 500 make it to the output file. Is there any
> way to get all these 7000+ sequences written to the file - that is, is it
> possible to sidestep the 500 seq. limit?
>
> Thanks for your time,
>
> Rolf
>
>
>
>
> Please find the script below. When I run it, I get
>
> Writing accession number AJ406491
> ... etc ...
> Writing accession number AJ406489
> Writing accession number AJ406471
> Writing accession number AJ406465
> Writing accession number AJ406461
> Total number of records found = 7053
>
> but when I type
>
> [rolf at localhost dir]$ cat data.gb | grep 'BASE COUNT' | wc -l
> 500
> [rolf at localhost dir]$
>
> It is clear that only 500 seq. were written to the file.
>
> #!/usr/bin/perl -w
> use strict;
> use Bio::DB::GenBank;
> use Bio::DB::Query::GenBank;
> use IO::String;
> use Bio::SeqIO;
> use Bio::Seq::RichSeq;
>
>
> my $query_string = 'Boletales';
>
> my $query = Bio::DB::Query::GenBank->new(-db=>'nucleotide',
> -query=>$query_string);
> my $out = Bio::SeqIO->new(-file=>">data.gb", -format=>'genbank');
>
> my $count = $query->count;
>
> my $gb = new Bio::DB::GenBank();
>
> my $stream = $gb->get_Stream_by_query($query);
>
> while (my $seq = $stream->next_seq) {
> print "Writing accession number ", $seq->accession_number,"\n";
> $out->write_seq($seq);
> }
>
> print "Total number of records found = $count\n";
>
> exit;
>
>
--
Martin Jones
Blaxter Nematode Genomics Lab
ICAPB
Ashworth Labs
Kings Buildings
University of Edinburgh
Edinburgh
0131 650 6761
9904982 at sms.ed.ac.uk