[Bioperl-l] Bio::DB::Query::GenBank retrieves fewer sequences
than Webbrowser query
Paulo Almeida
paulo.david at netvisao.pt
Thu Mar 25 06:21:45 EST 2004
Yes, escaping the brackets doesn't seem to have anything to do with this
problem. I suggested it because it worked for me in a different
situation. However, if you use the count method to get the number of
returned sequences, instead of cycling through all the results, you
get 5066, as it should be:
#!/usr/bin/perl
use strict;
use Bio::DB::GenBank;
use Bio::DB::Query::GenBank;

my $gb = Bio::DB::GenBank->new;
my $query = Bio::DB::Query::GenBank->new(
    -query => 'Mus[Organism] exon NOT mRNA NOT cDNA',
    -db    => 'Nucleotide');
my $seqio = $gb->get_Stream_by_query($query);
print "Num results:" , $query->count , "\n";
I'm looking into it further, but I don't know what the problem could be.
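
To see the discrepancy in a single run, the two approaches can be combined. This is only a rough sketch: it uses the same query string and database name as the script above, needs network access to NCBI, and the variable names $expected and $seen are just illustrative. It does not explain why the stream comes up short; it only reports how many records go missing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::GenBank;
use Bio::DB::Query::GenBank;

# Same query as above; nothing about it is changed here.
my $query = Bio::DB::Query::GenBank->new(
    -query => 'Mus[Organism] exon NOT mRNA NOT cDNA',
    -db    => 'Nucleotide');

# Number of hits the query object reports.
my $expected = $query->count;

# Number of records actually delivered by the sequence stream.
my $gb    = Bio::DB::GenBank->new;
my $seqio = $gb->get_Stream_by_query($query);
my $seen  = 0;
$seen++ while $seqio->next_seq;

print "Reported by count: $expected\n";
print "Read from stream:  $seen\n";
print "Missing records:   ", $expected - $seen, "\n" if $seen < $expected;
```

If the two numbers differ, the query itself is probably fine and the records are being lost somewhere while the stream is fetched or parsed.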
-Paulo Almeida
T.D. Houfek wrote:
>Hmm...
>
>Just for kicks I tried to duplicate the problem. I get the same number
>of sequences from NCBI's web sequin tool as Jürgen, but using the
>Bio::DB::Query::GenBank method I get 644 sequences (not less than 100,
>but not the 5000+ we are expecting). Placing an escape backslash before
>the brackets does not seem to help me:
>
>--- my test script below ---
>
>#!/usr/bin/perl
>use strict;
>use Bio::DB::GenBank;
>use Bio::DB::Query::GenBank;
>
>my $gb = Bio::DB::GenBank->new;
>my $query = Bio::DB::Query::GenBank->new(
>    -query => 'Mus\[Organism] AND exon NOT mRNA NOT cDNA',
>    -db    => 'Nucleotide');
>my $seqio = $gb->get_Stream_by_query($query);
>my $numresults=0;
>while( my $seq = $seqio->next_seq ) { $numresults++; }
>print "Num results: $numresults\n";
>