[Bioperl-l] Bio::DB::Query::GenBank retrieves fewer sequences than Webbrowser query

Chris Dwan (CCGB) cdwan at mail.ahc.umn.edu
Wed Mar 24 17:37:09 EST 2004


I'll admit that I was totally unable to get the results I wanted with the
-query => 'whatever' option.  I attribute that to my lack of clue WRT
Entrez.  Since we're fortunate enough to have an in-house relational
GenBank, I use SQL to get the list of accessions I want, and then:

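use Bio::DB::GenBank;
use Bio::SeqIO;

# @accessions comes from the SQL query; $out is an already-open output filehandle.
my $gb        = Bio::DB::GenBank->new;
my $seqstream = $gb->get_Stream_by_acc(\@accessions);
my $seqIO     = Bio::SeqIO->new(-fh => $out, -format => 'Fasta');
while (my $seq = $seqstream->next_seq) {
  # Append the sequence version so each ID reads ACCESSION.VERSION.
  $seq->display_id($seq->display_id() . "." . $seq->version);
  $seqIO->write_seq($seq);
}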

Which has given remarkably consistent results.

Not that this addresses the observed problem, but I suspect that it's on
the query end, not the sequence retrieval end.
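
One way to test that suspicion is sketched below, assuming the installed
Bio::DB::Query::GenBank provides the count() method and that network access
to NCBI is available.  It compares the number of hits Entrez reports for the
query with the number of records actually streamed back.  The query string is
the one from the quoted test script with the backslash dropped, which is an
assumption on my part.

```perl
#!/usr/bin/perl
# Sketch only: compare Entrez's reported hit count with the records streamed back.
use strict;
use warnings;
use Bio::DB::GenBank;
use Bio::DB::Query::GenBank;

# Query string taken from the quoted test script, without the backslash (assumption).
my $query = Bio::DB::Query::GenBank->new(
    -db    => 'Nucleotide',
    -query => 'Mus[Organism] AND exon NOT mRNA NOT cDNA',
);

# count() asks Entrez how many records match the query.
my $expected = $query->count;
print "Entrez reports $expected matching records\n";

# Stream the records for the same query and count what actually arrives.
my $gb      = Bio::DB::GenBank->new;
my $stream  = $gb->get_Stream_by_query($query);
my $fetched = 0;
$fetched++ while $stream->next_seq;
print "Retrieved $fetched sequences\n";
```

If the first number already differs from what the web interface shows, the
discrepancy is on the query end.  If only the second number falls short, the
retrieval step is dropping records.  Note that streaming several thousand
records this way can take a while.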

-Chris Dwan

> Just for kicks I tried to duplicate the problem.  I get the same number
> of sequences from NCBI's web sequin tool as Jürgen, but using the
> Bio::DB::Query::GenBank method I get 644 sequences (not fewer than 100,
> but not the 5000+ we are expecting).  Placing an escape backslash before
> the brackets does not seem to help me:
>
> --- my test script below ---
>
> #!/usr/bin/perl
> use strict;
> use Bio::DB::GenBank;
> use Bio::DB::Query::GenBank;
>
> my $gb = Bio::DB::GenBank->new;
> my $query = Bio::DB::Query::GenBank->new
>         (-query => 'Mus\[Organism] AND exon NOT mRNA NOT cDNA',
>          -db => 'Nucleotide');
> my $seqio = $gb->get_Stream_by_query($query);
> my $numresults=0;
> while( my $seq = $seqio->next_seq ) { $numresults++; }
> print "Num results: $numresults\n";
>
> --
>
> :.-----.----------.----------.-----.:
>  T.D. Houfek
>  tdhoufek-AT-unity-DOT-ncsu-DOT-edu
>  Tobacco Genome Initiative
>  NCSU, Raleigh, NC 27606
> :.-----.----------.----------.-----.:
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

