[Bioperl-l] Bio::DB::Query::GenBank retrieves fewer sequences than Webbrowser query
Chris Dwan (CCGB)
cdwan at mail.ahc.umn.edu
Wed Mar 24 17:37:09 EST 2004
I'll admit that I was totally unable to get the results I wanted with the
-query => 'whatever' option. I attribute that to my lack of clue WRT
Entrez. Since we're fortunate enough to have an in-house relational
GenBank database, I use SQL to get the list of accessions I want, and
then fetch the sequences like this:
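    use Bio::DB::GenBank;
    use Bio::SeqIO;

    # @accessions comes from the SQL query; $out is an already-open output filehandle
    my $gb        = Bio::DB::GenBank->new;
    my $seqstream = $gb->get_Stream_by_acc(\@accessions);
    my $seqIO     = Bio::SeqIO->new(-fh => $out, -format => 'Fasta');
    while (my $seq = $seqstream->next_seq) {
        # append the version so the FASTA ID reads ACCESSION.VERSION
        $seq->display_id($seq->display_id() . "." . $seq->version);
        $seqIO->write_seq($seq);
    }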
Which has given remarkably consistent results.
Not that this addresses the observed problem, but I suspect that it's on
the query end, not the sequence retrieval end.
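
One way to test that suspicion, as a rough sketch only: ask the query object how many hits Entrez reports and compare that with the number of records the stream actually hands back. This assumes your BioPerl version provides count() on Bio::DB::Query::GenBank and that 'nucleotide' is accepted as the -db value; describe_gap() and check_query() are names made up for this example, not part of BioPerl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn the two counts into a one-line verdict.
sub describe_gap {
    my ($reported, $retrieved) = @_;
    return "counts agree ($retrieved)" if $reported == $retrieved;
    return $retrieved < $reported
        ? "retrieval fell short: got $retrieved of $reported"
        : "retrieval returned extra records: got $retrieved, expected $reported";
}

# Hypothetical helper: run one Entrez query and compare the hit count
# it reports with the number of sequences actually streamed back.
sub check_query {
    my ($query_string) = @_;

    # Loaded here so the helper above can be used without BioPerl installed.
    require Bio::DB::GenBank;
    require Bio::DB::Query::GenBank;

    my $query = Bio::DB::Query::GenBank->new(
        -db    => 'nucleotide',     # assumption: database name as Entrez expects it
        -query => $query_string,
    );
    my $reported = $query->count;   # hits according to the query itself

    my $stream    = Bio::DB::GenBank->new->get_Stream_by_query($query);
    my $retrieved = 0;
    $retrieved++ while $stream->next_seq;

    return describe_gap($reported, $retrieved);
}

# Only contacts NCBI when a query string is given on the command line.
print check_query($ARGV[0]), "\n" if @ARGV;
```

If the reported count already disagrees with what the web interface shows, the query string is the thing to look at; if the count matches the web but the stream comes up short, the retrieval side is the suspect.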
-Chris Dwan
> Just for kicks I tried to duplicate the problem. I get the same number
> of sequences from NCBI's web sequin tool as Jürgen, but using the
> Bio::DB::Query::GenBank method I get 644 sequences (not fewer than 100,
> but not the 5000+ we are expecting). Placing an escape backslash before
> the brackets does not seem to help me:
>
> --- my test script below ---
>
> #!/usr/bin/perl
> use strict;
> use Bio::DB::GenBank;
> use Bio::DB::Query::GenBank;
>
> my $gb = Bio::DB::GenBank->new;
> my $query = Bio::DB::Query::GenBank->new
> (-query => 'Mus\[Organism] AND exon NOT mRNA NOT cDNA',
> -db => 'Nucleotide');
> my $seqio = $gb->get_Stream_by_query($query);
> my $numresults=0;
> while( my $seq = $seqio->next_seq ) { $numresults++; }
> print "Num results: $numresults\n";
>
> --
>
> :.-----.----------.----------.-----.:
> T.D. Houfek
> tdhoufek-AT-unity-DOT-ncsu-DOT-edu
> Tobacco Genome Initiative
> NCSU, Raleigh, NC 27606
> :.-----.----------.----------.-----.: