[Bioperl-l] Bio::DB::Query::GenBank

Josh Lauricha laurichj at bioinfo.ucr.edu
Tue May 6 13:46:35 EDT 2003


Using the following code (from the Bio::DB::Query::GenBank doc):
#!/usr/bin/perl -w
use Bio::DB::GenBank;
use Bio::SeqIO;
use strict;

my $gb     = new Bio::DB::GenBank;
my $seqin  = new Bio::SeqIO(-format => 'efa');
my $seqout = new Bio::SeqIO(-format => 'efa');

my $seqio = $gb->get_Stream_by_query('Oryza sativa[Organism] AND EST');

while ( my $seq = $seqio->next_seq ) {
    print "seq length is ", $seq->length, "\n";
}

I get the following error:
Warning(s) from GenBank: 
                <FieldNotFound>Organism</FieldNotFound>

However, if I go to www.ncbi.nih.gov and type in:
'Oryza sativa[Organism] AND EST'
I get around 18k hits in both the nucleotide and protein databases.
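
For comparison, here is an untested sketch of the same search built
through an explicit Bio::DB::Query::GenBank object, with the database
named up front. The -db value 'nucleotide' is an assumption about which
database the web search uses, and nothing here is confirmed to avoid the
FieldNotFound warning; it only makes the target database explicit.

```perl
#!/usr/bin/perl -w
use strict;
use Bio::DB::GenBank;
use Bio::DB::Query::GenBank;

# Build the Entrez query as an object instead of passing a bare string.
# ASSUMPTION: 'nucleotide' is the database the web search above queried.
my $query = Bio::DB::Query::GenBank->new(
    -db    => 'nucleotide',
    -query => 'Oryza sativa[Organism] AND EST',
);

# Report how many records matched before fetching anything.
print "hits: ", $query->count, "\n";

# Stream the matching records and print one line per sequence.
my $gb    = Bio::DB::GenBank->new;
my $seqio = $gb->get_Stream_by_query($query);
while ( my $seq = $seqio->next_seq ) {
    print $seq->accession_number, "\t", $seq->length, "\n";
}
```

With around 18k hits this would fetch a lot of data, so checking the
count first and narrowing the query is probably worthwhile.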

I need to find more information about some sequences; in particular, I
hope to find their GI numbers. However, none of the data I have is
specific to the genes, so my plan was to search by the name of the
organism the sequences come from, then compare the sequences from
GenBank to the ones I have.

I have:
1) An accession number that seems to be the GenBank LOCUS id, which
   is not a valid search field.
2) A nickname similar to SWProt's, but not identical.
3) For most, a SWProt accession.
4) A description.
5) The sequence.
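
The comparison step described above (matching GenBank sequences against
the ones I already have) could be sketched roughly as follows. The
helper name match_by_sequence, the record layout, and the sample names
and sequences are all made up for illustration; it only finds exact,
case-insensitive matches of the full sequence string.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: index local sequences by their residue string,
# then report which retrieved records are an exact match.
#   $local   : hash ref, { local_name => sequence_string }
#   $records : array ref of { id => ..., seq => ... }
# Returns a hash ref, { local_name => [ matching record ids ] }.
sub match_by_sequence {
    my ($local, $records) = @_;

    # Several local names may share one sequence, so keep a list per key.
    my %by_seq;
    for my $name (keys %$local) {
        push @{ $by_seq{ uc $local->{$name} } }, $name;
    }

    my %hits;
    for my $rec (@$records) {
        my $names = $by_seq{ uc $rec->{seq} } or next;
        push @{ $hits{$_} }, $rec->{id} for @$names;
    }
    return \%hits;
}

# Placeholder data only; these are not real identifiers or sequences.
my %local   = ( seqA => 'ATGGCC', seqB => 'ttgaca' );
my @records = (
    { id => 'rec1', seq => 'atggcc' },
    { id => 'rec2', seq => 'GGGGGG' },
);

my $hits = match_by_sequence( \%local, \@records );
print "$_ => @{ $hits->{$_} }\n" for sort keys %$hits;
```

In practice the records would be filled from the stream, e.g. with
$seq->seq for the sequence and $seq->primary_id for the id, assuming
primary_id carries the GI for GenBank records.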

Any ideas on why the query didn't work or a better way to get the GIs
than just searching by hand?

Thanks,
Josh Lauricha







