[Bioperl-l] Bio::DB::Query::GenBank
Josh Lauricha
laurichj at bioinfo.ucr.edu
Tue May 6 13:46:35 EDT 2003
Using the following code (from the Bio::DB::Query::GenBank doc):
#!/usr/bin/perl -w
use Bio::DB::GenBank;
use Bio::SeqIO;
use strict;
my $gb = new Bio::DB::GenBank;
my $seqin = new Bio::SeqIO(-format => 'efa');
my $seqout = new Bio::SeqIO(-format => 'efa');
my $seqio = $gb->get_Stream_by_query('Oryza sativa[Organism] AND EST');
while( my $seq = $seqio->next_seq ) {
    print "seq length is ", $seq->length, "\n";
}
I get the following error:
Warning(s) from GenBank:
<FieldNotFound>Organism</FieldNotFound>
However, if I go to www.ncbi.nih.gov and type in:
'Oryza sativa[Organism] AND EST'
I get around 18k hits in both the nucleotide and protein databases.
I need to find more information about some sequences; in particular, I
hope to find their GI numbers. However, none of the data I have is
specific to the genes, so my thought was to search for the organism the
sequences come from, then compare the sequences from GenBank to the
ones I have.
I have:
1) An accession number that seems to be the GenBank LOCUS id, which
is not a valid search field.
2) A nickname similar to Swiss-Prot's, but not identical.
3) For most, a Swiss-Prot accession.
4) A description.
5) The sequence.
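
The comparison step described above could be sketched roughly as
follows. This is only an illustration using core Perl: the identifiers,
the sample data and the match_gis helper are all made up, and it assumes
an exact, case-insensitive match on the full sequence is good enough. In
real use the records would come from the $seqio->next_seq loop rather
than from a hard-coded list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical local data: my own identifiers mapped to their sequences.
my %my_seqs = (
    local_1 => 'ATGGCCATTG',
    local_2 => 'atgcccgggt',
);

# Hypothetical stand-in for records fetched from GenBank.
my @genbank_records = (
    { gi => 'gi_A', seq => 'ATGCCCGGGT' },
    { gi => 'gi_B', seq => 'TTTTTTTTTT' },
);

# Index the local sequences by upper-cased residue string so that each
# GenBank record can be checked with a single hash lookup.
my %id_by_seq;
for my $id (sort keys %my_seqs) {
    push @{ $id_by_seq{ uc $my_seqs{$id} } }, $id;
}

# Return a hash ref mapping each local id to the GIs of all records
# whose sequence is identical (ignoring case) to the local one.
sub match_gis {
    my ($index, $records) = @_;
    my %gi_for;
    for my $rec (@$records) {
        my $ids = $index->{ uc $rec->{seq} } or next;
        push @{ $gi_for{$_} }, $rec->{gi} for @$ids;
    }
    return \%gi_for;
}

my $gi_for = match_gis(\%id_by_seq, \@genbank_records);
for my $id (sort keys %$gi_for) {
    print "$id\t", join(',', @{ $gi_for->{$id} }), "\n";
}
```

With the sample data only local_2 finds a match. Exact matching would
miss entries that differ by even one residue, so this is only a first
pass before anything alignment-based.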
Any ideas on why the query didn't work or a better way to get the GIs
than just searching by hand?
Thanks,
Josh Lauricha