[Bioperl-l] Bio::DB::GenBank batch retrieval question
CHALFANT_CHRIS_M@Lilly.com
Tue, 23 Apr 2002 13:57:23 -0500
When I use Bio::DB::GenBank::get_Stream_by_acc to retrieve a set of
accession numbers and the set includes an invalid accession number
alongside several valid ones, I get no sequences back from Entrez. For
example, the code below returns no output (though I get output if I remove
the "bogus" accession).
Is this the expected behavior or am I using the code incorrectly? If this
is the correct behavior, how would you suggest requesting a batch of
GenBank records for a list which may include invalid (or missing)
accession numbers? I am considering a divide-and-conquer strategy:
splitting the list in half and recursively requesting each half until I
find the offending ID, but I would really like to minimize the number of
HTTP requests.
As an alternative, I considered using Bio::DB::EMBL, but this module seems
to throw an exception ("MSG: EMBL stream with no ID. Not embl in my book")
if the list includes invalid accessions.
CODE:
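use strict;
use warnings;
use Bio::DB::GenBank;

# "bogus" is deliberately invalid; the other two accessions are valid.
my @accessions = qw(AB000095 AB000220 bogus);
my $gb = Bio::DB::GenBank->new;
my $seqio = $gb->get_Stream_by_acc(\@accessions);
while (my $record = $seqio->next_seq) {
    print $record->primary_id, "\n";
}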
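
For reference, here is a rough sketch of the divide-and-conquer idea mentioned above. It assumes that a batch containing any invalid ID comes back completely empty, which is what I observe but have not confirmed as intended behavior. The names fetch_bisect and $mock_fetch are made up for illustration; $mock_fetch only imitates the all-or-nothing response so the sketch runs without network access. In real use the callback would presumably wrap get_Stream_by_acc and collect the results of next_seq.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl).
# $fetch: code ref taking an array ref of IDs; returns a list of records,
#         or an empty list if the batch as a whole fails.
# $bad:   array ref that collects the IDs found to be invalid.
sub fetch_bisect {
    my ($fetch, $ids, $bad) = @_;
    return () unless @$ids;

    my @records = $fetch->($ids);
    return @records if @records;      # whole batch succeeded

    if (@$ids == 1) {                 # a lone failing ID is the offender
        push @$bad, $ids->[0];
        return ();
    }

    # Split the failing batch in half and retry each half separately.
    my $mid = int(@$ids / 2);
    return (
        fetch_bisect($fetch, [ @$ids[0 .. $mid - 1] ], $bad),
        fetch_bisect($fetch, [ @$ids[$mid .. $#$ids] ], $bad),
    );
}

# Stand-in for the real request (assumption): returns the IDs themselves
# when all are known, and nothing at all if any ID is unknown.
my %known = map { $_ => 1 } qw(AB000095 AB000220);
my $mock_fetch = sub {
    my ($ids) = @_;
    return (grep { !$known{$_} } @$ids) ? () : @$ids;
};

my @bad;
my @got = fetch_bisect($mock_fetch, [qw(AB000095 AB000220 bogus)], \@bad);
print "fetched:  @got\n";
print "rejected: @bad\n";
```

A clean list costs one request this way, but every invalid ID adds roughly log2(N) extra requests, which is exactly the overhead I would like to avoid if there is a better approach.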
Chris