[Bioperl-l] Bio::DB::GenBank batch retrieval question

Mick Watson michaelwatson@paradigm-therapeutics.co.uk
Wed, 24 Apr 2002 09:18:03 +0100


Hi Chris

I am not an expert on the code, but to my knowledge the BioPerl code in
question merely sends a request to NCBI's Entrez system and can only
work with what Entrez sends back.

If you go to Entrez
(http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?db=Nucleotide) and enter
all three accessions into the box at once, you will see that Entrez
returns no results. If you enter only the two correct accessions, you will
get results.

So the problem is not with BioPerl as such: it simply fires off a query
and returns the (lack of) results. BioPerl is merely mimicking the
behavior of Entrez.
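
Given that all-or-nothing behavior, the divide-and-conquer idea from your
message can be sketched roughly as below. This is only an illustration, not
anything BioPerl provides: fetch_tolerant and $stub_fetch are hypothetical
names, and the stub stands in for the real Entrez request so the logic can be
run offline. It assumes a failed batch comes back empty rather than throwing.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): bisect a batch until the
# accessions that make the whole request come back empty are isolated.
# $fetch is any code ref that takes an array ref of accessions and returns
# an array ref of records, which is empty when the batch fails.
sub fetch_tolerant {
    my ($ids, $fetch) = @_;
    return ([], []) unless @$ids;

    my $records = $fetch->($ids);
    return ($records, []) if @$records;     # whole batch succeeded
    return ([], [@$ids])  if @$ids == 1;    # a single ID failed, so it is bad

    # Split the list in half and retry each half separately.
    my $mid   = int(@$ids / 2);
    my @left  = @{$ids}[0 .. $mid - 1];
    my @right = @{$ids}[$mid .. $#$ids];
    my ($ok_l, $bad_l) = fetch_tolerant(\@left,  $fetch);
    my ($ok_r, $bad_r) = fetch_tolerant(\@right, $fetch);
    return ([@$ok_l, @$ok_r], [@$bad_l, @$bad_r]);
}

# Stand-in for the real request (an assumption for this sketch): it returns
# nothing if any accession in the batch is unknown, as described above.
my %known = map { $_ => 1 } qw(AB000095 AB000220);
my $calls = 0;
my $stub_fetch = sub {
    my ($ids) = @_;
    $calls++;
    return [] if grep { !$known{$_} } @$ids;
    return [@$ids];
};

my ($ok, $bad) = fetch_tolerant([qw(AB000095 AB000220 bogus)], $stub_fetch);
print "ok: @$ok\n";     # ok: AB000095 AB000220
print "bad: @$bad\n";   # bad: bogus
```

To use it for real, $fetch would wrap get_Stream_by_acc and collect the
results of next_seq into an array. A clean list still costs a single request,
but each bad accession adds roughly log2(n) extra requests, so this only pays
off when invalid IDs are rare.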

Thanks
Mick

CHALFANT_CHRIS_M@Lilly.com wrote:

> When I use Bio::DB::GenBank::get_Stream_by_acc to retrieve a set of
> accession numbers and include an invalid accession number in with several
> valid accession numbers, I get no sequences back from Entrez.  For
> example, the code below returns no output (though I get output if I remove
> the "bogus" accession).
>
> Is this the expected behavior or am I using the code incorrectly?  If this
> is the correct behavior, how would you suggest requesting a batch of
> GenBank records for a list which may include invalid (or missing)
> accession numbers?  I am considering a "divide-and-conquer strategy":
> splitting the list in half and recursively requesting each half until I
> find the offending ID, but I am really trying to minimize the HTTP
> requests.
>
> As an alternative, I considered using Bio::DB::EMBL, but this module seems
> to throw an exception ("MSG: EMBL stream with no ID. Not embl in my book")
> if the list includes invalid accessions.
>
> CODE:
>
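> use strict;
> use warnings;
> use Bio::DB::GenBank;
>
> # Two valid accessions plus one deliberately invalid one.
> my @accessions = qw(AB000095 AB000220 bogus);
> my $gb = Bio::DB::GenBank->new;
> my $seqio = $gb->get_Stream_by_acc(\@accessions);
>
> # Prints nothing while "bogus" is in the list.
> while (my $record = $seqio->next_seq) {
>   print $record->primary_id, "\n";
> }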
>
> Chris