[BioSQL-l] Genbank loading time

Wed Jan 28 16:40:55 UTC 2009

On Wed, Jan 28, 2009 at 4:29 PM, Chris Fields <cjfields at illinois.edu> wrote:
>
> I don't think sequence loading via load_seqdatabase.pl uses BioPerl.  If one
> uses BioPerl and bioperl-db the following can explain at least some of the
> reason why loading is slow:
> http://www.bioperl.org/wiki/Why_BioPerl_is_slow
> We also go through the extra hand-wringing with Bio::Species objects
> (something I don't think the other Bio* projects worry about).

Looking at the source code for the load_seqdatabase.pl script included
with bioperl-db, my impression is that it uses Bio::DB::BioDB to talk to
the database and Bio::SeqIO to parse the input sequence files (in
this case via Bio::SeqIO::genbank).  See:

http://code.open-bio.org/svnweb/index.cgi/bioperl/view/bioperl-db/trunk/scripts/biosql/load_seqdatabase.pl

> Regardless, it's not an easy problem to work around.  There are such things
> as Moose, and Perl6 is now in alpha...

I'll take your word for it - I'm in no position to improve anyone's Perl code ;)

Peter