[Bioperl-l] SeqIO profiling
Hilmar Lapp
hlapp@gnf.org
Mon, 16 Sep 2002 15:08:19 -0700
What surprises me is the high number of BEGIN calls. Me being
surprised reflects my unwillingness to read perl-guts docs. I
thought the BEGIN block is only executed when the module is
_loaded_, which supposedly happens only once during execution.
Evidently this is not true. Does anyone put into BEGIN blocks code
that should be executed whenever an object is instantiated or
something?
Aaron, if you replaced BEGIN by INIT in Root.pm and RootI.pm, would
that save any significant time?
-hilmar
On Thursday, September 12, 2002, at 12:40 PM, Aaron J Mackey wrote:
>
> here's some more fuel for the fire; this is a profiling run from a
> super-simple script that invokes a SeqIO::fasta on a 24K sequence
> protein
> database, and does nothing else but call next_seq ... 24K calls to
> next_seq, but 120K calls to _rearrange and BEGIN!!
>
> Total Elapsed Time = 30.84184 Seconds
> User Time = 30.51116 Seconds
> Exclusive Times
> %Time ExclSec CumulS #Calls sec/call Csec/c Name
> 23.0 7.040 6.980 119910 0.0001 0.0001
> Bio::Root::RootI::_rearrange
> 11.6 3.539 5.530 119914 0.0000 0.0000 Bio::Root::RootI::BEGIN
> 9.96 3.039 30.811 23982 0.0001 0.0013
> Bio::SeqIO::fasta::next_seq
> 8.81 2.689 12.976 47969 0.0001 0.0003 Bio::PrimarySeq::BEGIN
> 5.67 1.730 1.658 143892 0.0000 0.0000 Bio::Root::Root::verbose
> 5.21 1.589 20.041 23981 0.0001 0.0008 Bio::Seq::new
> 3.77 1.150 3.434 47962 0.0000 0.0001 Bio::PrimarySeq::seq
> 3.02 0.920 5.134 47962 0.0000 0.0001 Bio::SeqIO::alphabet
> 2.92 0.890 2.454 23981 0.0000 0.0001
> Bio::Annotation::TypeManager::new
> 2.88 0.880 4.514 23981 0.0000 0.0002
> Bio::Annotation::Collection::new
> 2.75 0.840 0.792 95924 0.0000 0.0000 Bio::PrimarySeq::alphabet
> 2.72 0.830 0.850 119917 0.0000 0.0000 Bio::Root::Root::BEGIN
> 2.65 0.810 21.299 23981 0.0000 0.0009
> Bio::Seq::SeqFactory::create
> 2.52 0.770 0.734 71943 0.0000 0.0000
> Bio::Annotation::TypeManager::_add
> _type_map
> 2.43 0.740 1.336 23981 0.0000 0.0001
> Bio::PrimarySeq::_guess_alphabet
>
> here's what the call trace looks like, for each $io->next_seq call:
>
> Bio::SeqIO::fasta::next_seq
> Bio::Root::IO::_readline
> File::Temp::BEGIN
> Bio::SeqIO::alphabet
> Bio::SeqIO::sequence_factory
> Bio::Seq::SeqFactory::create
> Bio::Seq::SeqFactory::type
> Bio::Root::Root::verbose
> Bio::Seq::new
> Bio::Root::RootI::new
> Bio::Root::Root::BEGIN
> Bio::Root::Root::verbose
> Bio::PrimarySeq::BEGIN
> Bio::Root::RootI::new
> Bio::Root::Root::BEGIN
> Bio::Root::Root::verbose
> Bio::Root::RootI::_rearrange
> Bio::PrimarySeq::seq
> Bio::Root::RootI::_rearrange
> Bio::PrimarySeq::validate_seq
> Bio::PrimarySeq::alphabet
> Bio::PrimarySeq::_guess_alphabet
> Bio::PrimarySeq::seq
> Bio::Root::RootI::_rearrange
> Bio::PrimarySeq::alphabet
> Bio::PrimarySeq::display_id
> Bio::PrimarySeq::primary_id
> Bio::PrimarySeq::desc
> Bio::Root::RootI::_rearrange
> Bio::Seq::primary_id
> Bio::Annotation::Collection::new
> Bio::Root::RootI::new
> Bio::Root::Root::BEGIN
> Bio::Root::Root::verbose
> Bio::Annotation::TypeManager::new
> Bio::Root::RootI::new
> Bio::Root::Root::BEGIN
> Bio::Root::Root::verbose
> Bio::Annotation::TypeManager::_add_type_map
> Bio::Annotation::TypeManager::_add_type_map
> Bio::Annotation::TypeManager::_add_type_map
> Bio::Annotation::Collection::_typemap
> Bio::Seq::annotation
> Bio::Seq::primary_seq
> UNIVERSAL::isa
> Bio::Seq::top_SeqFeatures
> Bio::Seq::alphabet
> Bio::Seq::primary_seq
> Bio::PrimarySeq::alphabet
> Bio::SeqIO::alphabet
> Bio::PrimarySeq::BEGIN
> Bio::Root::RootI::new
> Bio::Root::Root::BEGIN
> Bio::Root::Root::verbose
> Bio::Root::RootI::_rearrange
> Bio::PrimarySeq::alphabet
> Bio::Root::RootI::DESTROY
> Bio::Root::Root::_cleanup_methods
> Bio::Seq::DESTROY
> Bio::Root::RootI::DESTROY
> Bio::Root::Root::_cleanup_methods
> Bio::Root::RootI::DESTROY
> Bio::Root::Root::_cleanup_methods
> Bio::Root::RootI::DESTROY
> Bio::Root::Root::_cleanup_methods
>
>
> There's a fair amount of handling in Bio::Root::Root BEGIN that
> involves setting up Error usage; a $DONT_USE_ERROR in my test script
> improves things a little (down to 25 seconds instead of 30), but
> doesn't
> change much of the profile percentages.
>
> -Aaron
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
>
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------