[Bioperl-l] SeqIO profiling

Hilmar Lapp hlapp@gnf.org
Mon, 16 Sep 2002 15:08:19 -0700


What surprises me is the high number of BEGIN calls. Me being 
surprised reflects my unwillingness to read perl-guts docs. I 
thought the BEGIN block is only executed when the module is 
_loaded_, which supposedly happens only once during execution.

Evidently this is not true. Does anyone put into BEGIN blocks code 
that should be executed whenever an object is instantiated or 
something?

Aaron, if you replaced BEGIN by INIT in Root.pm and RootI.pm, would 
that save any significant time?

	-hilmar

On Thursday, September 12, 2002, at 12:40 PM, Aaron J Mackey wrote:

>
> here's some more fuel for the fire; this is a profiling run from a
> super-simple script that invokes a SeqIO::fasta on a 24K sequence 
> protein
> database, and does nothing else but call next_seq ... 24K calls to
> next_seq, but 120K calls to _rearrange and BEGIN!!
>
> Total Elapsed Time = 30.84184 Seconds
>          User Time = 30.51116 Seconds
> Exclusive Times
> %Time ExclSec CumulS #Calls sec/call Csec/c  Name
>  23.0   7.040  6.980 119910   0.0001 0.0001  
> Bio::Root::RootI::_rearrange
>  11.6   3.539  5.530 119914   0.0000 0.0000  Bio::Root::RootI::BEGIN
>  9.96   3.039 30.811  23982   0.0001 0.0013  
> Bio::SeqIO::fasta::next_seq
>  8.81   2.689 12.976  47969   0.0001 0.0003  Bio::PrimarySeq::BEGIN
>  5.67   1.730  1.658 143892   0.0000 0.0000  Bio::Root::Root::verbose
>  5.21   1.589 20.041  23981   0.0001 0.0008  Bio::Seq::new
>  3.77   1.150  3.434  47962   0.0000 0.0001  Bio::PrimarySeq::seq
>  3.02   0.920  5.134  47962   0.0000 0.0001  Bio::SeqIO::alphabet
>  2.92   0.890  2.454  23981   0.0000 0.0001  
> Bio::Annotation::TypeManager::new
>  2.88   0.880  4.514  23981   0.0000 0.0002  
> Bio::Annotation::Collection::new
>  2.75   0.840  0.792  95924   0.0000 0.0000  Bio::PrimarySeq::alphabet
>  2.72   0.830  0.850 119917   0.0000 0.0000  Bio::Root::Root::BEGIN
>  2.65   0.810 21.299  23981   0.0000 0.0009  
> Bio::Seq::SeqFactory::create
>  2.52   0.770  0.734  71943   0.0000 0.0000  
> Bio::Annotation::TypeManager::_add
>                                              _type_map
>  2.43   0.740  1.336  23981   0.0000 0.0001  
> Bio::PrimarySeq::_guess_alphabet
>
> here's what the call trace looks like, for each $io->next_seq call:
>
> Bio::SeqIO::fasta::next_seq
>    Bio::Root::IO::_readline
>       File::Temp::BEGIN
>    Bio::SeqIO::alphabet
>    Bio::SeqIO::sequence_factory
>    Bio::Seq::SeqFactory::create
>       Bio::Seq::SeqFactory::type
>       Bio::Root::Root::verbose
>       Bio::Seq::new
>          Bio::Root::RootI::new
>             Bio::Root::Root::BEGIN
>             Bio::Root::Root::verbose
>          Bio::PrimarySeq::BEGIN
>             Bio::Root::RootI::new
>                Bio::Root::Root::BEGIN
>                Bio::Root::Root::verbose
>             Bio::Root::RootI::_rearrange
>             Bio::PrimarySeq::seq
>                Bio::Root::RootI::_rearrange
>                Bio::PrimarySeq::validate_seq
>                Bio::PrimarySeq::alphabet
>                Bio::PrimarySeq::_guess_alphabet
>                   Bio::PrimarySeq::seq
>                      Bio::Root::RootI::_rearrange
>                   Bio::PrimarySeq::alphabet
>             Bio::PrimarySeq::display_id
>             Bio::PrimarySeq::primary_id
>             Bio::PrimarySeq::desc
>          Bio::Root::RootI::_rearrange
>          Bio::Seq::primary_id
>          Bio::Annotation::Collection::new
>             Bio::Root::RootI::new
>                Bio::Root::Root::BEGIN
>                Bio::Root::Root::verbose
>             Bio::Annotation::TypeManager::new
>                Bio::Root::RootI::new
>                   Bio::Root::Root::BEGIN
>                   Bio::Root::Root::verbose
>                Bio::Annotation::TypeManager::_add_type_map
>                Bio::Annotation::TypeManager::_add_type_map
>                Bio::Annotation::TypeManager::_add_type_map
>             Bio::Annotation::Collection::_typemap
>          Bio::Seq::annotation
>          Bio::Seq::primary_seq
>             UNIVERSAL::isa
>             Bio::Seq::top_SeqFeatures
>    Bio::Seq::alphabet
>       Bio::Seq::primary_seq
>       Bio::PrimarySeq::alphabet
>    Bio::SeqIO::alphabet
>       Bio::PrimarySeq::BEGIN
>          Bio::Root::RootI::new
>             Bio::Root::Root::BEGIN
>             Bio::Root::Root::verbose
>          Bio::Root::RootI::_rearrange
>          Bio::PrimarySeq::alphabet
>       Bio::Root::RootI::DESTROY
>          Bio::Root::Root::_cleanup_methods
> Bio::Seq::DESTROY
> Bio::Root::RootI::DESTROY
>    Bio::Root::Root::_cleanup_methods
> Bio::Root::RootI::DESTROY
>    Bio::Root::Root::_cleanup_methods
> Bio::Root::RootI::DESTROY
>    Bio::Root::Root::_cleanup_methods
>
>
> There's a fair amount of handling in Bio::Root::Root BEGIN that
> involves setting up Error usage; a $DONT_USE_ERROR in my test script
> improves things a little (down to 25 seconds instead of 30), but 
> doesn't
> change much of the profile percentages.
>
> -Aaron
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
>
--
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------