[Bioperl-l] SeqIO parsing
Ewan Birney
birney@ebi.ac.uk
Tue, 24 Sep 2002 17:01:37 +0100 (BST)
I have been getting down into the depths of the parsing, and we are
horrendously slow on the object creation - there are two main reasons:
(a) a somewhat tortuous path of object creation, which *always*
travels through at least three functions to build a blessed hash (before
you have even got to the object-specific parts). I believe this can be
slimmed down by:
(i) Assuming the object's new function is supplied by the
implementation hierarchy, and not the interface, getting rid of the jump
through RootI. RootI's new() would now behave like RootI's
_create_object()
(ii) Remove the _create_object line in Root.pm - assuming that
people who want to make a custom object would inherit from RootI and
implement ->verbose() and ->new() as they like.
(iii) To prevent heinous errors of RootI compliance without verbose
being overridden, put in a default implementation of verbose that
returns 0 and issues a warning.
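
As a rough sketch of (i)-(iii): the package names My::RootI, My::Root and My::Custom below are made up for illustration and are not the real Bio::Root code. The point is only that new() blesses the hash in a single call, and that the interface carries a default verbose() which returns 0 and warns.

```perl
use strict;
use warnings;

# Hypothetical interface package (stands in for RootI in this sketch).
package My::RootI;

# Default verbose(): returns 0 and warns, so a class that forgets to
# override it still behaves sanely.
sub verbose {
    my ($self) = @_;
    warn ref($self) . " did not override verbose(); defaulting to 0\n";
    return 0;
}

# Hypothetical implementation base (stands in for Root in this sketch).
package My::Root;
our @ISA = ('My::RootI');

# new() blesses the hash directly: one function call, no detour through
# the interface and no separate _create_object().
sub new {
    my ($class, %args) = @_;
    my $self = bless {}, ref($class) || $class;
    $self->{verbose} = $args{-verbose} || 0;
    return $self;
}

# Implementation-level verbose(): a plain getter/setter.
sub verbose {
    my ($self, $value) = @_;
    $self->{verbose} = $value if defined $value;
    return $self->{verbose};
}

# A custom object that inherits only from the interface, supplies its
# own new(), and forgets to implement verbose().
package My::Custom;
our @ISA = ('My::RootI');

sub new { return bless {}, shift }

package main;

my $obj = My::Root->new(-verbose => 1);
print "Root verbose: ", $obj->verbose, "\n";

my $custom = My::Custom->new;
my $level  = $custom->verbose;    # falls back to My::RootI::verbose
print "Custom verbose: $level\n";
```

The -verbose argument name is just borrowed from the usual named-parameter style; the real constructor would of course handle more than that.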
This scheme in my mind has one *SERIOUS* gotcha. People *have* to write
their @ISA's with their implementation tree *first* and their interface
inheritance second. Is this ok with people?
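
To make the @ISA ordering gotcha concrete, here is a small sketch. All package names are hypothetical, and the interface's new() is made to die purely so the effect is visible; that is an assumption of the sketch, not a statement about what RootI's new() would do. Perl searches @ISA left to right, depth first, so whichever parent is listed first supplies new().

```perl
use strict;
use warnings;

package My::SeqI;       # hypothetical interface
# Assumption for this sketch only: the interface's new() is not usable.
sub new { die "interface new() reached\n" }

package My::Root;       # hypothetical implementation base
sub new { return bless {}, shift }

package My::GoodSeq;    # implementation tree first, interface second
our @ISA = ('My::Root', 'My::SeqI');

package My::BadSeq;     # interface first: its new() is found first
our @ISA = ('My::SeqI', 'My::Root');

package main;

# Resolves to My::Root::new and returns a blessed hash.
my $good = My::GoodSeq->new;
print "good object is a ", ref($good), "\n";

# Resolves to My::SeqI::new, which dies in this sketch.
my $bad = eval { My::BadSeq->new };
print "bad ordering failed: $@" unless defined $bad;
```

The same check can be done without constructing anything by comparing My::GoodSeq->can('new') against \&My::Root::new.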
The nice thing about (a) is that it should give speed-ups across the
entire system, not just SeqIO.
Jason/Hilmar - is there a hidden gotcha here somewhere?
(b) Making a new Bio::Seq::SeqFactory with privileged access to
functions in Bio::Seq and Bio::PrimarySeq to make fast access objects, e.g.
not going through a second alphabet on setting seq.
(b) is SeqIO specific so I want to do this second.
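
A minimal sketch of the idea in (b), with invented names (My::PrimarySeq, My::SeqFactory, create) and an invented validation rule; this is not the actual Bio::Seq or Bio::PrimarySeq interface. The public setter checks the sequence on every call, while the factory fills the hash directly on the assumption that the parser already trusts its input.

```perl
use strict;
use warnings;

package My::PrimarySeq;    # hypothetical stand-in for a sequence class

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    $self->seq($args{-seq}) if defined $args{-seq};
    return $self;
}

# Public setter: validates the sequence every time it is set.
sub seq {
    my ($self, $value) = @_;
    if (defined $value) {
        die "invalid characters in sequence\n"
            unless $value =~ /^[A-Za-z*.-]+$/;
        $self->{seq} = $value;
    }
    return $self->{seq};
}

sub alphabet {
    my ($self, $value) = @_;
    $self->{alphabet} = $value if defined $value;
    return $self->{alphabet};
}

package My::SeqFactory;    # hypothetical factory with privileged access

# Fast path: builds the hash directly and blesses it, skipping the
# setters and their checks. Only safe when the caller has already
# validated (or trusts) the data.
sub create {
    my ($class, %args) = @_;
    return bless {
        seq      => $args{-seq},
        alphabet => $args{-alphabet},
    }, 'My::PrimarySeq';
}

package main;

my $slow = My::PrimarySeq->new(-seq => 'ACGT');
my $fast = My::SeqFactory->create(-seq => 'ACGT', -alphabet => 'dna');
print "slow: ", $slow->seq, "\n";
print "fast: ", $fast->seq, " (", $fast->alphabet, ")\n";
```

The trade-off is that the factory has to know the object's internal hash layout, which is why it would need privileged access rather than going through the public methods.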
BTW - I think I can cut the object creation time by a factor of 6 in my
tests if I get this written right ;)
-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>.
-----------------------------------------------------------------