[Bioperl-l] Re: Automatic generation of set and get methods

Chris Mungall cjm@fruitfly.org
Fri, 15 Nov 2002 11:25:02 -0800 (PST)


Where is the evidence that autoload leads to problems sooner or later?
Sure, some hacked-together code with a bit of crazy autoload magic tossed
in at the last moment is going to drive you insane, but it seems to me
that autoload working off some well-designed schema language will
naturally lead to better code.

here are some of the advantages:

* typing - how many of the current hand-coded methods do $arg->isa()
checks to check for type? strong typing is considered good according to
the prevailing s/w engineering paradigm. are people seriously advocating
hand-coding explicit argument checks?

* invariants - really just an extension of typing. for example, making
sure start < end, 0 <= phase <= 2, checking that sequence residues make
sense, etc. These invariants would be specified in the bioperl schema
(which the autoloads or autogenerated funcs would read from) making the
semantics of the bioperl object model *much* easier to understand.

* interface checking - does anyone really like the ModuleI way of doing
things?

* javadoc style document generation. I really love pod for quick
documentation, but it wasn't really meant for documenting bioperl style
object models. even with a battery of emacs macros or functions, surely we
can all agree that writing OO pod docs is totally bonkers?

* better language independence - cleaner separation of data from
behaviour, easier to roundtrip bioperl to a native xml format, perhaps
even the possibility of sharing the same schema between open-bio projects?

* better OO mechanisms - for instance, allowing mixins (see Nat Goodman's
previous email)
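
To make the typing point concrete, here is a minimal sketch of an AUTOLOAD
accessor that reads a type table and does the isa() check in one place.
Everything in it is an assumption made up for illustration: My::Feature,
My::Location and the %SCHEMA layout are hypothetical and are not bioperl
classes or an existing schema format.

```perl
package My::Feature;    # hypothetical class, not part of bioperl
use strict;
use warnings;
use Carp qw(croak);
use Scalar::Util qw(blessed);

# Hypothetical schema: attribute name => required class ('' means untyped)
my %SCHEMA = (
    name     => '',
    location => 'My::Location',
);

our $AUTOLOAD;

sub new { return bless {}, shift }

sub AUTOLOAD {
    my $self = shift;
    (my $attr = $AUTOLOAD) =~ s/.*:://;    # strip the package name
    return if $attr eq 'DESTROY';          # never autoload the destructor
    croak "No such attribute '$attr'" unless exists $SCHEMA{$attr};
    if (@_) {
        my $value = shift;
        my $class = $SCHEMA{$attr};
        # the single, schema-driven type check
        croak "$attr must be a $class"
            if $class && !(blessed($value) && $value->isa($class));
        $self->{$attr} = $value;
    }
    return $self->{$attr};
}

package My::Location;   # hypothetical class
sub new { return bless {}, shift }

package main;
my $f = My::Feature->new;
$f->name('exon1');
$f->location(My::Location->new);
# A plain string is rejected because it is not a My::Location object.
my $ok = eval { $f->location('chr1:1-100'); 1 };
```

The check lives in one sub instead of being repeated (or forgotten) in
every hand-written setter.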

perl is a great language - use it! perl that slavishly mimics java is
going to be an even worse language than java - and that's a bad language.
everyone who is anti-autoload/func sounds like they'd be happier
programming in a java straitjacket.

given that auto generated functions or autoloaded methods working off of a
sensible schema language is obviously the way to go, what schema language
do we use?

IDL? Nah, no need for struct/interface distinction. besides, it's dead.

UML? God, please, no.

Class::Contract - Damian Conway's module. Very nice, eiffel inspired
(better to imitate a well designed OO language than a mediocre one like
java)

Some ontology language (eg OWL). Very nice for modeling but not meant for
software engineering. eg no concept of interfaces.

XML schema - has some advantages.

relational - doesn't gel well with OO

Personally I think we should roll our own. I think it needs to be
extensible, so we can start off with basic class + attribute lists:

interface PrimarySeq extends RootI {
  seq;
  length;
}

then extend it with types:

interface PrimarySeq extends RootI {
  string seq;
  int length;
}

then extend it with invariants:

interface PrimarySeq extends RootI {
  string seq
    CHECK lang=perl {!tr/a-zA-Z\*//cd};
  int length
    CHECK lang=perl {!defined $self->seq || $_ == length($self->seq)};
}

(of course the invariants could be dropped to avoid the performance drop)
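
As a rough sketch of what generated functions could look like for the schema
above, the following builds accessors from an in-memory table and installs
them into the package. The %SCHEMA hash, the %TYPE_CHECK table, the
$CHECK_INVARIANTS switch and the My::PrimarySeq package are all assumptions
for illustration; no parser for the schema syntax is shown, and this is not
the real Bio::PrimarySeq.

```perl
package My::PrimarySeq;   # hypothetical stand-in, not the real Bio::PrimarySeq
use strict;
use warnings;
use Carp qw(croak);

our $CHECK_INVARIANTS = 1;   # set to 0 to skip the checks and their overhead

# Hypothetical in-memory form of the schema. Each check receives
# ($self, $value) and returns true if the value is acceptable.
my %SCHEMA = (
    seq => {
        type  => 'string',
        check => sub { $_[1] !~ /[^a-zA-Z*]/ },
    },
    length => {
        type  => 'int',
        check => sub {
            !defined $_[0]->seq || $_[1] == CORE::length($_[0]->seq);
        },
    },
);

my %TYPE_CHECK = (
    string => sub { defined $_[0] && !ref $_[0] },
    int    => sub { defined $_[0] && $_[0] =~ /^-?\d+\z/ },
);

# Generate one combined get/set method per attribute.
for my $attr (keys %SCHEMA) {
    my $spec     = $SCHEMA{$attr};
    my $accessor = sub {
        my $self = shift;
        if (@_) {
            my $value = shift;
            if ($CHECK_INVARIANTS) {
                croak "$attr: expected $spec->{type}"
                    unless $TYPE_CHECK{ $spec->{type} }->($value);
                croak "$attr: invariant violated"
                    unless $spec->{check}->($self, $value);
            }
            $self->{$attr} = $value;
        }
        return $self->{$attr};
    };
    no strict 'refs';
    *{ __PACKAGE__ . "::$attr" } = $accessor;   # install into the symbol table
}

sub new { return bless {}, shift }

package main;
my $seq = My::PrimarySeq->new;
$seq->seq('ATGC*');
$seq->length(5);
my $bad_seq = eval { $seq->seq('ATG-C'); 1 };   # '-' is not an allowed residue
my $bad_len = eval { $seq->length(7);    1 };   # disagrees with length of seq
```

Unlike AUTOLOAD, the methods exist as real subs after the loop runs, so
can() works and there is no per-call dispatch cost beyond the checks.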

I also think incorporating elements of ontology languages / description
logics is a nice idea too, such as restrictions. For example, an object
$sf of class Bio::SeqFeature could know it also belonged to the class
Bio::SeqFeature::Gene based on the value of the $sf->type() attribute.
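
A minimal sketch of such a restriction, assuming a hypothetical
%RESTRICTION table and a hypothetical belongs_to() method (neither exists
in bioperl, and My::SeqFeature is a stand-in for the real class):

```perl
package My::SeqFeature;   # hypothetical; not the real Bio::SeqFeature API
use strict;
use warnings;

# Hypothetical restriction table: class name => predicate over the object
my %RESTRICTION = (
    'My::SeqFeature::Gene' => sub {
        defined $_[0]->type && $_[0]->type eq 'gene';
    },
);

sub new {
    my ($class, %args) = @_;
    return bless { type => $args{type} }, $class;
}

sub type {
    my $self = shift;
    $self->{type} = shift if @_;
    return $self->{type};
}

# True if the object is in $class by inheritance, or if it satisfies
# the restriction declared for $class.
sub belongs_to {
    my ($self, $class) = @_;
    return 1 if $self->isa($class);
    my $test = $RESTRICTION{$class};
    return ($test && $test->($self)) ? 1 : 0;
}

package main;
my $sf    = My::SeqFeature->new(type => 'gene');
my $other = My::SeqFeature->new(type => 'exon');
my $sf_is_gene    = $sf->belongs_to('My::SeqFeature::Gene');
my $other_is_gene = $other->belongs_to('My::SeqFeature::Gene');
```

Membership here is computed on demand, so it follows the attribute: calling
$other->type('gene') later makes $other a member too.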

Another sensible feature of description logics is that associations are
entities in their own right.

for instance, in an OO language you generally have an object graph of
directed arcs A->B (for instance $seq_feature->seq). with complex object
models, it becomes a pain to keep the graph non-cyclical. cycles will lead
to memory leaks in perl (it uses reference counting), and even if this
weren't the case you have to have an extra infrastructure of code to make
sure that your A<->B bidirectional links are consistent.

With a DL you can declare inverse properties, which means if you have A->B
you can have B->A too.

This is easy to implement without memory leaks in perl (you do pay a price
in performance though).
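
One possible way to do this, sketched under the assumption that weak
references (Scalar::Util::weaken) are acceptable: a single method sets both
directions of the link and weakens the back-pointer. This is only one
technique and may not be the one intended above. My::Seq, My::Feature and
add_feature() are hypothetical names for illustration.

```perl
use strict;
use warnings;
use Scalar::Util ();

package My::Seq;        # hypothetical classes for illustration only
our $destroyed = 0;     # counts destroyed sequences, to show nothing leaks

sub new      { return bless { features => [] }, shift }
sub features { return @{ $_[0]{features} } }
sub DESTROY  { $destroyed++ }

# Setting seq -> feature also sets the inverse feature -> seq, in one
# place, so the two directions cannot get out of sync.
sub add_feature {
    my ($self, $feature) = @_;
    push @{ $self->{features} }, $feature;    # strong: the sequence owns its features
    $feature->{seq} = $self;
    Scalar::Util::weaken($feature->{seq});    # weak: breaks the reference cycle
    return $feature;
}

package My::Feature;
sub new { return bless {}, shift }
sub seq { return $_[0]{seq} }

package main;
my $linked;
{
    my $seq     = My::Seq->new;
    my $feature = $seq->add_feature(My::Feature->new);
    $linked = ($feature->seq == $seq);        # the inverse link is in place
}
# Both lexicals are gone here; the weak back-pointer did not keep the
# sequence alive, so its DESTROY has already run.
```

The cost is that $feature->seq becomes undef once the sequence is freed, so
callers that keep only a feature must also keep the sequence alive.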

Maybe this is over the top, but I really think we need a better way of
specifying the bioperl object model if it's to make serious inroads into
biological modeling.

On Fri, 15 Nov 2002, Ewan Birney wrote:

> On Fri, 15 Nov 2002, Heikki Lehvaslaiho wrote:
>
> > Hilmar is here restating a long standing opinion of the core group that
> > autogenerating methods tends to lead into problems sooner or later in
> > the life cycle of modules and that their use is therefore strongly
> > discouraged.
>
> Or at least stick-in-the-muds like me and Hilmar and Heikki. I suspect
> Lincoln is ok with it.
>
>
> This is a "to taste" feature of Perl, and I guess we just have to decide.
> I don't like it because (in my experience) it produces hard-to-find
> bugs which can remain dormant for a while, but... again, I think it is
> just about a general feel of what we would like to do.
>
>
> I certainly don't see that much sense in destabilising massive amounts of
> existing code (and *definitely* not as we get close to 1.2).
>