[Bioperl-l] Re: Bio::FeatureHolderI interface confusion
Lincoln Stein
lstein at cshl.edu
Wed Jun 18 19:17:17 EDT 2003
Hi,
Open Source projects are a bit weedy. You have to prune them back regularly
in order to avoid them spreading out of control. The original Bio::Seq*
classes grew out of one use case -- reading and writing flat files -- and
have become rather mangy as they've been repurposed to meet new needs.
I find Ewan's separation of bioperl clients into "users" and "developers" a
little artificial. We'd like to see our customers begin as users and
transition into developers as they gain experience. Well-designed and
documented code serves everyone equally well.
My experience 2 years ago in coming back to bioperl after having been away for
the better part of a decade was not that the interfaces were confusing or too
numerous, but that they were the *wrong* interfaces for my applications. For
example, I wanted to be able to work with sets of sequence features without
necessarily having an instantiated sequence string around. I wanted to be
able to perform coordinate arithmetic with features so as to locate one
feature relative to another. I wanted to be able to produce a graphical
rendering of a feature completely generically. None of this was possible at
the time. So I did the obvious thing, and created subclasses which
implemented the methods I needed, and as an afterthought created interfaces
to describe what I had done. Lo! A new weed was born.
The problem is that everyone has had the same experience, has extended the
library, and has added ad hoc and inconsistent interfaces. Now our garden is
overrun.
The solution is to go back to the use cases, figure out what types of problems
we want the modules to address, and then design a small number of interfaces
that do the job. We should either use Damian Conway's Class::Contract to
enforce our use of the interfaces at compile time, or use Paul's proposed
@ISA tree walker to regression test that each required method is implemented
(although this is harder to do than it looks, and I'd like to see how this
works).
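
For concreteness, here is a minimal sketch of what such an @ISA walker might look like. It is not Paul's actual code: the My::* packages and the %REQUIRED table are hypothetical, and it assumes each interface supplies stub methods that die unless overridden. It also shows one reason the check is harder than it looks: can() succeeds even when a class has only inherited the interface's stub, so the sketch compares code references rather than trusting can() alone.

```perl
use strict;
use warnings;

# Hypothetical interface: every method is a stub that dies unless overridden.
package My::FeatureHolderI;
sub get_features  { die "get_features not implemented\n" }
sub feature_count { die "feature_count not implemented\n" }

# Hypothetical implementation that forgets to override feature_count().
package My::Holder;
our @ISA = ('My::FeatureHolderI');
sub get_features { return () }

package main;

# Assumed registry: interface name => methods an implementer must provide.
my %REQUIRED = (
    'My::FeatureHolderI' => [qw(get_features feature_count)],
);

# Return every ancestor of $class by walking @ISA depth-first.
sub ancestors {
    my ($class, $seen) = @_;
    $seen ||= {};
    my @found;
    no strict 'refs';    # needed to reach @ISA through a symbolic name
    for my $parent (@{"${class}::ISA"}) {
        next if $seen->{$parent}++;
        push @found, $parent, ancestors($parent, $seen);
    }
    return @found;
}

# List "Interface::method" for each required method that $class cannot
# resolve at all, or resolves only to the interface's own stub.
sub missing_methods {
    my ($class) = @_;
    my @missing;
    for my $iface (grep { $REQUIRED{$_} } ancestors($class)) {
        for my $method (@{ $REQUIRED{$iface} }) {
            my $impl = $class->can($method);
            my $stub = $iface->can($method);
            push @missing, "${iface}::$method"
                if !$impl || ($stub && $impl == $stub);
        }
    }
    return @missing;
}

print "$_\n" for missing_methods('My::Holder');
```

A regression test could call missing_methods() on every class in the distribution and fail if the list is non-empty. What this does not check is whether an overriding method actually honors the interface's contract, only that it exists.
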
We started an informal redesign process a couple of months ago in a series of
e-mail exchanges with Paul, Ewan, Hilmar and Aaron and I think we made some
good progress towards the outlines of a "Bioperl 2." It hasn't gone very far
since then, and I guess the question is how public a process this type of
redesign should be, and how to manage the various competing needs?
Lincoln
On Wednesday 18 June 2003 03:26 pm, Paul Edlefsen wrote:
> I've been keeping silent on this (didja notice?), but as Ewan predicted, I
> have views here.
>
> The idea of protocols -- per-method contracts -- intrigues me; Perl offers
> the can() facility, which could be used here. I personally have not
> experienced any necessity for this, but I'm willing to believe that y'all
> have.
>
> Perl, as we are painfully aware, does not enforce contracts. In my
> experience as a Java developer and a bioperl developer I have come to
> appreciate the necessity of enforcing contracts; my interpretation of the
> complexity of bioperl -- the near impossibility of treating it as a
> component model -- is that a failure to enforce interface contracts is the
> principal stumbling point.
>
> It has been mentioned in this thread that complex interfaces are devised
> and then ignored. I suspect that the concept of enforcing these interfaces
> is terrifying at first blush: does that mean that I have to actually
> implement all of this crap? There are a couple of points to consider before
> dismissing enforcement, though: 1) if we used interfaces correctly then
> they would not be impossible to implement; 2) interface contracts may allow
> null-responses (eg. if a SeqFeatureI isa FeatureHolderI it does not
> *necessarily* contain subfeatures, but you can ask it how many subfeatures
> it has (it has 0); although FeatureHolderI presently asserts the
> often-un-supportable contract that FeatureHolderI implementers always
> accept the addition and removal of subfeatures, this is not a failure in
> Object Oriented Programming, it is a failure in our FeatureHolderI contract
> design).
>
> I have personally consolidated the FeatureHolderI variants, so I'm pretty
> familiar with this particular area of the bioperl library. I found that
> this contract is duplicated (eg. GFFI, DasI, FeatureHolderI,
> SeqFeature::CollectionI), unused (eg. CollectionI), and ignored (eg.
> gbrowse accesses all feature providers as if they are GFF.pm). On what we
> affectionately refer to as the 'freaky dev branch', branch-1-2-collection,
> I have unified these things into one interface, called
> Bio::SeqFeature::CollectionI (which inherits, for backwards compatibility,
> from FeatureHolderI). SeqFeatureI is capable of holding subfeatures, so on
> this branch Bio::SeqFeatureI isa Bio::SeqFeature::CollectionI. I have also
> made a version of gbrowse that uses this interface, as well as a data
> provision interface called Bio::DB::FeatureProviderI, that fetches feature
> collections from a backing store.
>
> I agree that the interfaces presented to novices should be few and simple
> and straightforward. I do not think, though, that the interfaces presented
> to programmers need be otherwise. I am not the best designer of
> interfaces, and those that I have designed might not be the best solution,
> but if we as a community can commit to the concept that an interface is an
> inviolable contract and that use of interfaces is prerequisite to
> component-oriented development, then the failures in an interface will lead
> not to its violation, ignorance, or duplication, but to its correction.
>
> I can see that the culture of bioinformatics software development is
> presently disinclined towards self-enforcement of software design
> contracts. We will not abandon bioperl, though; the only direction forward
> (IMO) is through some sort of refactoring. If we cannot rely on
> contributors to enforce interface contracts, could we perhaps enforce them
> through some software solution (in Java or C++ this is the compiler's job)?
> Like runnable synopses, could we not test *on checkin*, or at least in the
> test suite, that interface contracts are enforced?
>
> If anyone is interested, to this end I have created a very small number of
> initial interface tests, in the ti directory, on the freaky dev branch.
> The ultimate idea is that the ISA hierarchy will be climbed and anything
> claiming to support an interface will have to pass the test corresponding
> to that interface. This is, after all, what ISA means: it is safe to think
> of me as a BLAH. Why not test this?
>
> Okay, thanks for reading my rant.
>
> :Paul
--
========================================================================
Lincoln D. Stein Cold Spring Harbor Laboratory
lstein at cshl.org Cold Spring Harbor, NY
========================================================================