[Bioperl-l] Bio::Seq::GenEMBLI proposal

Ewan Birney birney@ebi.ac.uk
Mon, 18 Dec 2000 09:05:59 +0000 (GMT)


On Sun, 17 Dec 2000, Hilmar Lapp wrote:

> Ewan Birney wrote:
> > 
> > The proposal is an interface called
> > 
> > Bio::Seq::GenEMBLI
> > 
> > and an implementation Bio::Seq::GenEMBL. (interface allows other people to
> > comply with the interface without using the same implementation. A "good
> > thing" tm, in particular for database implementors).
> > 
> 
> Just a remark: why can't I comply with a module's API by just
> subclassing it and overriding all its methods? Why do I need an
> implementation-less interface for this? (I thought the interface-hype
> was initiated to justify Java's lack of multiple inheritance.)


Because in general an implementation has many more methods (in particular
the set methods, such as "add_date") than an interface. Interfaces are more
likely to be read-only.


Inside Ensembl and other projects, like bioperl-corba-client, we basically
comply with the interfaces and do not override the
implementation. Overriding implementations, in my view,


	(a) makes the code less clear (i.e., someone has to figure out that
you really have overridden each method) and

	(b) gives ample opportunity for unintentional screwups when the
implementations change, e.g., adding a function to the implementation that
assumes the object is implemented as a hash can produce a *segmentation
fault* for some implementations (yikes!)
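
A minimal sketch of points (a) and (b), not the proposed code itself. The
package names My::GenEMBLI and My::ArraySeq and the helper
count_dates_assuming_hash are hypothetical; only the names each_date and
add_date are borrowed from the discussion. The interface carries just the
read-only method, the implementation is free to store its data as an array,
and a helper that assumes a hash breaks on it (in plain Perl this shows up
as a runtime error rather than a crash).

```perl
use strict;
use warnings;

# Hypothetical interface: read-only accessors, no storage assumptions.
package My::GenEMBLI;
sub each_date {
    my $self = shift;
    die ref($self) . " must implement each_date\n";
}

# One implementation, stored as an array rather than a hash.
package My::ArraySeq;
our @ISA = ('My::GenEMBLI');
sub new       { my ($class, @dates) = @_; return bless [ [@dates] ], $class }
sub each_date { my $self = shift; return @{ $self->[0] } }
sub add_date  { my ($self, @dates) = @_; push @{ $self->[0] }, @dates; return }

package main;

# A helper that wrongly assumes every implementation is a hash.
sub count_dates_assuming_hash {
    my $seq = shift;
    return scalar @{ $seq->{dates} };
}

my $seq = My::ArraySeq->new('18-DEC-2000');
$seq->add_date('19-DEC-2000');

print "complies\n" if $seq->isa('My::GenEMBLI');
print join(', ', $seq->each_date), "\n";

# The hash assumption fails on the array-backed object.
my $ok  = eval { count_dates_assuming_hash($seq); 1 };
my $err = $ok ? '' : $@;
print "hash assumption failed: $err" unless $ok;
```

Code written against the interface only ever calls each_date, so it keeps
working whichever way the object is stored.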

> 
> > At the moment I have just taken what is in the Bio::Seq object and moved
> > it into its own interface, written below.
> > 
> > Decisions:
> > 
> >   (a) should we keep with the each_ syntax, or would people prefer
> > "something returning an array of things" to have a different naming
> > convention?
> > 
> 
> If you ask, I say let's change it to something more commonly used and
> more intuitive for newcomers. I'm not sure, however, that either
> changing all such names to a new naming style, or introducing
> inconsistencies is good. What do people feel about this?
> 
> > 
> >   (b) should dates be formatted strings or something else? (if so, what?)
> > 
> 
> If something structured, it is clear that we will have to parse the date
> ...
> 
> >   (c) should keyword lines be split on keywords and each_keyword methods
> > or not?
> > 
> 
> ? What do you mean?
> 
> >   (d) should the interface extend to cover swissprot, in which case
> > 
> >       - name change?
> > 
> >       - additional methods?
> 
> Hm. If they share a lot, can't we make swissprot inherit from GenEMBL?
> 
> > 
> > Here is what I have so far for this interface definition, waiting to be
> > committed once I get the "ok"
> > 
> 
> Ok, apart from the comments above. Do we really need the interface here?
> 

Absolutely. Ensembl is going to need to hit this interface from the
ContigI interface (i.e., Ensembl's Bio::EnsEMBL::DB::ContigI will inherit
from this). ContigI is implemented inside Ensembl in two radically different
ways. We need this to allow Ensembl to use the SeqIO system for
GenBank/EMBL dumping (write_seq on genbank/embl will be smart, and do an
->isa() on the incoming seq to see if it supports this interface).
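
A rough sketch of that ->isa() check, under stated assumptions: the
packages My::GenEMBLI, My::PlainSeq and My::RichSeq and the function
format_entry are hypothetical stand-ins, not the SeqIO code, and the output
is a simplified two-line-type layout rather than real EMBL format.

```perl
use strict;
use warnings;

# Hypothetical interface for the richer, GenBank/EMBL-style information.
package My::GenEMBLI;
sub each_date {
    my $self = shift;
    die ref($self) . " must implement each_date\n";
}

# A plain sequence that knows nothing about the interface.
package My::PlainSeq;
sub new { my ($class, $id) = @_; return bless { id => $id }, $class }
sub id  { my $self = shift; return $self->{id} }

# A richer sequence that declares compliance with the interface.
package My::RichSeq;
our @ISA = ('My::PlainSeq', 'My::GenEMBLI');
sub new {
    my ($class, $id, @dates) = @_;
    my $self = My::PlainSeq::new($class, $id);
    $self->{dates} = [@dates];
    return $self;
}
sub each_date { my $self = shift; return @{ $self->{dates} } }

package main;

# Writer sketch: always emit the ID line, and emit date lines only
# when the incoming object says it supports the interface.
sub format_entry {
    my ($seq) = @_;
    my @lines = ('ID   ' . $seq->id);
    if ($seq->isa('My::GenEMBLI')) {
        push @lines, map { "DT   $_" } $seq->each_date;
    }
    return join("\n", @lines) . "\n";
}

print format_entry(My::PlainSeq->new('AB000001'));
print format_entry(My::RichSeq->new('AB000002', '18-DEC-2000'));
```

The writer never needs to know which implementation it was handed, which
is the property the two ContigI implementations rely on.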


We definitely need an interface here. In fact, I think we need interfaces
nearly everywhere - it really future-proofs the code...


> 	Hilmar
> -- 
> -----------------------------------------------------------------
> Hilmar Lapp                                email: hlapp@gmx.net
> GNF, San Diego, Ca. 92122                  phone: +1 858 812 1757
> -----------------------------------------------------------------
> 

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>. 
-----------------------------------------------------------------