[Bioperl-l] First commit of Bio::Structure objects

Chris Mungall cjm@fruitfly.bdgp.berkeley.edu
Fri, 16 Nov 2001 13:16:13 -0800 (PST)


On Fri, 16 Nov 2001, Kris Boulez wrote:

> Quoting Chris Mungall (cjm@fruitfly.bdgp.berkeley.edu):
> > Hi Kris
> > 
> > The object model design looks very sound. However, I noticed that there
> > are cycles in the object graph (eg bidirectional links between Atom and
> > Residue).
> > 
> I see your point :(
> So it shows that this is the first real set of objects I have designed.

Well, your objects seem conceptually perfect to me; you've just hit a
nasty operational consideration.

> > This causes problems for perl, as its garbage collector relies on
> > reference counts, which never reach zero for objects in a cycle, so they
> > are never cleaned up. (IMHO this is part of a larger problem with object
> > oriented design as a whole, as attributes aren't first class entities in
> > their own right.)
> > 
> > This won't be a problem for reading in a few PDB records, but, say,
> > looping through all of PDB will cause your memory usage to go up and up. I
> > learned this lesson the hard way with my first perl object model, after
> > doing java object models; it was a painful business going back and
> > refactoring the code :@(
> > 
> > One way around this is to keep your own reference counts and override the
> > DESTROY method to make sure everything is cleared - this can be tricky.
> > 
> If the assumption is that when an object is destroyed, all its children
> (everything underneath it) are destroyed too, this should be doable.

Unfortunately it's a bit trickier than that. It's a while since I thought
about all this (the last time I did, it made my head hurt), but I'm pretty
sure that you also have to force the client code to explicitly free the
objects once they are no longer in use, a la C coding. This is a bit of a
burden for the potential users of your objects.
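
To make the problem concrete, here is a rough sketch. The package names
My::Residue and My::Atom and the release() method are invented for
illustration; they are not the Bio::Structure API. It builds the same
Residue/Atom cycle twice, once without and once with an explicit release,
and counts how many objects perl actually destroys.

```perl
use strict;
use warnings;

my $destroyed = 0;   # counts DESTROY calls so we can see what gets freed

package My::Residue;   # hypothetical, not Bio::Structure::Residue
sub new { my ($class) = @_; return bless { atoms => [] }, $class }
sub add_atom {
    my ($self, $atom) = @_;
    push @{ $self->{atoms} }, $atom;   # Residue -> Atom
    $atom->{residue} = $self;          # Atom -> Residue: closes the cycle
}
# Break the links by hand so the reference counts can reach zero.
sub release {
    my ($self) = @_;
    delete $_->{residue} for @{ $self->{atoms} };
    $self->{atoms} = [];
}
sub DESTROY { $destroyed++ }

package My::Atom;      # hypothetical, not Bio::Structure::Atom
sub new     { my ($class) = @_; return bless {}, $class }
sub DESTROY { $destroyed++ }

package main;

# Case 1: the last outside reference goes away without a release.
{
    my $res = My::Residue->new;
    $res->add_atom(My::Atom->new);
}
print "no release:   $destroyed destroyed\n";

# Case 2: the client releases explicitly before leaving the scope.
{
    my $res = My::Residue->new;
    $res->add_atom(My::Atom->new);
    $res->release;
}
print "with release: $destroyed destroyed\n";
```

In the first case neither object is destroyed until the program exits; in
the second both are. The release() call is exactly the burden on client
code that I mean.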

If you force the users of your objects to go through a Factory for
obtaining objects, this can be mitigated.
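
Roughly what I have in mind, as a sketch only: My::Node, My::Factory and
their methods are made-up names, and the factory here simply remembers
every object it hands out. Since nothing points back at the factory, it is
not part of any cycle, so its own DESTROY runs normally and can unlink
everything on the client's behalf.

```perl
use strict;
use warnings;

my $destroyed = 0;   # counts My::Node DESTROY calls for the demo

package My::Node;    # hypothetical stand-in for Residue, Atom, etc.
sub new {
    my ($class) = @_;
    return bless { parent => undef, children => [] }, $class;
}
sub add_child {
    my ($self, $child) = @_;
    push @{ $self->{children} }, $child;
    $child->{parent} = $self;          # bidirectional link, i.e. a cycle
}
sub release {                          # drop both directions of the link
    my ($self) = @_;
    $self->{parent}   = undef;
    $self->{children} = [];
}
sub DESTROY { $destroyed++ }

package My::Factory; # hypothetical; the only way clients obtain nodes
sub new { my ($class) = @_; return bless { made => [] }, $class }
sub create {
    my ($self) = @_;
    my $node = My::Node->new;
    push @{ $self->{made} }, $node;    # remember everything handed out
    return $node;
}
# Runs when the client drops the factory; unlinks every node it created.
sub DESTROY {
    my ($self) = @_;
    $_->release for @{ $self->{made} };
}

package main;

{
    my $factory = My::Factory->new;
    my $residue = $factory->create;
    my $atom    = $factory->create;
    $residue->add_child($atom);
}   # $factory goes out of scope here and unlinks both nodes

print "nodes destroyed: $destroyed\n";
```

The client never calls release() itself; it only has to let the factory go.
The cost is that nothing is freed until the factory is, so for a loop over
many entries you would want one factory per entry.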

Maybe perl6 will sort this out?
 
> > Another way would be to have everything go through a singleton
> > ProteinData object which would hold all parent/child relationships. The
> > Residue/Atom object would be unaware of their reciprocal links, the client
> > code would have to ask the ProteinData object for this.
> > 
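
A sketch of that second idea, with the caveat that every name in it
(My::Part, My::ProteinData, add_child, parent_of, children_of, clear) is
invented for illustration. The objects hold no links to each other; the
registry keeps them, keyed on Scalar::Util::refaddr().

```perl
use strict;
use warnings;
use Scalar::Util ();   # refaddr() gives a stable key for an object

my $destroyed = 0;     # counts DESTROY calls for the demo

package My::Part;      # stands in for Residue and Atom; holds no links
sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
sub name    { return $_[0]{name} }
sub DESTROY { $destroyed++ }

package My::ProteinData;   # hypothetical registry of parent/child links
my $instance;
sub instance {
    my ($class) = @_;
    $instance ||= bless { parent => {}, children => {} }, $class;
    return $instance;
}
sub add_child {
    my ($self, $parent, $child) = @_;
    $self->{parent}{ Scalar::Util::refaddr($child) } = $parent;
    push @{ $self->{children}{ Scalar::Util::refaddr($parent) } }, $child;
}
sub parent_of {
    my ($self, $child) = @_;
    return $self->{parent}{ Scalar::Util::refaddr($child) };
}
sub children_of {
    my ($self, $parent) = @_;
    return @{ $self->{children}{ Scalar::Util::refaddr($parent) } || [] };
}
# The registry holds the only links, so emptying it lets the parts go.
sub clear {
    my ($self) = @_;
    %{ $self->{parent} }   = ();
    %{ $self->{children} } = ();
}

package main;

my $data = My::ProteinData->instance;
my $parent_name;
{
    my $residue = My::Part->new('ALA 1');
    my $atom    = My::Part->new('CA');
    $data->add_child($residue, $atom);
    $parent_name = $data->parent_of($atom)->name;   # ask the registry
}
print "parent of CA: $parent_name\n";

$data->clear;
print "parts destroyed after clear: $destroyed\n";
```

There are no cycles here, but the singleton keeps everything alive until
clear() is called, so some explicit cleanup step remains.
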
> > I notice you haven't checked in your StructureI object yet, so I can't run
> > the tests - I may be missing something
> > 
> The StructureI object hasn't been checked in, as it doesn't exist yet.
> This was one of my questions in my mail: should I go for one StructureI
> or for separate EntryI, ChainI,... files. Also: should every public
> method have an Interface definition?

I'll leave this one to the other bioperlers. I'd say just one StructureI
for now; if in the future there are other implementations of these objects,
it shouldn't be too hard to add interfaces then.

> I ran the tests as follows
> 
>   % cd bioperl-live/t
>   % perl -I.. ./Structure.t
> 
> Kris,
> -- 
> Kris Boulez 				Tel: +32-9-241.11.00
> AlgoNomics NV 				Fax: +32-9-241.11.02
> Technologiepark 4 			email: kris.boulez@algonomics.com
> B 9052 Zwijnaarde 			http://www.algonomics.com/
>