[Dynamite] SingleModel

Ian Holmes ihh@fruitfly.org
Sun, 5 Mar 2000 12:47:27 -0800 (PST)


On Mon, 6 Mar 2000, Ewan Birney wrote:

> > 	interface SingleTransitionParameters {
> > 	  float                  transition_probability;
> > 	  Alphabet::WeightVector emission_probability;
> 
> Grrrrr. <minor> Can't we call this Alphabet::ProbabilityVector? What's
> wrong with sensible names?

That's fine by me. (The "WeightVector" nomenclature was chosen because my
use of the '*' symbol to denote "wildcard" meant having a sum over
residues that was not equal to 1, but since we have abandoned that...)

> > 	interface SingleModelParameters {
> > 	  SingleTransitionParameters           get_parameters (in Transition t);
> > 	  sequence<SingleTransitionParameters> all_parameters();
> > 	// possibly also:
> > 	//  sequence<SingleTransitionParameters> outgoing_parameters (in State s);
> > 	}
> 
> Ok. I think I see this. Basically I like this.
> 
> > 
> > The easiest way to keep the parameters & the model in sync is to stipulate
> > that the sequence<SingleTransitionParameters> returned by
> > SingleModelParameters::all_parameters() is indexed by the same index as
> > the sequence<Transition> returned by the Model::all_transitions() method.
> > Ditto the outgoing_parameters() method -- if you see what I mean.

In other words (just to make this more explicit):

The parameters for the following transition:    Model.all_transitions() [i]
are given by:                                   ModelParameters.all_parameters() [i]

The parameters for the following transition:    Model.outgoing_transitions(s) [j]
are given by:                                   ModelParameters.outgoing_parameters(s) [j]
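
A minimal C++ sketch of that indexing convention, just to pin it down. The
struct layouts, the std::vector stand-in for the emission vector, and the
in_sync() and transition_probability() helpers are assumptions made for
illustration; none of them are part of the IDL above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the IDL types discussed in this thread.
struct Transition {
    std::string from;
    std::string to;
};

struct SingleTransitionParameters {
    float transition_probability;
    std::vector<float> emission_probability;  // stand-in for the Alphabet vector type
};

struct Model {
    std::vector<Transition> transitions;
    const std::vector<Transition>& all_transitions() const { return transitions; }
};

struct SingleModelParameters {
    std::vector<SingleTransitionParameters> parameters;
    const std::vector<SingleTransitionParameters>& all_parameters() const {
        return parameters;
    }
};

// The convention: entry i of all_parameters() describes entry i of
// all_transitions(). Equal length is the only cheap check available here.
inline bool in_sync(const Model& m, const SingleModelParameters& p) {
    return m.all_transitions().size() == p.all_parameters().size();
}

// Look up the probability of the i-th transition purely by shared index.
inline float transition_probability(const Model& m,
                                    const SingleModelParameters& p,
                                    std::size_t i) {
    assert(in_sync(m, p) && i < m.all_transitions().size());
    return p.all_parameters()[i].transition_probability;
}
```

Note that nothing in this sketch can detect two sequences of equal length that
are in a different order, which is why the convention has to be documented.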

> > I regard this as perfectly valid coding practice as long as it's WELL
> > documented. We could even incorporate a sanity check, by having a
> > "Transition* my_transition" field in SingleTransitionParameters.
> > 
> 
> More worried about growing/shrinking the model. Perhaps that is a later
> thing to think about.

If you're growing or shrinking the model AFTER having decided on the
parameters, then that comes under the heading of "run-time model editing"
in my book. Which we might well want to deal with at some point -- but I
think that run-time editing necessarily involves specifying a remapping of
states & transitions, and the parameters can be remapped by the same
mapping, so this is not a problem.

> > The next option for keeping the model & parameters in sync is to use the
> > get_parameters(Transition) method in SingleModelParameters. If we don't
> > use a ParameterisedModelMemento pattern (as I suggested above), then we
> > care quite a lot about how fast these lookups are. We would have several
> > implementation options:
> 
> 
> I don't see this ParameterisedModelMemento pattern ... is this the 
> parallel arrays you are suggesting above?

No -- my idea of a ParameterisedModelMemento is basically the same as your
original idea of having the parameters in the same object as the model. A
sort of internal object, that would be instantiated just prior to doing
the actual DP -- so you only have to do all the Transition->Parameter
lookups once.
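
A rough C++ sketch of the memento idea, under some loud assumptions: the
ParameterTable type, its std::map keyed on state names, and the memento's
constructor and accessors are all hypothetical, and only a single float per
transition is stored to keep it short. The point is only that the slow
Transition->Parameter lookups happen once, before the DP loop.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Transition {
    std::string from;
    std::string to;
};

using TransitionKey = std::pair<std::string, std::string>;

// Stand-in for get_parameters(Transition): an O(log M) lookup per call.
struct ParameterTable {
    std::map<TransitionKey, float> probability;

    float get_parameters(const Transition& t) const {
        // at() throws std::out_of_range if the transition is unknown.
        return probability.at(TransitionKey(t.from, t.to));
    }
};

// Hypothetical memento: resolves every lookup once, at construction time.
class ParameterisedModelMemento {
public:
    ParameterisedModelMemento(const std::vector<Transition>& transitions,
                              const ParameterTable& table) {
        resolved_.reserve(transitions.size());
        for (const Transition& t : transitions) {
            resolved_.push_back(table.get_parameters(t));
        }
    }

    // O(1) access by transition index, intended for use inside the DP loop.
    float probability(std::size_t i) const {
        assert(i < resolved_.size());
        return resolved_[i];
    }

    std::size_t size() const { return resolved_.size(); }

private:
    std::vector<float> resolved_;  // same order as the transitions passed in
};
```

The memento would be built just before the DP and thrown away afterwards, so
the speed of get_parameters() matters much less.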

> > 	(0) Linear array of Transitions, searchable by brute force in
> >             time O(M) where M is the model size
> > 	(1) Sorted array of Transitions, searchable by binary chop in time
> >             O(log M)
> > 	(2) Large M*M array (so, O(M^2) memory)
> > 	(3) Some kind of hashing on Transitions
> > 
> > This may seem like a lot of effort (maybe this is what you meant by mad
> > gymnastics -- I've got used to the STL doing all these algorithms for
> > you!).
> > 
> 
> Gymnastics not this but the parallel array stuff. Ok as long as it is
> documented. Sort of gives me the heebie-jeebies though.

I can understand that...
One (slight) improvement would be for each Transition to have an integer
field called transition_index. Then the model could have an
allocate_transition_indices() method. An extra level of safety...
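
Something like the following C++ sketch, where the -1 "unallocated" sentinel,
the struct layouts and the bounds checks are assumptions added for
illustration rather than anything agreed in this thread:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Transition {
    std::string from;
    std::string to;
    int transition_index = -1;  // assumed sentinel: -1 means "not yet allocated"
};

struct Model {
    std::vector<Transition> transitions;

    // Stamp each transition with its position in the model's own sequence.
    void allocate_transition_indices() {
        for (std::size_t i = 0; i < transitions.size(); ++i) {
            transitions[i].transition_index = static_cast<int>(i);
        }
    }
};

struct SingleModelParameters {
    std::vector<float> transition_probability;  // one entry per transition

    // Look up by the index carried on the transition, with sanity checks.
    float get_parameters(const Transition& t) const {
        assert(t.transition_index >= 0);  // indices must have been allocated
        const std::size_t i = static_cast<std::size_t>(t.transition_index);
        assert(i < transition_probability.size());
        return transition_probability[i];
    }
};
```

The lookup is then O(1), and a transition that never had its index allocated
trips the assertion instead of silently reading the wrong parameters.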

> > is I don't think states have any useful internal data (except their name),
> > and the only thing you need to be able to DO with them is to test whether
> > State s1 == State s2.
> > 
> 
> If the parameters are somewhere else, then yes.
> 
> But will we want to inherit off them for things like draw-able models
> (position, colour etc)? I don't see what we lose by making them
> objects...

Still unsure about this but it doesn't sound hard to change later on.

> > > force of habit: Most (all?) IDL compilers bitch about using sequence<X>
> > > outside of a typedef. Annoying eh?
> > 
> > yes indeed... oh well.
> > Perhaps call them XSequence to be consistent?
> > 
> 
> Name clash
> 
> 	Sequence -> biological polymer
> 	Sequence -> sequence in IDL
> 
> We can't call them XSequence. It will confuse the fuck out of joe
> bioinformatics guy. Has to be List or... Vector or something...

:-(
List is probably better.
Ian