[Dynamite] SingleModel
Ian Holmes
ihh@fruitfly.org
Sun, 5 Mar 2000 12:47:27 -0800 (PST)
On Mon, 6 Mar 2000, Ewan Birney wrote:
> > interface SingleTransitionParameters {
> > float transition_probability;
> > Alphabet::WeightVector emission_probability;
>
> Grrrrr. <minor> Can't we call this Alphabet::ProbabilityVector? What's
> wrong with sensible names?
That's fine by me. (The "WeightVector" nomenclature was chosen because my
use of the '*' symbol to denote "wildcard" meant having a sum over
residues that was not equal to 1, but since we have abandoned that...)
> > interface SingleModelParameters {
> > SingleTransitionParameters get_parameters (in Transition t);
> > sequence<SingleTransitionParameters> all_parameters();
> > // possibly also:
> > // sequence<SingleTransitionParameters> outgoing_parameters (in State s);
> > }
>
> Ok. I think I see this. Basically I like this.
>
> >
> > The easiest way to keep the parameters & the model in sync is to stipulate
> > that the sequence<SingleTransitionParameters> returned by
> > SingleModelParameters::all_parameters() is indexed by the same index as
> > the sequence<Transition> returned by the Model::all_transitions() method.
> > Ditto the outgoing_parameters() method -- if you see what I mean.
In other words (just to make this more explicit):
The parameters for the following transition: Model.all_transitions() [i]
are given by: ModelParameters.all_parameters() [i]
The parameters for the following transition: Model.outgoing_transitions(s) [j]
are given by: ModelParameters.outgoing_parameters(s) [j]
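A rough sketch of this convention, in Python purely for illustration (the real interfaces would be IDL). The Model class, the source/dest fields on Transition, the dict used for the emission vector and the check_in_sync() helper are all assumptions made up for the example; only the method names and the my_transition back-reference come from the discussion above.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Transition:
    # Hypothetical fields: a transition is identified by its two end states.
    source: str
    dest: str


@dataclass
class SingleTransitionParameters:
    transition_probability: float
    emission_probability: Dict[str, float]  # residue -> probability
    my_transition: Transition               # back-reference, used only for the sanity check


class Model:
    def __init__(self, transitions):
        self._transitions = list(transitions)

    def all_transitions(self):
        return list(self._transitions)

    def outgoing_transitions(self, state):
        return [t for t in self._transitions if t.source == state]


class SingleModelParameters:
    def __init__(self, params):
        self._params = list(params)

    def all_parameters(self):
        return list(self._params)

    def outgoing_parameters(self, state):
        # Filtering preserves order, so index j here matches
        # Model.outgoing_transitions(state)[j] when the two are in sync.
        return [p for p in self._params if p.my_transition.source == state]


def check_in_sync(model, params):
    """Sanity check: parameters[i] must describe transitions[i]."""
    ts = model.all_transitions()
    ps = params.all_parameters()
    return len(ts) == len(ps) and all(p.my_transition == t for t, p in zip(ts, ps))


uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
transitions = [
    Transition("start", "match"),
    Transition("match", "match"),
    Transition("match", "end"),
]
model = Model(transitions)
params = SingleModelParameters(
    [SingleTransitionParameters(p, uniform, t)
     for t, p in zip(transitions, [1.0, 0.9, 0.1])]
)
```

With this setup, check_in_sync(model, params) holds, and it fails if the parameter list is reordered without reordering the transitions.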
> > I regard this as perfectly valid coding practice as long as it's WELL
> > documented. We could even incorporate a sanity check, by having a
> > "Transition* my_transition" field in SingleTransitionParameters.
> >
>
> More worried about growing/shrinking the model. Perhaps that is a later
> thing to think about.
If you're growing or shrinking the model AFTER having decided on the
parameters, then that comes under the heading of "run-time model editing"
in my book. Which we might well want to deal with at some point -- but I
think that run-time editing necessarily involves specifying a remapping of
states & transitions, so this is not a problem.
> > The next option for keeping the model & parameters in sync is to use the
> > get_parameters(Transition) method in SingleModelParameters. If we don't
> > use a ParameterisedModelMemento pattern (as I suggested above), then we
> > care quite a lot about how fast these lookups are. We would have several
> > implementation options:
>
>
> I don't see this ParameterisedModelMemento pattern ... is this the
> parallel arrays you are suggesting above?
No -- my idea of a ParameterisedModelMemento is basically the same as your
original idea of having the parameters in the same object as the model. A
sort of internal object, that would be instantiated just prior to doing
the actual DP -- so you only have to do all the Transition->Parameter
lookups once.
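To illustrate what "only once" could look like, here is a minimal Python sketch. Everything in it is hypothetical: CountingParameters is a stand-in for SingleModelParameters that counts calls, transitions are plain (source, dest) tuples, and the parameters are reduced to a single float. The inner loop is a placeholder for the DP, not a real recursion.

```python
class CountingParameters:
    """Hypothetical stand-in for SingleModelParameters that counts lookups."""

    def __init__(self, table):
        self._table = dict(table)
        self.lookups = 0

    def get_parameters(self, transition):
        self.lookups += 1
        return self._table[transition]


class ParameterisedModelMemento:
    """Snapshot built just before DP: one lookup per transition, then
    everything is read from a flat tuple indexed like the transitions."""

    def __init__(self, transitions, params):
        self.transitions = tuple(transitions)
        self.parameters = tuple(params.get_parameters(t) for t in self.transitions)


transitions = [("start", "match"), ("match", "match"), ("match", "end")]
params = CountingParameters(zip(transitions, [1.0, 0.9, 0.1]))

memento = ParameterisedModelMemento(transitions, params)

# Placeholder for the DP inner loop: many cells, each touching every
# transition, but no further calls to get_parameters().
total = 0.0
for cell in range(1000):
    for i in range(len(memento.transitions)):
        total += memento.parameters[i]
```

After the loop, params.lookups is still 3 (one per transition), however slow or fast get_parameters() happens to be.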
> > (0) Linear array of Transitions, searchable by brute force in
> > time O(M) where M is the model size
> > (1) Sorted array of Transitions, searchable by binary chop in time
> > O(log M)
> > (2) Large M*M array (so, O(M^2) memory)
> > (3) Some kind of hashing on Transitions
> >
> > This may seem like a lot of effort (maybe this is what you meant by mad
> > gymnastics -- I've got used to the STL doing all these algorithms for
> > you!).
> >
>
> Gymnastics not this but the parallel array stuff. Ok as long as it is
> documented. Sort of gives me the heebie-jeebies though.
I can understand that...
One (slight) improvement would be for each Transition to have an integer
field called transition_index. Then the model could have an
allocate_transition_indices() method. An extra level of safety...
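Something along these lines, sketched in Python for illustration only. The transition_index field and allocate_transition_indices() are the names suggested above; the Model class, the source/dest fields and the parameters_for() helper are assumptions for the example, and the parameters are reduced to bare floats.

```python
class Transition:
    def __init__(self, source, dest):
        self.source = source
        self.dest = dest
        self.transition_index = None  # unset until the model allocates it


class Model:
    def __init__(self, transitions):
        self._transitions = list(transitions)

    def all_transitions(self):
        return list(self._transitions)

    def allocate_transition_indices(self):
        # Each transition records its own position in all_transitions().
        for i, t in enumerate(self._transitions):
            t.transition_index = i


def parameters_for(transition, all_parameters):
    """O(1) lookup via the stored index; refuses to guess if it is unset."""
    i = transition.transition_index
    if i is None:
        raise ValueError("transition indices have not been allocated")
    return all_parameters[i]


t_loop = Transition("match", "match")
model = Model([Transition("start", "match"), t_loop, Transition("match", "end")])
all_parameters = [1.0, 0.9, 0.1]  # indexed like model.all_transitions()

try:
    parameters_for(t_loop, all_parameters)
    failed_before_allocation = False
except ValueError:
    failed_before_allocation = True

model.allocate_transition_indices()
loop_probability = parameters_for(t_loop, all_parameters)
```

The extra safety is that a Transition which was never given an index fails loudly instead of silently reading the wrong slot.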
> > is I don't think states have any useful internal data (except their name),
> > and the only thing you need to be able to DO with them is to test whether
> > State s1 == State s2.
> >
>
> If the parameters are somewhere else, then yes.
>
> But will we want to inherit off them for things like draw-able models
> (position, colour etc)? I don't see what we lose by making them
> objects...
Still unsure about this but it doesn't sound hard to change later on.
> > > force of habit: Most (all?) IDL compilers bitch about using sequence<X>
> > > outside of a typedef. Annoying eh?
> >
> > yes indeed... oh well.
> > Perhaps call them XSequence to be consistent?
> >
>
> Name clash
>
> Sequence -> biological polymer
> Sequence -> sequence in IDL
>
> We can't call them XSequence. It will confuse the fuck out of joe
> bioinformatics guy. Has to be List or... Vector or something...
:-(
List is probably better.
Ian