[Dynamite] SingleModel
Ewan Birney
birney@ebi.ac.uk
Mon, 6 Mar 2000 03:58:29 +0000 (GMT)
> Probably should have separate modules. I don't quite understand this
> module thing yet.
It really means "what code can I validly replace without having to
recompile the other code". It also means "which things can I put
into a separate CORBA server". Or - which things can I put into a library.
I suspect we'll have -

  module for sequences and related things
  module for viterbi algorithms, alignments and other stuff
     (uses sequence module)
  module for training
     (uses sequence module) - NB - yet again the bugbear of the
     alignment datastructure/non datastructure will come up. Argues
     for putting this inside the viterbi module...
  module for database searching
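
As a rough sketch of how that split might look in IDL (all module,
interface and operation names here are illustrative assumptions, nothing
is decided), the cross-module dependencies show up as scoped names such as
Sequences::Seq:

```idl
// Hypothetical module layout. Every name below is an assumption made for
// illustration only.
module Sequences {
    interface Seq { };                 // sequences and related things
};

module Viterbi {                       // depends on Sequences
    interface Alignment { };
    interface ViterbiAlgorithm {
        Alignment make_alignment(in Sequences::Seq seq);
    };
};

module Training {                      // depends on Sequences
    interface Trainer {
        void train(in Sequences::Seq seq);
    };
};

module DatabaseSearch {                // depends on Sequences
    interface Searcher {
        float score(in Sequences::Seq seq);
    };
};
```

Anything that has to share the alignment datastructure with Viterbi would
either refer to Viterbi::Alignment or move inside the Viterbi module.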
>
> In fact.. now I think about it -- it seems fascistic to require that
> things have to be in the same module to share internal representation. Not
> to mention unworkable - surely any particular implementation of the
> Dynamite IDL can do all the internal code-sharing it wants, e.g. by just
> #including "super_generic_DP.h" for example?
>
Not sure what you mean by this point...
> I'm not really sure what's going on here, whether we're talking about a
> coding style or a set of strict rules or what...
>
The only thing is for us to be aware of what compile-time dependencies
there are in the code. It's not a "rule" issue, more an awareness issue...
> >
> > module SingleModel { // Single means emits only one sequence
> >
> > interface State;
> > interface Transition;
> >
> > typedef sequence<float> ProbabilityEmission;
>
> == Alphabet::WeightVector.
> No need to duplicate this.
ok.
>
> >
> > interface Transition {
> > State from;
> > State to;
> > float transition_probability;
> > ProbabilityEmission emission; // emission on the transitions.
>
> As I said before I think parameters should be in a separate object.
> If people don't like this then we could consider having a kind of memo
> object that represents a parameterised model.
How do we put the parameters elsewhere? Can you show me the IDL?
How do we keep the parameters in sync with the model? (without mad
gymnastics)
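
To make the question concrete, here is one guess at what "parameters in a
separate object" could look like. This is only a sketch: the Parameters
interface, the transition_index argument and the topology attribute are all
assumptions, not anything proposed in this thread.

```idl
// Hypothetical sketch: topology and parameters kept in separate objects.
module SingleModel {
    typedef sequence<float> WeightVector;  // stand-in for Alphabet::WeightVector

    // Topology only: no probabilities live here.
    interface Model {
        readonly attribute long transition_count;
    };

    // Parameters refer back to the topology they belong to and are looked
    // up by transition index (0 .. transition_count - 1).
    interface Parameters {
        readonly attribute Model topology;
        float transition_probability(in long transition_index);
        WeightVector emission(in long transition_index);
    };
};
```

In this sketch the sync problem is still there: adding or removing a
transition in the Model invalidates every Parameters object that points at it.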
>
> Possibly have a "boolean Transition::is_null" field for null transitions?
Null meaning it does not emit anything?
>
> What about "fanned" transitions (i.e. if it's an "A" then go to state 1,
> if it's a "G" then go to state 3, if it's a "C" go to state 4 etc)?
> These can be inefficiently implemented just by having A times as many
> Transitions (where A is the alphabet size) -- shall we just leave it at
> this for now? (I think we probably should.)
Let's leave this...
>
> > };
> >
> > typedef sequence<Transition> TransitionList;
> >
> > interface State {
> > TransitionList all_Transitions();
> > };
>
> I think this method belongs in the model, not in an individual State.
> i.e. Model should have the following methods:
>
> sequence<Transition> outgoing_transitions (in State s);
> sequence<Transition> all_transitions();
>
> > typedef sequence<State> StateList;
>
> I also think "SingleModel::State" should be a typedef to int, not an
> interface of its own. States are lightweight things that are usually
> treated as ints anyway.
I don't understand why we wouldn't make this an object. We want to
add/remove these things, don't we? When we do the DP I imagine the first
thing we do is make a little internal datastructure to drive the generic
DP off. I don't think trying to make the objects look like what we
drive the low-level algorithmic code off will help us...
>
> This breaks down if we need to add too much information to a State.
> However, with the parameters elsewhere, all we need is a name:
>
> string state_name (in State s);
>
> Incidentally (personal gripe) I dislike typedefs like the above one
> ("typedef sequence<State> StateList"). A sequence of States *IS* a "State
> List" by *definition*, there is no need to typedef it; it looks like we
> are generic-programming novices if we do. A typedef is acceptable (IMHO)
> when it denotes a specialised *kind* of list, e.g.
>
> typedef sequence<State> AlignmentPath;
> typedef sequence<float> ProbabilityEmission;
Force of habit: most (all?) IDL compilers bitch about using sequence<X>
outside of a typedef. Annoying, eh?
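
A minimal illustration of the problem, using an empty placeholder State
interface. As far as I know, the IDL grammar only allows base types, strings
and named types as operation return and parameter types, so the sequence has
to be given a name first:

```idl
module SingleModel {
    interface State { };   // empty placeholder for illustration

    // Typically rejected: an anonymous sequence used directly as a
    // return type.
    //     sequence<State> all_states();

    // Accepted: name the sequence with a typedef, then use the name.
    typedef sequence<State> StateList;

    interface Model {
        StateList all_states();
    };
};
```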
>
> I guess if we're using sequence<IncrediblyLongAndHardToTypeClassName> a
> lot, then we might want to typedef it, but we should probably just choose
> a shorter name for the class in the first place.
>
> >
> > interface Model {
> > StateList all_States();
> > };
>
> To summarise the above edits:
>
> interface Model {
> sequence<State> all_states();
> sequence<Transition> outgoing_transitions (in State s);
> sequence<Transition> all_transitions();
> string state_name (in State s);
> };
>
> >
> > //
> > // Have not done alignment yet
> > //
> >
> > interface AlignmentFactory {
> > attribute model Model;
> > // also here a function pointer for compile-time function for this model
> > Alignment make_alignment(in Seq seq);
>
> I am not convinced that AlignmentFactory is a useful generalisation -- I
> feel that ViterbiAlgorithm makes more sense.
That is sensible....
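
Roughly what that rename might look like, as a sketch only. The BadAlphabet
exception and the target_model attribute name are assumptions, and Model,
Seq and Alignment are empty placeholders here:

```idl
// Hypothetical sketch of a ViterbiAlgorithm interface replacing
// AlignmentFactory.
module SingleModel {
    interface Model { };
    interface Seq { };         // placeholder for the real sequence type
    interface Alignment { };   // alignment has not been designed yet

    exception BadAlphabet { string reason; };

    interface ViterbiAlgorithm {
        // The attribute cannot be called "model": IDL identifiers that
        // differ only in case collide, and "Model" is already the type name.
        attribute Model target_model;

        Alignment make_alignment(in Seq seq) raises (BadAlphabet);
    };
};
```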
>
> Ian
>
> >
> > // can throw exceptions/errors of bad alphabet, other things...
> > };
> >
> >
> > }
> >
>
>
> --
> Ian Holmes .... Howard Hughes Medical Institute .... ihh@fruitfly.org