[Biojava-dev] Extensions to DP framework to permit 2-head training

Matthew Pocock matthew_pocock at yahoo.co.uk
Mon Jan 12 12:51:21 EST 2004


I'm happy - as long as old user-land code works. I don't consider 
implementing a trainer to be a common user-land thing.

Matthew

David Huen wrote:

>I have written a working Viterbi trainer that is capable of 1 and 2 head 
>training and hope to commit to CVS. The current API does not permit 2-head 
>training and will need changes to accommodate this.
>
>I propose the following changes to permit it:-
>
>a) introduction of a TrainingSet interface: In 2-head training, pairs of 
>sequences need to be available.  I have generalised it into a means of 
>supplying n sequences per case so we can deal with n-head training when 20 
>gazillion Hz processors with 5 bazillion byte RAM become available.
>
>public interface TrainingSet
>{
>    public interface Iterator
>    {
>        /**
>         * get next group of sequences to train the model on.
>         */
>        public Sequence[] next();
>
>        /**
>         * any further training sequence groups?
>         */
>        public boolean hasNext();
>    }
>
>    /**
>     * get an iterator for the cases supplied by this TrainingSet.
>     */
>    public Iterator getCases();
>}
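
A minimal sketch of how the proposed TrainingSet might be backed by an
in-memory list. It assumes a stand-in Sequence interface in place of
org.biojava.bio.seq.Sequence so that it compiles on its own. ListTrainingSet,
addCase and the fixed head count are hypothetical names for illustration only;
they are not part of the proposal or of BioJava.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

public class ListTrainingSetSketch {
    /** Stand-in for org.biojava.bio.seq.Sequence (assumption, keeps the sketch self-contained). */
    public interface Sequence {
        String getName();
    }

    /** The interface as proposed in the quoted message. */
    public interface TrainingSet {
        interface Iterator {
            Sequence[] next();
            boolean hasNext();
        }
        Iterator getCases();
    }

    /** Hypothetical list-backed implementation; every case must supply the same number of heads. */
    public static class ListTrainingSet implements TrainingSet {
        private final List<Sequence[]> cases = new ArrayList<>();
        private final int heads;

        public ListTrainingSet(int heads) {
            if (heads < 1) {
                throw new IllegalArgumentException("heads must be >= 1");
            }
            this.heads = heads;
        }

        /** Adds one training case: one sequence for 1-head, a pair for 2-head, and so on. */
        public void addCase(Sequence... seqs) {
            if (seqs.length != heads) {
                throw new IllegalArgumentException(
                    "expected " + heads + " sequences, got " + seqs.length);
            }
            cases.add(seqs.clone());
        }

        @Override
        public Iterator getCases() {
            // Each call returns a fresh iterator, so the set can be walked once per training cycle.
            return new Iterator() {
                private int pos = 0;

                @Override
                public boolean hasNext() {
                    return pos < cases.size();
                }

                @Override
                public Sequence[] next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    return cases.get(pos++).clone();
                }
            };
        }
    }

    public static void main(String[] args) {
        ListTrainingSet set = new ListTrainingSet(2);
        set.addCase(() -> "query1", () -> "target1");
        set.addCase(() -> "query2", () -> "target2");

        TrainingSet.Iterator it = set.getCases();
        while (it.hasNext()) {
            Sequence[] pair = it.next();
            System.out.println(pair[0].getName() + " / " + pair[1].getName());
        }
    }
}
```

Returning a new iterator from every getCases() call is one possible design
choice; it lets a trainer make repeated passes over the same cases.
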
>
>b) changes to TrainingAlgorithm interface:-
>
>The current train method takes a SequenceDB, which only works for 1-head 
>training.  I propose adding a further method that takes a TrainingSet and 
>deprecating the current method.
>
>This change will break code that derives from AbstractTrainer.  But I think 
>I can cursorily patch up the AbstractTrainer and current BaumWelch code to 
>add the new call.  I do not propose extending the BW code to handle 2-D 
>training at this stage.
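
One way the deprecated 1-head entry point could keep working is to have it
delegate to the new TrainingSet method through an adapter. The sketch below
assumes stand-in types: Sequence is a placeholder, List<Sequence> stands in
for SequenceDB, and the train signatures are simplified (the real method takes
further arguments). Trainer, CountingTrainer and asSingleHeadSet are
hypothetical names, not BioJava API.

```java
import java.util.ArrayList;
import java.util.List;

public class TrainOverloadSketch {
    /** Stand-in for org.biojava.bio.seq.Sequence (assumption). */
    public interface Sequence {
        String getName();
    }

    /** The interface as proposed in the quoted message. */
    public interface TrainingSet {
        interface Iterator {
            Sequence[] next();
            boolean hasNext();
        }
        Iterator getCases();
    }

    /** Hypothetical trainer: the old entry point delegates to the new one. */
    public abstract static class Trainer {
        /** New entry point: one Sequence[] per training case. Returns the number of cases seen. */
        public abstract int train(TrainingSet cases);

        /** Old 1-head entry point, kept so existing callers still compile and run. */
        @Deprecated
        public int train(List<Sequence> db) {
            return train(asSingleHeadSet(db));
        }
    }

    /** Wraps each sequence of a 1-head collection in a one-element array. */
    public static TrainingSet asSingleHeadSet(final List<Sequence> db) {
        return new TrainingSet() {
            @Override
            public Iterator getCases() {
                final java.util.Iterator<Sequence> it = db.iterator();
                return new Iterator() {
                    @Override
                    public boolean hasNext() {
                        return it.hasNext();
                    }

                    @Override
                    public Sequence[] next() {
                        return new Sequence[] { it.next() };
                    }
                };
            }
        };
    }

    /** Toy trainer that only counts cases; a real one would update model parameters. */
    public static class CountingTrainer extends Trainer {
        @Override
        public int train(TrainingSet cases) {
            int n = 0;
            for (TrainingSet.Iterator it = cases.getCases(); it.hasNext(); it.next()) {
                n++;
            }
            return n;
        }
    }

    public static void main(String[] args) {
        List<Sequence> db = new ArrayList<>();
        db.add(() -> "seq1");
        db.add(() -> "seq2");
        // Old-style call still works and is routed through the TrainingSet path.
        System.out.println(new CountingTrainer().train(db));
    }
}
```
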
>
>c) changes to AbstractTrainer class:-
>AbstractTrainer supplies a 
>
>protected abstract double singleSequenceIteration(ModelTrainer trainer, 
>SymbolList symList)
>
>I propose changing it to:-
>protected abstract double singleSequenceIteration(ModelTrainer trainer, 
>SymbolList [] symList)
>
>Perhaps to avoid breaking too much, I should create a NewAbstractTrainer 
>class with the new method and derive my ViterbiTrainer from it instead.
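
A rough sketch of what a separate NewAbstractTrainer might look like if it
also kept a 1-head convenience overload, so that code written against the old
single-SymbolList signature has somewhere to go. SymbolList and ModelTrainer
are empty stand-ins for the BioJava types; the bridging overload, the
LengthSummingTrainer subclass and the two score helpers are assumptions made
for illustration, and the returned "score" is a placeholder rather than a
real training score.

```java
public class NewAbstractTrainerSketch {
    /** Stand-in for org.biojava.bio.symbol.SymbolList (assumption). */
    public interface SymbolList {
        int length();
    }

    /** Stand-in for org.biojava.bio.dp.ModelTrainer (assumption). */
    public interface ModelTrainer {
    }

    public abstract static class NewAbstractTrainer {
        /** The array-based signature proposed in the quoted message. */
        protected abstract double singleSequenceIteration(ModelTrainer trainer,
                                                          SymbolList[] symList);

        /** Hypothetical bridge: wraps a single SymbolList in a one-element array. */
        protected double singleSequenceIteration(ModelTrainer trainer,
                                                 SymbolList symList) {
            return singleSequenceIteration(trainer, new SymbolList[] { symList });
        }
    }

    /** Toy subclass accepting 1 or 2 heads; returns the summed lengths as a placeholder. */
    public static class LengthSummingTrainer extends NewAbstractTrainer {
        @Override
        protected double singleSequenceIteration(ModelTrainer trainer,
                                                 SymbolList[] symList) {
            if (symList.length < 1 || symList.length > 2) {
                throw new IllegalArgumentException("only 1- or 2-head training is supported");
            }
            double total = 0.0;
            for (SymbolList s : symList) {
                total += s.length();
            }
            return total;
        }
    }

    /** Helper that calls the new array-based method. */
    public static double score(NewAbstractTrainer t, SymbolList... lists) {
        return t.singleSequenceIteration(new ModelTrainer() { }, lists);
    }

    /** Helper that calls the old-style single-SymbolList bridge. */
    public static double scoreOldStyle(NewAbstractTrainer t, SymbolList one) {
        return t.singleSequenceIteration(new ModelTrainer() { }, one);
    }

    public static void main(String[] args) {
        NewAbstractTrainer t = new LengthSummingTrainer();
        SymbolList a = () -> 5;
        SymbolList b = () -> 7;
        System.out.println(scoreOldStyle(t, a)); // 1-head, via the bridge
        System.out.println(score(t, a, b));      // 2-head, via the array method
    }
}
```
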
>
>Comments are requested.
>
>Regards,
>David Huen



