[Biojava-l] Generalized HMM in biojava?

wendy wong wendy.wong at gmail.com
Mon Jan 23 06:43:43 EST 2006


> OK - so you have a single HMM that emits whole columns of an alignment?
> Usually to align three sequences, you would use a 3-head HMM where each head
> emits one of the sequences.

I am not sure a 3-head HMM would work here, because the sequences are
related to each other by the phylogenetic tree. So for a fixed order of
the sequences, the column ACC would have a different likelihood than
CCA.
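
To illustrate the point about column order, here is a minimal sketch that is not from the original thread. It assumes a Jukes-Cantor substitution model, a star-shaped tree with three leaves, and made-up branch lengths; the class and method names are hypothetical and nothing here uses BioJava. Each row of the column is tied to its own branch, so permuting the column changes the likelihood.

```java
// Hypothetical sketch: likelihood of one alignment column on a 3-leaf star tree.
class ColumnLikelihoodSketch {
    static final String BASES = "ACGT";

    // Jukes-Cantor probability of seeing `child` given `parent` after branch length t.
    static double jc(char parent, char child, double t) {
        double e = Math.exp(-4.0 * t / 3.0);
        return parent == child ? 0.25 + 0.75 * e : 0.25 - 0.25 * e;
    }

    // Sum over the unobserved root base; character i of the column sits on branch i.
    static double likelihood(String column, double[] branchLengths) {
        double total = 0.0;
        for (char root : BASES.toCharArray()) {
            double p = 0.25; // uniform root distribution (an assumption)
            for (int i = 0; i < column.length(); i++) {
                p *= jc(root, column.charAt(i), branchLengths[i]);
            }
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        double[] t = {0.1, 0.1, 1.0}; // made-up branch lengths, one per sequence
        System.out.println("L(ACC) = " + likelihood("ACC", t));
        System.out.println("L(CCA) = " + likelihood("CCA", t));
    }
}
```

With these branch lengths the odd base in ACC sits on a short branch and the odd base in CCA sits on the long one, so CCA comes out more likely. A model that emits whole columns can capture this, since each column is scored as a unit against the tree.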

> You shouldn't be getting exceptions. This is almost certainly a bug. Could you
> send the stack-trace?

sure, here it is:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
	at org.biojava.bio.symbol.LinearAlphabetIndex.buildIndex(LinearAlphabetIndex.java:108)
	at org.biojava.bio.symbol.LinearAlphabetIndex.<init>(LinearAlphabetIndex.java:66)
	at org.biojava.bio.symbol.AlphabetManager.getAlphabetIndex(AlphabetManager.java:1796)
	at edu.cornell.bscb.evopromoter.TestingFunctions.main(TestingFunctions.java:61)

I don't think I need the full alphabet from getDNA(), which has 16
symbols. I reduced it to 5 (A, T, C, G, N) so that a state can contain
more sites...
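
A rough size check may help explain the exception. The sketch below is not BioJava code: the class and method names are hypothetical, and it only assumes that the state alphabet is a cross product of `sites` copies of a cross product of `sequences` copies of a base alphabet. Whether BioJava actually computes the index size with 32-bit arithmetic is an assumption; the sketch just shows what that arithmetic would give.

```java
import java.math.BigInteger;

// Hypothetical helper: counts the symbols in a nested cross-product alphabet.
class AlphabetSizeSketch {

    // Exact number of symbols: base^(sequences * sites).
    static BigInteger exactSize(int base, int sequences, int sites) {
        return BigInteger.valueOf(base).pow(sequences * sites);
    }

    // The same product accumulated in a 32-bit int, which wraps silently on overflow.
    static int intSize(int base, int sequences, int sites) {
        int size = 1;
        for (int i = 0; i < sequences * sites; i++) {
            size *= base;
        }
        return size;
    }

    public static void main(String[] args) {
        // 3 sequences and 7 sites, as in the quoted code; base sizes 4 and 5 for comparison.
        for (int base : new int[] {4, 5}) {
            System.out.println("base " + base
                    + ": exact = " + exactSize(base, 3, 7)
                    + ", as int = " + intSize(base, 3, 7));
        }
    }
}
```

With 4 symbols, 3 sequences and 7 sites the exact count is 4^21 = 2^42, which wraps to 0 in an int. A zero-length array would be consistent with `ArrayIndexOutOfBoundsException: 0`, though that link is a guess rather than something confirmed from the BioJava source. A 5-symbol base gives 5^21, which also exceeds the int range, so a smaller alphabet alone would not make a 7-site state indexable.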

thanks,
wendy

> > again, thanks very much for helping!
> >
> > Wendy
> >
> > public static void main(String[] args)
> >         throws MarshalException, ValidationException, IOException {
> >
> >     Alphabet sequenceAlphabet = DNATools.getDNA();
> >     Set alphabetSet =
> >             AlphabetManager.getAllSymbols((FiniteAlphabet) sequenceAlphabet);
> >
> >     int no_sequences = 3;
> >     List siteAlphabetList =
> >             Collections.nCopies(no_sequences, sequenceAlphabet);
> >     Alphabet siteAlphabet =
> >             AlphabetManager.getCrossProductAlphabet(siteAlphabetList);
> >
> >     int length = 7;
> >     List stateAlphabetList = Collections.nCopies(length, siteAlphabet);
> >     Alphabet stateAlphabet =
> >             AlphabetManager.getCrossProductAlphabet(stateAlphabetList);
> >
> >     AlphabetIndex alphabetIndex =
> >             AlphabetManager.getAlphabetIndex((FiniteAlphabet) stateAlphabet);
> >     AtomicSymbol sym = (AtomicSymbol) alphabetIndex.symbolForIndex(3);
> >     List symList = sym.getSymbols();
> >     log.info("sym (index=3) is " + sym);
> >     log.info("sym is composed of:");
> >     Iterator symIter = symList.iterator();
> >     while (symIter.hasNext()) {
> >         log.info(symIter.next());
> >     }
> > }
>


