[Biojava-l] Generalized HMM in biojava?
wendy wong
wendy.wong at gmail.com
Fri Jan 20 17:11:58 EST 2006
Thanks for your help!
> It's not the alphabet that will kill you, but the number of parameters you are
> estimating. Indeed, BioJava should be able to handle alphabets with more than
> 2^32 symbols quite happily. There's an implementation of cross-product
> alphabet designed especially for this case.
What I am trying to do is develop a phylogenetic HMM. Say there are 3
sequences in the alignment; that means each site consists of 3
symbols, and if it is a generalized HMM, each state spans several
sites, say 7. I wrote a test program to see if it works (I just want
to see if I can factorize a symbol in the state alphabet). When the
number of sites in the state is 5, it works, but when the number of
sites in the state is 7, I get a
java.lang.ArrayIndexOutOfBoundsException (code attached).
Is it because I am not using the alphabet efficiently?
Again, thanks very much for helping!
Wendy
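// Imports needed by the BioJava calls below. MarshalException,
// ValidationException and log are not defined in this excerpt.
import java.io.IOException;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import org.biojava.bio.seq.DNATools;
import org.biojava.bio.symbol.Alphabet;
import org.biojava.bio.symbol.AlphabetIndex;
import org.biojava.bio.symbol.AlphabetManager;
import org.biojava.bio.symbol.AtomicSymbol;
import org.biojava.bio.symbol.FiniteAlphabet;

public static void main(String[] args)
        throws MarshalException, ValidationException, IOException {
    Alphabet sequenceAlphabet = DNATools.getDNA();
    Set alphabetSet =
            AlphabetManager.getAllSymbols((FiniteAlphabet) sequenceAlphabet);

    // Site alphabet: one DNA symbol per aligned sequence.
    int no_sequences = 3;
    List siteAlphabetList = Collections.nCopies(no_sequences, sequenceAlphabet);
    Alphabet siteAlphabet =
            AlphabetManager.getCrossProductAlphabet(siteAlphabetList);

    // State alphabet: `length` consecutive sites. 5 works, 7 fails.
    int length = 7;
    List stateAlphabetList = Collections.nCopies(length, siteAlphabet);
    Alphabet stateAlphabet =
            AlphabetManager.getCrossProductAlphabet(stateAlphabetList);

    // Look up one state symbol by index and list its component site symbols.
    AlphabetIndex alphabetIndex =
            AlphabetManager.getAlphabetIndex((FiniteAlphabet) stateAlphabet);
    AtomicSymbol sym = (AtomicSymbol) alphabetIndex.symbolForIndex(3);
    List symList = sym.getSymbols();
    log.info("sym (index=3) is " + sym);
    log.info("sym is composed of:");
    Iterator symIter = symList.iterator();
    while (symIter.hasNext()) {
        log.info(symIter.next());
    }
}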
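
A back-of-the-envelope size check may help here. The sketch below is
plain Java with no BioJava calls; the class name StateAlphabetSize and
its helper methods are made up for illustration. It counts the symbols
in the state alphabet for length 5 and length 7: the site alphabet has
4^3 = 64 symbols, so length 5 gives 64^5 = 2^30 symbols, which still
fits in a Java int, while length 7 gives 64^7 = 2^42, which does not.
Whether this is what actually triggers the
ArrayIndexOutOfBoundsException inside BioJava is an assumption, not
something confirmed here. The decode helper shows one way a symbol
index could be split into per-site indices arithmetically, under an
assumed most-significant-site-first ordering that may not match
BioJava's own.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical helper class; nothing here is part of the BioJava API.
final class StateAlphabetSize {

    // Number of symbols in a cross product of `copies` alphabets,
    // each holding `base` symbols.
    static BigInteger crossProductSize(int base, int copies) {
        return BigInteger.valueOf(base).pow(copies);
    }

    // True if the size itself is representable as a Java int,
    // for example as an array length or an int-based symbol index range.
    static boolean fitsInInt(BigInteger size) {
        return size.compareTo(BigInteger.valueOf(Integer.MAX_VALUE)) <= 0;
    }

    // Mixed-radix decoding: split `index` into `digits` component
    // indices, each in [0, radix). The first array slot holds the most
    // significant component (an assumed ordering, for illustration only).
    static int[] decode(long index, int radix, int digits) {
        int[] out = new int[digits];
        for (int i = digits - 1; i >= 0; i--) {
            out[i] = (int) (index % radix);
            index /= radix;
        }
        return out;
    }

    public static void main(String[] args) {
        // 4 nucleotides, 3 aligned sequences per site.
        int siteSize = crossProductSize(4, 3).intValueExact();
        for (int length : new int[] {5, 7}) {
            BigInteger stateSize = crossProductSize(siteSize, length);
            System.out.println("length=" + length
                    + " stateSize=" + stateSize
                    + " fitsInInt=" + fitsInInt(stateSize));
        }
        // Component site indices of state symbol 3 when length is 7.
        System.out.println(Arrays.toString(decode(3L, siteSize, 7)));
    }
}
```

Working with component indices this way avoids enumerating the state
alphabet at all, which may be worth considering if only factorization
of individual symbols is needed.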