[Biojava-l] diploid alphabet

Doug Passey dpassey@u.washington.edu
Mon, 21 Oct 2002 09:20:40 -0700


hi all,
we are faced with the problem of representing heterozygous indels in diploid
resequenced data.  normal heterozygotes (SNPs) in a diploid sequence can be
represented with the various ambiguity symbols, but in my cursory look at
the symbol/alphabet stuff in the biojava API docs, i did not see any way of
representing ambiguities of the form: A/-, C/-, G/-, or T/- ... which are
the four forms of a single-base heterozygous indel in diploid data.  is
somebody working on this, and if not, does someone have a suggestion about how
to add this to the whole alphabet/symbol scheme of biojava?  i am a relative
novice at biojava; so if i have to implement this, i might need a little
guidance to make sure that it is implemented in the correct way.
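
to make the question concrete, here is a rough sketch in plain java of the
kind of symbol set being asked about: each diploid symbol is an unordered
pair of alleles drawn from A, C, G, T and the gap "-".  it deliberately uses
no biojava classes, and the names DiploidSymbol, of, isHeterozygousIndel and
alphabet are hypothetical, made up for illustration only; how this would map
onto the real alphabet/symbol scheme is exactly the open question.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: one diploid position as an unordered pair of alleles. */
final class DiploidSymbol {
    // The five haploid states: the four bases plus '-' for a deleted base.
    // The gap is listed last so that it always ends up as the second allele.
    private static final String ALLELES = "ACGT-";

    private final char first;   // allele that comes earlier in ALLELES
    private final char second;  // allele that comes later (or is the same)

    private DiploidSymbol(char first, char second) {
        this.first = first;
        this.second = second;
    }

    /** Builds a symbol from two alleles; order is ignored, so A/- equals -/A. */
    static DiploidSymbol of(char a, char b) {
        int i = ALLELES.indexOf(a);
        int j = ALLELES.indexOf(b);
        if (i < 0 || j < 0) {
            throw new IllegalArgumentException("unknown allele: " + a + "/" + b);
        }
        return i <= j ? new DiploidSymbol(a, b) : new DiploidSymbol(b, a);
    }

    /** True for A/-, C/-, G/- and T/-: one base on one chromosome, a gap on the other. */
    boolean isHeterozygousIndel() {
        return first != '-' && second == '-';
    }

    /** Enumerates every unordered pair over the five haploid states. */
    static List<DiploidSymbol> alphabet() {
        List<DiploidSymbol> symbols = new ArrayList<>();
        for (int i = 0; i < ALLELES.length(); i++) {
            for (int j = i; j < ALLELES.length(); j++) {
                symbols.add(new DiploidSymbol(ALLELES.charAt(i), ALLELES.charAt(j)));
            }
        }
        return symbols;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof DiploidSymbol)) {
            return false;
        }
        DiploidSymbol that = (DiploidSymbol) other;
        return first == that.first && second == that.second;
    }

    @Override
    public int hashCode() {
        return 31 * first + second;
    }

    @Override
    public String toString() {
        return first + "/" + second;
    }
}
```

in this sketch the homozygous cases (A/A, -/-) and the ordinary SNP
heterozygotes (A/C, G/T, ...) live in the same set as the four indel forms,
so one alphabet covers all of the diploid states at a single position.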
doug