[Bioperl-l] Re: Frameshifts in alignments ... ?
Matthew Pocock
matthew_pocock@yahoo.co.uk
Tue, 03 Sep 2002 15:07:32 +0100
Ewan Birney wrote:
> [snip]
> Remember that the "encoding" is as well as the bases, ie, one effectively
> has two "tracks", being
>
> CCCCCCCCCCCIIIIIIIIIIIIIIIIIIIIIIICCCCCGGGCCCC
> ATGGGTGTATGTATTGTGTAAAAAGAATGTTAAGGTTGT---GTET
Hi.
This is very similar to what the DP package in BioJava spits out. In our
model, each state in an HMM is also a Symbol in an Alphabet instance
(the alphabet of states for that model). When things are aligned to an
HMM, the result is an alignment object with one row for each input
sequence and one row for the state sequence. Since states extend symbol,
they get treated fairly transparently by the APIs. Also, we can use the
alphabet over doubles as another row - the per-column scores can be
added as just another row of info in the alignment. So, IMHO, treating
the state sequence as just another track of symbol information (possibly
with some magical row identifier) is a good thing.
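
To make the "state sequence is just another row" idea concrete, here is a
rough sketch in Python. It is not the BioJava API: the Alignment class, the
reserved labels STATE_ROW and SCORE_ROW, and the toy data are all made up
for illustration. It only assumes that every row has the same number of
columns and that a row can hold any kind of symbol (bases, states, or
doubles).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List

# Hypothetical reserved row labels (the "magical row identifiers").
STATE_ROW = "__states__"
SCORE_ROW = "__scores__"


@dataclass
class Alignment:
    """Toy alignment: a set of labelled rows of equal length."""
    rows: Dict[str, List[Any]] = field(default_factory=dict)

    @property
    def length(self) -> int:
        # Number of columns; 0 for an empty alignment.
        return len(next(iter(self.rows.values()))) if self.rows else 0

    def add_row(self, label: str, symbols: Iterable[Any]) -> None:
        # Any symbol type is accepted; only the column count is checked.
        symbols = list(symbols)
        if self.rows and len(symbols) != self.length:
            raise ValueError(
                f"row {label!r} has {len(symbols)} columns, "
                f"expected {self.length}"
            )
        self.rows[label] = symbols

    def column(self, i: int) -> Dict[str, Any]:
        # One symbol per row at column i, keyed by row label.
        return {label: row[i] for label, row in self.rows.items()}


# Toy data, invented for this sketch (C = coding, I = intron).
aln = Alignment()
aln.add_row("seq1", "ATGGTAAG")
aln.add_row(STATE_ROW, "CCCIIICC")
aln.add_row(SCORE_ROW, [0.9, 0.8, 0.9, 0.4, 0.5, 0.4, 0.7, 0.8])

print(aln.column(3))
```

Code that walks the columns never needs to know which rows are sequences
and which are annotation; it just looks rows up by label.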
Matthew