[BioRuby] Alignment plugin
biopython at maubp.freeserve.co.uk
Mon Apr 26 17:56:26 UTC 2010
On Mon, Apr 26, 2010 at 6:25 PM, Rutger Vos <rutgeraldo at gmail.com> wrote:
> What you describe below is not what I meant, though it's also very
> important w.r.t. preserving the provenance of annotations. We're
> thinking in a number of different directions and so the requirements
> are starting to creep in :-)
> What I meant is, by long-winded example, the following: imagine you're
> studying the phylogeny of lemurs, and you want to look at
> morphological and behavioral characters. Here's what a character state
> matrix might look like:
> D._madagascariensis - 0
> H._aureus 4 1
> H._simus 6 ?
> H._griseus ? 1
> The first column captures the number of teeth in the lower-jaw
> toothcomb. Some lemurs use the incisors of the lower jaw as a grooming
> apparatus, and they have (I believe) either 4 or 6 teeth in that
> "comb". D._madagascariensis does not have this apparatus at all, so
> its state for this column could be coded as "-", conceptually a bit
> like a gap in an alignment, interpreted as "does not apply".
Or perhaps as a zero?
> To some extent, a matrix with such characters would be like an
> alignment, and in many cases you would analyze this data using the
> same tools for phylogenetic inference, like paup, phylip, mrbayes,
> etc. Also, the same data formats (nexus/nexml, phylip) describe both
> these matrices and alignments.
In these file formats, am I right in thinking the non-sequence-based
characters are all still encoded as single symbols (e.g. single
digits)? If so, that still allows the data to be held as simple strings.
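
To make that concrete, here is a minimal sketch in Python of the lemur
matrix held as plain strings. It assumes every state (including "-" and
"?") really is a single character; the names matrix and column are made
up for illustration and do not come from BioRuby, Biopython or any
other library.

```python
# Hypothetical sketch: one string per taxon, one character per column.
# Assumes each state ("4", "6", "0", "1", "-", "?") is a single character.
matrix = {
    "D._madagascariensis": "-0",
    "H._aureus": "41",
    "H._simus": "6?",
    "H._griseus": "?1",
}

# All rows must have the same length, just as in a sequence alignment.
assert len({len(states) for states in matrix.values()}) == 1


def column(matrix, index):
    """Return the states in one column, keyed by taxon name."""
    return {taxon: states[index] for taxon, states in matrix.items()}


# First column: number of teeth in the lower-jaw toothcomb.
toothcomb = column(matrix, 0)
```

If any character needed a multi-symbol state (say "10"), this would
break down and each row would need to be a list of tokens rather than
a string.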