Bioperl: Start of alignment debate...
David J. States
"states@ibc.wustl.edu" <states@gpc.ibc.wustl.edu>
Fri, 22 Jan 1999 12:01:06 -0600
A couple of thoughts on the alignment discussion (and referring to Ewan's
overview http://bio.perl.org/Projects/SeqAlign/overview.html),
1) the object should be able to handle both pairwise alignments and
multiple alignments seamlessly (I think this is part of Ewan's overview,
but it is not stated explicitly).
2) many multiple alignments are associated with an evolutionary or other
tree structure, and the scoring of the alignment cannot be understood
without reference to this tree. It is therefore important that the
alignment object be able to represent a hierarchical view of a multiple
sequence set. In some cases this may be rooted, in others not. Hierarchies
can be handled as alignments of alignments, but it is important to retain
things like edge lengths.
3) not all alignments are equally confident at all locations. The
alignment object should be able to represent confidence measures for both
pairwise and multiple alignments.
4) the alignment object should be able to output a normalized
representation of the alignment, even if the alignment itself is composed
of one or more alignment objects in addition to sequence data, and
independent of whatever internal data representation is used.
5) Ewan includes the goal "Alignments must be able to handle more than one
residue aligned to (potentially more than one residue) in another
sequence". If this means simply that a region in one sequence is aligned
to a region in another sequence, OK. But if you mean a dotplot, that seems
to me to be a different object altogether. One of the fundamental uses of
alignment is to map from one sequence to another, and a dotplot does not
allow you to do this without first extracting an alignment.
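To make points 2 through 5 concrete, here is a minimal sketch in Python
(chosen only for brevity; nothing here is an existing bioperl interface).
All names (Leaf, Alignment, rows, map_position) are hypothetical. It
assumes every row is already stored in the column coordinates of the
top-level alignment, which sidesteps the harder problem of merging
sub-alignments.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

GAP = "-"


@dataclass
class Leaf:
    """One gapped sequence row (hypothetical name)."""
    name: str
    gapped: str


@dataclass
class Alignment:
    """An alignment whose children are rows or nested alignments."""
    children: List[Union[Leaf, "Alignment"]]
    edge_lengths: List[float]                 # one per child (point 2)
    confidence: Optional[List[float]] = None  # one per column (point 3)

    def __post_init__(self) -> None:
        if len(self.edge_lengths) != len(self.children):
            raise ValueError("need one edge length per child")
        widths = {len(gapped) for _, gapped in self.rows()}
        if len(widths) != 1:
            raise ValueError("all rows must have the same number of columns")
        if self.confidence is not None and len(self.confidence) != widths.pop():
            raise ValueError("need one confidence value per column")

    def rows(self) -> List[Tuple[str, str]]:
        """Normalized view (point 4): flat rows, independent of nesting."""
        out: List[Tuple[str, str]] = []
        for child in self.children:
            if isinstance(child, Leaf):
                out.append((child.name, child.gapped))
            else:
                out.extend(child.rows())
        return out


def map_position(src: str, dst: str, pos: int) -> Optional[int]:
    """Map 0-based ungapped index `pos` in `src` to the ungapped index in
    `dst` (point 5). Returns None when the residue is aligned to a gap."""
    i = j = 0  # ungapped positions consumed so far in src and dst
    for a, b in zip(src, dst):
        if a != GAP and i == pos:
            return j if b != GAP else None
        if a != GAP:
            i += 1
        if b != GAP:
            j += 1
    raise IndexError("pos is beyond the end of the source sequence")


# A pairwise alignment nested inside a three-sequence alignment.
inner = Alignment([Leaf("s1", "AC-GT"), Leaf("s2", "A-TGT")],
                  edge_lengths=[0.1, 0.2])
top = Alignment([inner, Leaf("s3", "ACTG-")],
                edge_lengths=[0.05, 0.4],
                confidence=[0.9, 0.5, 0.4, 0.9, 0.6])
```

Here top.rows() yields the three rows in order regardless of the nesting,
while the edge lengths stay attached to the node that owns them. A dotplot
has no equivalent of map_position, since one position may pair with many.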
I guess that I have to come down against putting too much effort into
editability for a couple of pragmatic reasons. The first is that
implementing a static alignment object is going to be more than enough
work. Second, there are some issues that arise if you edit an alignment
that itself is part of another alignment. While you might devise rules
that would allow you to propagate changes up and down the hierarchy, the
changes themselves might affect the way that the hierarchy was constructed.
Thus editing one component of an alignment might invalidate the alignment
as a whole. Finally, I don't see that much application for editing as
opposed to constructing alignments. In almost all computational
applications an edit is really a regeneration of the alignment. I guess
there are groups that
maintain hand alignments used in phylogeny, but they have already developed
multiple alignment editing tools so I don't think bioperl needs to support
this application.
David
----
David J. States, M.D., Ph.D.
Associate Professor and Director
Institute for Biomedical Computing
Washington University in St. Louis
700 S. Euclid Ave.
St. Louis, MO 63110
tel: 314 362 2134
fax: 314 362 0234
email: states@ibc.wustl.edu