Bioperl: Start of alignment debate...

Fri, 22 Jan 1999 12:01:06 -0600

A couple of thoughts on the alignment discussion (and referring to Ewan's 
overview http://bio.perl.org/Projects/SeqAlign/overview.html),

1) the object should be able to handle both pairwise alignments and 
multiple alignments seamlessly (I think this is part of Ewan's overview, 
but it is not stated explicitly).

2) many multiple alignments are associated with an evolutionary or other 
tree structure, and the scoring of the alignment cannot be understood 
without reference to this tree.  It is therefore important that the 
alignment object be able to represent a hierarchical view of a multiple 
sequence set.  In some cases the tree may be rooted, in others not.  
Hierarchies can be handled as alignments of alignments, but it is 
important to retain things like edge lengths.

3) not all alignments are equally confident at all locations.  The 
alignment object should be able to represent confidence measures for both 
pairwise and multiple alignments.

4) the alignment object should be able to output a normalized 
representation of the alignment, even if the alignment itself is composed 
of one or more alignment objects in addition to sequence data, and 
independent of whatever internal data representation is used.

5) Ewan includes the goal "Alignments must be able to handle more than one 
residue aligned to (potentially more than one residue) in another 
sequence".  If this means simply that a region in one sequence is aligned 
to a region in another sequence, OK.  But if you mean a dotplot, that seems 
to me to be a different object altogether.  One of the fundamental uses of 
alignment is to map from one sequence to another, and a dotplot does not 
allow you to do this without first extracting an alignment.
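
To make points 2 through 5 a bit more concrete, here is a minimal sketch 
in Python (not bioperl code, and not a proposed interface).  Every name in 
it (Leaf, Alignment, rows, map_position) is hypothetical, and the data 
layout is just one assumption about how an alignment of alignments could 
carry edge lengths and per-column confidence while still producing a flat, 
normalized view and a sequence-to-sequence mapping.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

GAP = "-"


@dataclass
class Leaf:
    """A single ungapped sequence at a tip of the hierarchy (hypothetical)."""
    name: str
    residues: str

    def rows(self) -> List[Tuple[str, str]]:
        return [(self.name, self.residues)]


@dataclass
class Alignment:
    """An alignment whose children are sequences or other alignments.

    columns[i][j] is the column of child i placed at column j of this
    alignment, or None for a gap.  edge_lengths[i] is the branch length
    to child i; confidence[j] is an assumed per-column score in [0, 1].
    """
    children: list
    columns: List[List[Optional[int]]]
    edge_lengths: List[float]
    confidence: List[float]

    def rows(self) -> List[Tuple[str, str]]:
        """Normalized view: one gapped row per sequence, whatever the nesting."""
        out = []
        for child, cols in zip(self.children, self.columns):
            for name, text in child.rows():
                gapped = "".join(GAP if c is None else text[c] for c in cols)
                out.append((name, gapped))
        return out


def map_position(aln: Alignment, src: str, dst: str, pos: int) -> Optional[int]:
    """Map residue index pos in sequence src to its partner in dst, if any."""
    rows = dict(aln.rows())
    src_index = dst_index = -1
    for a, b in zip(rows[src], rows[dst]):
        if a != GAP:
            src_index += 1
        if b != GAP:
            dst_index += 1
        if a != GAP and src_index == pos:
            return dst_index if b != GAP else None
    return None


# A pairwise alignment, then a multiple alignment built on top of it.
pair = Alignment(
    children=[Leaf("seqA", "ACGT"), Leaf("seqB", "AGT")],
    columns=[[0, 1, 2, 3], [0, None, 1, 2]],
    edge_lengths=[0.1, 0.2],
    confidence=[1.0, 0.4, 0.9, 1.0],
)
top = Alignment(
    children=[pair, Leaf("seqC", "ACGGT")],
    columns=[[0, 1, 2, None, 3], [0, 1, 2, 3, 4]],
    edge_lengths=[0.3, 0.5],
    confidence=[1.0, 0.5, 0.8, 0.3, 1.0],
)

for name, text in top.rows():
    print(name, text)
```

In this sketch the same class serves for the pairwise and the multiple 
case, and the mapping from one sequence to another is read off the 
normalized rows rather than the internal representation.  A dotplot has 
no single row per sequence, so a function like map_position could not be 
written against it without first extracting an alignment.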

I guess that I have to come down against putting too much effort into 
editability, for a few pragmatic reasons.  First, implementing a static 
alignment object is going to be more than enough work.  Second, issues 
arise if you edit an alignment that is itself part of another alignment.  
While you might devise rules that would allow you to propagate changes up 
and down the hierarchy, the changes themselves might affect the way the 
hierarchy was constructed.  Thus editing one component of an alignment 
might invalidate the alignment as a whole.  Finally, I don't see much 
application for editing as opposed to constructing alignments.  In almost 
all computational applications an edit is really a regeneration of the 
alignment.  I guess there are groups that maintain hand alignments used in 
phylogeny, but they have already developed multiple alignment editing 
tools, so I don't think bioperl needs to support this application.

David

----
David J. States, M.D., Ph.D.
Associate Professor and Director
Institute for Biomedical Computing
Washington University in St. Louis
700 S. Euclid Ave.
St. Louis, MO   63110

tel: 314 362 2134
fax: 314 362 0234
email: states@ibc.wustl.edu

=========== Bioperl Project Mailing List Message Footer =======
Project URL: http://bio.perl.org/
For info about how to (un)subscribe, where messages are archived, etc:
http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
====================================================================