Bioperl: Start of alignment debate...
Ewan Birney
birney@sanger.ac.uk
Mon, 25 Jan 1999 15:54:52 +0000 (GMT)
On Mon, 25 Jan 1999, David J. States wrote:
> Actually, blocks style alignments are a case where Ewan's notion of "region
> to region" alignment would be quite useful. The intervening sequences
> between a pair of blocks are regions that are aligned with each other; it
> is just not clear exactly how the residues in this region on one sequence
> are matched to residues in the corresponding region on another
> sequence.
Yes. I agree. In fact there are a number of quite useful features of a
region-to-region alignment concept (e.g., aligning the positions of repeats
but not the repeats themselves - quite a curious alignment concept).
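To make the region-to-region idea concrete, here is a minimal sketch in Python. The class name RegionPair, its fields, and the half-open coordinate convention are all assumptions made for illustration; nothing here is an existing Bioperl interface. Each unit pairs an interval on one sequence with an interval on the other, and only some units claim a residue-by-residue correspondence.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class RegionPair:
    """One aligned unit: a half-open interval on sequence A and one on B.

    Hypothetical structure for illustration only.
    """
    a_start: int
    a_end: int
    b_start: int
    b_end: int
    residue_level: bool  # True if residues pair off one-to-one inside the region

    def __post_init__(self) -> None:
        # An ungapped, residue-level region must cover the same length on both sides.
        if self.residue_level and (self.a_end - self.a_start) != (self.b_end - self.b_start):
            raise ValueError("residue-level regions must have equal length")

    def residue_pairs(self) -> Optional[List[Tuple[int, int]]]:
        """Return (a, b) residue index pairs, or None if the pairing is unknown."""
        if not self.residue_level:
            return None
        return list(zip(range(self.a_start, self.a_end),
                        range(self.b_start, self.b_end)))


# Two ungapped blocks with an intervening region: the middle region is
# aligned as a whole, but no residue-to-residue pairing is claimed for it.
alignment = [
    RegionPair(0, 5, 2, 7, residue_level=True),
    RegionPair(5, 12, 7, 10, residue_level=False),
    RegionPair(12, 16, 10, 14, residue_level=True),
]
```

In this sketch a blocks-style alignment is just the list with the residue_level=False entries dropped, so code that only wants the blocks can filter on that flag.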
>
> But this does raise a significant issue. An alignment is a mapping from
> one sequence to another sequence. To preserve the use of dynamic
> programming, there is an additional constraint that this mapping preserve
> the linear ordering of residues. However, this constraint makes it
> impossible to represent some biological relationships such as domain order
> swaps and domain duplication. An alignment data structure that makes it
> possible to represent these nonlinear mappings will make it harder to use
> the module in straightforward alignment contexts because you will always
> have to ask whether this is just a one-to-one mapping or something funky.
> Exceptional cases do exist in biology, but 99% of cases don't require the
> exception. Again, my vote is to keep it simple.
Obviously you are absolutely correct in saying that dynamic programming
(DP) will only make colinear alignments. However, I don't see a huge
need for mapping alignments back to DP paths. In other words, if there
were an alignment data structure which did not guarantee colinearity, what
functionality would you really lose (of course, you could keep using DP
for constructing them)?
Non-colinear alignments are
- relatively easy to generate
- relatively common in biology (especially when you get into
  DNA rearrangements).
I am a big believer in "simple is better", but I don't think banning
non-colinear alignments gives you any great wins.
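As a rough illustration of why the cost looks small to me: if an alignment is stored as a list of aligned blocks, the "is this just a plain mapping or something funky" question becomes one cheap check. The sketch below is in Python; the tuple layout, the half-open coordinates and the function name is_colinear are assumptions for the example, not a proposed Bioperl interface.

```python
from typing import List, Tuple

# Hypothetical block layout: (a_start, a_end, b_start, b_end), half-open intervals.
Block = Tuple[int, int, int, int]


def is_colinear(blocks: List[Block]) -> bool:
    """True if the blocks, taken in order along sequence A, are also in
    order and non-overlapping along sequence B."""
    ordered = sorted(blocks)
    for prev, curr in zip(ordered, ordered[1:]):
        # Overlap on A, or going backwards / overlapping on B, breaks colinearity.
        if curr[0] < prev[1] or curr[2] < prev[3]:
            return False
    return True


plain = [(0, 50, 0, 50), (60, 100, 55, 95)]
swapped = [(0, 50, 60, 110), (60, 100, 0, 40)]    # domain order swap
duplicated = [(0, 50, 0, 50), (0, 50, 70, 120)]   # one domain in A matches two in B

print(is_colinear(plain), is_colinear(swapped), is_colinear(duplicated))
```

Code that needs a DP-style path can call the check once and refuse anything else; code that does not care never has to ask.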
>
> David
>
> ----
> David J. States, M.D., Ph.D.
> Associate Professor and Director
> Institute for Biomedical Computing
> Washington University in St. Louis
> 700 S. Euclid Ave.
> St. Louis, MO 63110
>
> tel: 314 362 2134
> fax: 314 362 0234
> email: states@ibc.wustl.edu
>
> -----Original Message-----
> From: Rubin Eitan [SMTP:bcrubin@dapsas1.weizmann.ac.il]
> Sent: Monday, January 25, 1999 2:35 AM
> To: David J. States
> Cc: vsns-bcd-perl@lists.uni-bielefeld.de; 'Ewan Birney'
> Subject: RE: Bioperl: Start of alignment debate...
>
> Just a quick thought - you should consider the blocks type of MSA, i.e.
> where some regions are aligned (w/o gaps) and others are simply
> disregarded. An alignment using blocks would actually be a collection of
> alignments, but may still be used as a single alignment for purposes such
> as tree construction.
>
> Eitan.
>
>
> ======================================================================
> Eitan Rubin,
> Plant Genetics, Weizmann Inst of Science, Rehovot, Israel.
> EMail: bcrubin@dapsas1.weizmann.ac.il
> Tel: (00972)-(8)9342421 Fax: (00972)-(8)9344181
> EitanR@BioMOO (http://bioinfo.weizmann.ac.il/BioMOO) - visit
> the GCG help desk
>
> in vivo -> in vitro -> in silico
> ======================================================================
>
> =========== Bioperl Project Mailing List Message Footer =======
> Project URL: http://bio.perl.org/
> For info about how to (un)subscribe, where messages are archived, etc:
> http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
> ====================================================================
>
Ewan Birney
<birney@sanger.ac.uk>
http://www.sanger.ac.uk/Users/birney/