Bioperl: Alignment proposal - rough draft

Ewan Birney birney@sanger.ac.uk
Mon, 25 Jan 1999 18:39:52 +0000 (GMT)


The alignment debate has been very interesting and
it has stimulated a lot of thought in my own head. Partly
to get this down on paper and partly to keep the momentum
up, I have started a firmer 'proposal' about what the
alignment objects (coordinated objects in fact) should look
like. It is at

http://bio.perl.org/Projects/SeqAlign/proposal.html

and just so that people can read about it here, I have appended
the text form of this page at the end.

Notice that

	a) I don't think this is finished at all (I haven't
fleshed out the interfaces at all).

	b) It is all subject to change

Please feel free to comment, criticise and generally change. I am
quite keen to get one or two more people actively involved in the
coding/documenting aspect, both because it saves me time and, more
importantly, because with someone else reviewing the design and code
there is a much better chance of it being useful rather than a number
of 'nice ideas Ewan had in January 1999'. See also my next mail...

Anyway - please criticise....



The proposal is as follows

* We specify a small hierarchy of interfaces to alignment objects, being something like

Bio::Align::AbstractBasic    # basic alignment functions
Bio::Align::AbstractComplex  # more involved functions, such as DNA to DNA alignments

Bio::Align::AbstractComplex inherits from Bio::Align::AbstractBasic

* We specify a number of other interfaces used as mix-ins to Alignment objects, in particular

Bio::AbstractTree  # abstract representation of the tree

* We suggest that a number of different alignment implementations can be provided that satisfy the
Abstract classes above (starting with SimpleAlign and UnivAlign). Alignment implementations are
free to offer their own useful additions but functions which want to be portable between different
implementations should stick to the Abstract interfaces. Implementations that provide
Bio::Align::AbstractComplex must also provide AbstractBasic. Implementations are free as to whether
they use Bio::AbstractTree or not.
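
As a rough illustration of the interface/implementation split, here is a minimal sketch. Only the two
Abstract package names and the name() method come from the proposal; the throw() helper, the
no_sequences() method, the My::SimpleAlign class and the describe_alignment() function are all
assumptions made up for this example, not a settled design.

```perl
use strict;
use warnings;

{
    # Abstract interface: every method throws unless a subclass overrides it.
    package Bio::Align::AbstractBasic;

    # Assumed helper; a real implementation would inherit its own throw().
    sub throw { my ($self, $msg) = @_; die "$msg\n" }

    sub name {
        my ($self) = @_;
        $self->throw("Abstract method name() not implemented by " . ref($self));
    }

    # Hypothetical method: number of sequences in the alignment.
    sub no_sequences {
        my ($self) = @_;
        $self->throw("Abstract method no_sequences() not implemented by " . ref($self));
    }
}

{
    # The complex interface inherits from the basic one, as proposed.
    package Bio::Align::AbstractComplex;
    our @ISA = ('Bio::Align::AbstractBasic');
}

{
    # Hypothetical implementation that satisfies only the basic interface.
    package My::SimpleAlign;
    our @ISA = ('Bio::Align::AbstractBasic');

    sub new {
        my ($class, %args) = @_;
        return bless { name => $args{name}, seqs => $args{seqs} || [] }, $class;
    }

    sub name         { return $_[0]->{name} }
    sub no_sequences { return scalar @{ $_[0]->{seqs} } }
}

# Portable code: relies on the abstract interface only, so it should work
# with any implementation that satisfies Bio::Align::AbstractBasic.
sub describe_alignment {
    my ($aln) = @_;
    die "not an alignment\n" unless $aln->isa('Bio::Align::AbstractBasic');
    my $name = defined $aln->name ? $aln->name : '(unnamed)';
    return sprintf "%s: %d sequences", $name, $aln->no_sequences;
}

my $aln = My::SimpleAlign->new(name => 'globins', seqs => ['ACGT', 'AC-T']);
print describe_alignment($aln), "\n";
```

Because describe_alignment() never touches the internals of My::SimpleAlign, a C/XS based
implementation could in principle be swapped in without changing it.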

The split between interface and implementation is important so that we can have different but
cooperating implementations out there and also so we can flip over to a C/XS based one if need be
(I think eventually it will be useful, if only because of memory constraints).

* The output IO is placed into a different object, preferably one for each format (so that we can
support 100 formats without loading up 100 different functions when we use a module). It would be

Bio::Align::IO::Abstract.pm # provides a template for what each alignment IO factory should make
Bio::Align::IO::Clustalw.pm # clustalw aln format

So that the same format IO system can potentially initialise different implementations, we have to
have construction rules from strings or another system (probably time to reach for design patterns
and have a good read).
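
One possible shape for "construction rules from strings" is sketched below. Everything in it is
hypothetical: the My::SimpleAlign stand-in, the My::Align::IO::Simple reader (which parses a toy
"id sequence" line format, not real clustalw aln files), the %io_class_for registry and the
make_io() function are assumptions for illustration only.

```perl
use strict;
use warnings;

{
    # Stand-in alignment implementation; the real interface is not yet defined.
    package My::SimpleAlign;
    sub new     { my ($class) = @_; return bless { seqs => [] }, $class }
    sub add_seq { my ($self, $id, $str) = @_; push @{ $self->{seqs} }, [$id, $str]; return }
    sub seqs    { return @{ $_[0]->{seqs} } }
}

{
    # Hypothetical IO object for a toy format with one "id sequence" pair per line.
    package My::Align::IO::Simple;

    sub new {
        my ($class, %args) = @_;
        return bless { align_class => $args{align_class} }, $class;
    }

    # Builds an alignment of whatever class was named at construction time.
    sub read_string {
        my ($self, $text) = @_;
        my $aln = $self->{align_class}->new;
        for my $line (split /\n/, $text) {
            next unless $line =~ /^(\S+)\s+(\S+)$/;
            $aln->add_seq($1, $2);
        }
        return $aln;
    }
}

# Registry mapping format names (plain strings) to IO classes.
my %io_class_for = (simple => 'My::Align::IO::Simple');

# Both the format and the alignment implementation are chosen by string.
sub make_io {
    my ($format, $align_class) = @_;
    my $io_class = $io_class_for{ lc $format }
        or die "Unknown alignment format '$format'\n";
    $align_class->can('new')
        or die "Alignment class '$align_class' is not loaded\n";
    return $io_class->new(align_class => $align_class);
}

my $io   = make_io('simple', 'My::SimpleAlign');
my $aln  = $io->read_string("seq1 ACGT\nseq2 AC-T\n");
my @seqs = $aln->seqs;
printf "%s read %d sequences\n", ref($aln), scalar @seqs;
```

The IO class never names an alignment implementation itself, so the same reader could fill a
SimpleAlign, a UnivAlign or something else, provided they agree on how sequences are added.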

* We have a test suite system which implementations can register with so that we test all
implementations against the abstract interfaces.
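
To make the registration idea a little more concrete, here is a minimal sketch. The registry
functions register_implementation() and run_interface_tests(), and the two My::* classes, are
invented for this example; the only check performed is that name() has been overridden, which is
far less than a real test suite would do.

```perl
use strict;
use warnings;

{
    package Bio::Align::AbstractBasic;
    sub throw { my ($self, $msg) = @_; die "$msg\n" }    # assumed helper
    sub name  { $_[0]->throw("Abstract method name() in Bio::Align::AbstractBasic") }
}

{
    # Hypothetical implementation that overrides name().
    package My::GoodAlign;
    our @ISA = ('Bio::Align::AbstractBasic');
    sub new  { return bless { name => 'test' }, $_[0] }
    sub name { return $_[0]->{name} }
}

{
    # Hypothetical implementation that forgot to override name().
    package My::BrokenAlign;
    our @ISA = ('Bio::Align::AbstractBasic');
    sub new { return bless {}, $_[0] }
}

# Implementations register themselves by class name.
my @registered;
sub register_implementation { push @registered, @_; return }

# Run the same interface check against every registered implementation.
# A class passes if calling name() does not hit the abstract exception.
sub run_interface_tests {
    my %result;
    for my $class (@registered) {
        my $obj = $class->new;
        $result{$class} = eval { $obj->name; 1 } ? 1 : 0;
    }
    return %result;
}

register_implementation('My::GoodAlign', 'My::BrokenAlign');
my %result = run_interface_tests();
for my $class (sort keys %result) {
    printf "%-16s %s\n", $class, $result{$class} ? 'ok' : 'not ok';
}
```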

The abstract interfaces are normal subroutine definitions which throw an exception, such as

=head2 name

Title   : name
Usage   : $name = $self->name()
Function:
Returns : string of the name of the alignment (undef if none).
Args    : none

=cut

sub name {
    my ($self, @args) = @_;

    # Subclasses must override this method; reaching here is an error.
    $self->throw("Abstract method, Bio::Align::AbstractBasic " .
                 "- should be filled by subclass so this is an " .
                 "error in your implementation");
}


Ewan Birney
<birney@sanger.ac.uk>
http://www.sanger.ac.uk/Users/birney/

=========== Bioperl Project Mailing List Message Footer =======
Project URL: http://bio.perl.org/
For info about how to (un)subscribe, where messages are archived, etc:
http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
====================================================================