Naming the modules; Mailing lists
Georg Fuellen
fuellen@dali.Mathematik.Uni-Bielefeld.DE
Fri, 21 Feb 1997 12:12:00 +0000 (GMT)
Hi SteveB,
you wrote,
> [...]
> GF> I even consider Bio::Aln to be sufficiently general to
> GF> process alignments of numeric data, linguistic data, etc !
> Indeed, right now it can hold any sort of data -- but that is perhaps a
> weakness! This is a bioperl object and it should have features for
> supporting biological sequences. Right now, the object lacks support
> for many types of operations that people would want to do on proteins.
OK, let's call the module Bio::UnivAln, for its universality. :-)
And let's call the module you're envisioning Bio::Seq::ProtAln.
> [...] As noted above, I suggest Bio::Seq::NucAln and Bio::Seq::ProtAln, and
> -- if possible -- the two are merged into Bio::Seq::Aln at some time in
> the future. I don't see why you want to preempt future improvements.
It seems that protein researchers want to have fast access to data
related to the protein sequences; in the current Bio::Aln design, this
can be done by storing stuff in $self->{'names'}{'seqs'}[$seq_index]
(this is currently done for the ID and description of the individual
sequences), or in a separate column/row (maybe there are even better ways -
we need to brainstorm about this). (It's still unclear to me why the current
design makes your needs difficult to accomplish; but let's assume it does.)
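A minimal sketch of the first option, storing extra per-sequence data next to
the ID and description. Only the path $self->{'names'}{'seqs'}[$seq_index]
comes from the text above; the hash layout, the keys 'id', 'desc' and
'features', and the sample values are assumptions made for illustration, not
the actual Bio::Aln internals.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an alignment object. Only the
# {'names'}{'seqs'}[$seq_index] path is taken from the message;
# everything else is an assumed layout.
my $self = {
    seqs  => [ 'MKV-LA', 'MKVELA' ],    # aligned rows (assumed)
    names => {
        seqs => [
            { id => 'P1', desc => 'example protein 1' },
            { id => 'P2', desc => 'example protein 2' },
        ],
    },
};

# Attach additional data for one sequence beside its ID and description.
# The key 'features' is invented for this sketch.
my $seq_index = 1;
$self->{'names'}{'seqs'}[$seq_index]{'features'} = { active_site => 4 };

# Retrieval is a plain array/hash look-up, so access stays fast.
my $entry = $self->{'names'}{'seqs'}[$seq_index];
print "$entry->{id}: $entry->{desc}, ",
      "active site at column $entry->{features}{active_site}\n";
```

Whether this is convenient enough for protein work is exactly the open
question; the sketch only shows that the slot can hold arbitrary data.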
For phylogeny calculations, you need fast and flexible access to the
individual columns/rows of the alignment, and you need lots of slicing and
mapping of evaluation functions onto these columns/rows.
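To illustrate what "slicing and mapping evaluation functions onto columns"
could look like, here is a small sketch in plain Perl. The row-of-strings
representation and the helper names column() and n_distinct() are
hypothetical; they are not part of Bio::Aln or Bio::UnivAln.

```perl
use strict;
use warnings;

# Toy alignment as a list of equal-length rows (assumed representation).
my @rows = ( 'ACGT', 'ACGA', 'AC-T' );

# Return column $i of the alignment as a list of characters.
sub column {
    my ($rows, $i) = @_;
    return map { substr($_, $i, 1) } @$rows;
}

# Hypothetical evaluation function: number of distinct non-gap characters.
sub n_distinct {
    my %seen = map { ($_ => 1) } grep { $_ ne '-' } @_;
    return scalar keys %seen;
}

# Map the evaluation function onto every column.
my $width  = length $rows[0];
my @scores = map { n_distinct( column(\@rows, $_) ) } 0 .. $width - 1;

# Slice: keep only the indices of columns with more than one character state.
my @variable = grep { $scores[$_] > 1 } 0 .. $#scores;

print "scores: @scores\n";
print "variable columns: @variable\n";
```

Any other per-column function can be substituted for n_distinct() without
touching the slicing code.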
> Storing it as the raw sequence (without clear reference to the original
> sequence and attachments) is the problem. Suffice it to say that to get
> things to work for many protein operations, there would need to be
> relators for virtually every operation. This would be complicated and
> inefficient.
Maybe the fact that Bio::UnivAln can hold any sort of data can be put to
use here? I mean, you can perhaps put the additional data into the
zeroth column / zeroth row?!
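A rough sketch of the zeroth row / zeroth column idea, assuming the alignment
is held as a two-dimensional array. The layout, the labels ('seqA', 'site1',
...) and the variable names are invented for illustration and say nothing
about how Bio::UnivAln actually stores its data.

```perl
use strict;
use warnings;

# Assumed layout: row 0 holds per-column data, column 0 holds per-row data.
my @aln = (
    [ undef,  'site1', 'site2', 'site3' ],    # zeroth row: column annotation
    [ 'seqA', 'M',     'K',     'V'     ],    # zeroth column: row annotation
    [ 'seqB', 'M',     'R',     'V'     ],
);

# Residues only: skip the zeroth row and the zeroth column.
my @residues = map { [ @{$_}[ 1 .. $#{$_} ] ] } @aln[ 1 .. $#aln ];

# Look up the annotation belonging to one cell of the alignment.
my ($r, $c) = (2, 2);
print "$aln[$r][0] has $aln[$r][$c] at $aln[0][$c]\n";
```

The price of this layout is that every slicing operation has to remember the
offset of one.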
> I think that anything relying on PerlDL should be a 'special feature' and
> not built into the core of the "basic" alignment module. This is because
> PerlDL is non-trivial to install, which will prevent its use by a large
> fraction of potential bioperl users. That said, I would of course
> heartily endorse development of modules using PerlDL if that does prove to
> be more efficient and effective.
Agreed; I hope PerlDL will be easy to install in a year or so.
(I've spent many hours trying to install it, with limited success,
so I can relate to this :-) )
best wishes,
georg