[Bioperl-l] clustering algorithms in BioPerl
Frank Gibbons
francis_gibbons@hms.harvard.edu
Fri, 23 Feb 2001 11:42:56 -0500
Hi,
I've been lurking for about a month. I've checked out the BioPerl homepage,
including the list of projects. I notice that the bias is heavily towards
sequence analysis (naturally).
Right now I'm working on implementing a few clustering algorithms (priority #3
on the list of projects) in Perl, for use with DNA microarray data (priority
#4 on the list!). The algorithms themselves are quite general, and have been
around for a while, but I can find few references to implementations of them
in Perl. (I have seen mention of Jong Park's Geanfammer package, as a possible
source for clustering, but as far as I can see he implements only
single-linkage clustering there.) I think they would be quite useful to the
Perl community as a whole, and I would like to write them in as generic a way
as possible, which is why I'm writing to the list now, having implemented only
one particular algorithm, before I write any more!
So, my questions are:
* Is this appropriate for BioPerl in the first place? Would it be more
suitable for CPAN? The algorithms are general, but my focus is on
bioinformatics.
* If so, does anyone know of other work in this area that I could build on
or integrate with?
* Do you have any suggestions? I'm thinking in terms of:
  - Naming schemes
  - Particular algorithms which should be implemented as a priority
  - Other possible applications which I should keep in mind
  - Pitfalls I should look out for
Thanks for your input,
Frank Gibbons
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
PhD, Computational Biologist, Harvard Medical School
Dept of Biological Chemistry and Molecular Pharmacology
240 Longwood Avenue, C-125, Boston, MA 02115