[Bioperl-l] Bio::Cluster::Family

Shawn shawnh@fugu-sg.org
30 Sep 2002 20:33:48 +0800


Hi folks,

	I'm currently working on a protein clustering pipeline and would like
to propose the following object for storing protein families. It is simple
and works for me right now. Proposed usage:


use Bio::Seq;
use Bio::Cluster::Family;

my $prot1 = Bio::Seq->new(-id       => "Q8W551",
                          -alphabet => "protein",
                          -species  => "homo sapiens",
                          -desc     => "POLYUBIQUITIN");
my $prot2 = Bio::Seq->new(-id       => "Q92332",
                          -alphabet => "protein",
                          -species  => "Mus musculus",
                          -desc     => "POLYUBIQUITIN");
my $prot3 = Bio::Seq->new(-id       => "Q94EA5",
                          -alphabet => "protein",
                          -species  => "Fugu rubripes",
                          -desc     => "H01 26 protein");

my $family = Bio::Cluster::Family->new(-family_id        => "TribeFamily",
                                       -members          => [$prot1, $prot2, $prot3],
                                       -description      => "POLYUBIQUITIN",
                                       -annotation_score => 60);

print $family->size."\n";
print $family->description."\n";
# the confidence with which TribeMCL assigned the consensus
# description to the family
print $family->annotation_score."\n";

my @members = $family->get_members_by_species("homo sapiens");


foreach my $seq ($family->members){
	print $seq->desc."\n";
}
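
For anyone who wants to try the interface without the module installed, here is a minimal sketch of how such a family container could behave. It is not the actual Bio::Cluster::Family code: the package names My::Member and My::Family are hypothetical stand-ins, and the case-insensitive species match is an assumption on my part.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence object (the proposal itself uses Bio::Seq).
package My::Member;

sub new {
    my ($class, %args) = @_;
    return bless {
        id      => $args{-id},
        species => $args{-species},
        desc    => $args{-desc},
    }, $class;
}
sub id      { $_[0]{id} }
sub species { $_[0]{species} }
sub desc    { $_[0]{desc} }

# Hypothetical sketch of the family container.
package My::Family;

sub new {
    my ($class, %args) = @_;
    return bless {
        family_id        => $args{-family_id},
        description      => $args{-description},
        annotation_score => $args{-annotation_score},
        members          => [ @{ $args{-members} || [] } ],  # copy the list
    }, $class;
}
sub family_id        { $_[0]{family_id} }
sub description      { $_[0]{description} }
sub annotation_score { $_[0]{annotation_score} }
sub members          { @{ $_[0]{members} } }
sub size             { scalar @{ $_[0]{members} } }

# Assumption: species names are compared case-insensitively.
sub get_members_by_species {
    my ($self, $species) = @_;
    return grep {
        defined $_->species && lc($_->species) eq lc($species)
    } $self->members;
}

package main;

my $family = My::Family->new(
    -family_id        => "TribeFamily",
    -description      => "POLYUBIQUITIN",
    -annotation_score => 60,
    -members          => [
        My::Member->new(-id => "Q8W551", -species => "homo sapiens",
                        -desc => "POLYUBIQUITIN"),
        My::Member->new(-id => "Q92332", -species => "Mus musculus",
                        -desc => "POLYUBIQUITIN"),
        My::Member->new(-id => "Q94EA5", -species => "Fugu rubripes",
                        -desc => "H01 26 protein"),
    ],
);

print $family->size, "\n";
my @human = $family->get_members_by_species("homo sapiens");
print $_->id, "\n" for @human;
```

Storing the members as a plain array reference keeps size and members trivial; an index keyed by species would only pay off for very large families.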


I have ported the Ensembl logic for generating family consensus
descriptions over to the TribeMCL wrapper. It would also be nice to add a
SimpleAlign get/set for storing multiple alignments. The namespace can
be changed to TribeFamily if Family alone is too encompassing.
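
The SimpleAlign get/set could look roughly like the sketch below. Again this is only an illustration under assumptions: the package name My::FamilyWithAlignment and the method name alignment are hypothetical, and the object blessed into Bio::SimpleAlign is an empty stand-in so that the snippet runs without BioPerl.

```perl
use strict;
use warnings;

package My::FamilyWithAlignment;    # hypothetical name

use Scalar::Util qw(blessed);

sub new { return bless {}, shift }

# Combined getter/setter: stores the argument if one is given,
# and always returns the currently stored alignment.
sub alignment {
    my ($self, $aln) = @_;
    if (defined $aln) {
        # Assumption: only Bio::SimpleAlign objects (or subclasses) are accepted.
        die "alignment must be a Bio::SimpleAlign object\n"
            unless blessed($aln) && $aln->isa('Bio::SimpleAlign');
        $self->{alignment} = $aln;
    }
    return $self->{alignment};
}

package main;

# Empty stand-in object; real code would build a Bio::SimpleAlign.
my $aln = bless {}, 'Bio::SimpleAlign';

my $fam = My::FamilyWithAlignment->new;
$fam->alignment($aln);              # set
my $stored = $fam->alignment;       # get
print ref($stored), "\n";
```

Checking the class in the setter means a bad argument fails at assignment time instead of later, when the alignment is actually used.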

All comments welcome.


cheers,
shawn