[Bioperl-l] Bio::Tools::OddCodes

Gatherer, D. (Derek) D.Gatherer@organon.nhe.akzonobel.nl
Fri, 14 Jul 2000 09:21:43 +0200


Thanks Heikki

I realise now I should have included the sources for the alphabets:

The Stanfel alphabet was devised by Larry Stanfel:

Stanfel LE (1996) A new approach to clustering the amino acids.  J. theor.
Biol. 183, 195-205.

The Dayhoff and Sneath alphabets I took from Stanfel's paper, Figure 1 for
Sneath, and Dayhoff given in text p.197.  The original references to these
alphabets are given in Stanfel's ref. list.

The remaining alphabets were taken from:

Karlin S, Ost F and Blaisdell BE (1989)  Patterns in DNA and amino acid
sequences and their statistical significance.  Chapter 6 of: Mathematical
Methods for DNA Sequences.  Waterman MS (ed.)  CRC Press, Boca Raton , FL.

Table 4 of that chapter.

In answer to your question about synonymous and non-synonymous, I think that
Dayhoff's alphabet is the closest to this, since according to Stanfel (ref
above, p196-7)

"A quite different approach to amino acid classification is taken by those
interested in evolutionary differences.  Dayhoff et al (1978) is a
well-known example.  Given a grouping of proteins into functional families,
one tallies the frequencies with which one amino acid apparently substitutes
for another, and computes similarities or distances as some functions of
these values.  The more frequently alpha substitutes for beta, the greater
the similarity.  Though Dayhoff et al did not apply any formal clustering
methodologies ... [snipped critical digression on Dayhoff's methods...], the
amino acids were partitioned into groups which showed generally greater
substitutability within themselves than between different groups."

Hope this helps

Best wishes
Derek
-----Original Message-----
From: Heikki Lehvaslaiho [mailto:heikki@ebi.ac.uk]
Sent: 13 July 2000 10:06
To: bioperl-l
Subject: [Bioperl-l] Bio::Tools::OddCodes



Dear All,

I just noticed that Derek Gatherer
(D.Gatherer@organon.nhe.akzonobel.nl) has submitted a nice class to
recode amino acid sequences according to some amino acid properties.
See below for the synopsis. Thanks Derek!

On a related note: I'd like to ask whether anyone on the list knows of
a commonly agreed definition for yet another way of classifying amino
acid changes. Synonymous and non-synonymous are commonly used terms in
population genetics papers to classify coding region mutations, but
they are never defined. Most probably their definition is based on
something similar to Derek's 'functional' alphabet.

Can anyone tell me if synonymous amino acid changes are _identical_ to
Derek's functional change?

Yours,
	-Heikki

=head1 NAME

Bio::Tools::OddCodes - Object holding alternative alphabet coding for 
one protein sequence

=head1 SYNOPSIS

Takes a sequence object from, e.g., an input stream and creates an
object for the purpose of rewriting that sequence in another alphabet.
These are abbreviated amino acid sequence alphabets, designed to 
simplify the statistical aspects of analysing protein sequences, 
by reducing the combinatorial explosion of the 20-letter alphabet.  
These abbreviated alphabets range in size from 2 to 8.

Creating the OddCodes object, eg:

        my $inputstream = Bio::SeqIO->new( -file   => "seqfile",
                                           -format => 'Fasta' );
        my $seqobj = $inputstream->next_seq();
        my $oddcode_obj = Bio::Tools::OddCodes->new($seqobj);

or:

        my $seqobj = Bio::PrimarySeq->new( -seq     => '[cut and paste a sequence here]',
                                           -moltype => 'protein',
                                           -id      => 'test' );
        my $oddcode_obj = Bio::Tools::OddCodes->new($seqobj);

Do the alternative coding with any one of the following methods, each
of which returns the answer as a reference to a string:

        my $output = $oddcode_obj->structural();
        my $output = $oddcode_obj->chemical();
        my $output = $oddcode_obj->functional();
        my $output = $oddcode_obj->charge();
        my $output = $oddcode_obj->hydrophobic();
        my $output = $oddcode_obj->Dayhoff();
        my $output = $oddcode_obj->Sneath();
        my $output = $oddcode_obj->Stanfel();
        

Display the sequence in its new form, eg:

        my $new_coding = $$output;
        print "\n$new_coding";
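
For readers who want to see what such a recoding amounts to, here is a
minimal sketch in plain Perl. It does not use Bio::Tools::OddCodes and
is not the module's implementation. It assumes the commonly quoted six
Dayhoff groups (AGPST, C, DENQ, FWY, HKR, ILMV); the class labels a-f
and the name recode_dayhoff are hypothetical, chosen for illustration
only, and the labels the module actually emits may differ.

```perl
use strict;
use warnings;

# Commonly quoted Dayhoff groups. The labels a-f are hypothetical and
# are NOT claimed to match the output of Bio::Tools::OddCodes.
my %group_label = (
    'AGPST' => 'a',
    'C'     => 'b',
    'DENQ'  => 'c',
    'FWY'   => 'd',
    'HKR'   => 'e',
    'ILMV'  => 'f',
);

# Expand the groups into a residue -> label lookup table.
my %label_for;
for my $members (keys %group_label) {
    $label_for{$_} = $group_label{$members} for split //, $members;
}

# Rewrite a protein sequence in the reduced alphabet.
# Residues outside the 20 standard amino acids come back as '?'.
sub recode_dayhoff {
    my ($seq) = @_;
    return join '',
        map { exists $label_for{$_} ? $label_for{$_} : '?' }
        split //, uc $seq;
}

print recode_dayhoff('MKTAYIAKQR'), "\n";    # prints "feaadfaece"
```

Lower-case labels are used so that the recoded string cannot be
mistaken for an ordinary amino acid sequence.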

-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho          heikki@ebi.ac.uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________