[Biojava-dev] bits of information

Schreiber, Mark mark.schreiber at agresearch.co.nz
Wed Jun 4 11:45:17 EDT 2003


Hi -

I agree that the method currently returns bits of entropy. I think there
is a reasonably well-accepted definition of information, developed for
molecular biology in the publications of Tom Schneider
(http://www.lecb.ncifcrf.gov/~toms/).

While there is some confusion about the nature of information, there
seems to be a consensus in molecular biology that information represents
the decrease in uncertainty at the receiver and, as Francois says, it is
traditionally expressed as (maximum possible entropy) - (current entropy).
I think this is intuitive as well: if my uncertainty is reduced to zero, I
must have been very well informed.

For the sake of making my sequence logos work, can we make the original
method produce its original result again, and add a new method
(totalEntropy, say) that returns bits of entropy? If there are no strong
objections I would like to put this in CVS. I can back it out if people
really object.
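
To make the distinction concrete, here is a minimal sketch of the two
quantities for a single column of symbol probabilities. It is not the
BioJava code: the class EntropySketch and the methods bitsOfEntropy and
bitsOfInformation are hypothetical names, and it assumes the probabilities
are passed as a plain array that already sums to 1. It also ignores any
small-sample correction.

```java
// Hypothetical illustration only; not part of BioJava.
final class EntropySketch {
    private static final double LOG2 = Math.log(2.0);

    /** Shannon entropy in bits: H = -sum(p * log2(p)), with 0 * log(0) taken as 0. */
    static double bitsOfEntropy(double[] probs) {
        double h = 0.0;
        for (double p : probs) {
            if (p > 0.0) {
                h -= p * Math.log(p) / LOG2;
            }
        }
        return h;
    }

    /** Information in bits: maximum possible entropy, log2(alphabet size), minus H. */
    static double bitsOfInformation(double[] probs) {
        double maxEntropy = Math.log(probs.length) / LOG2;
        return maxEntropy - bitsOfEntropy(probs);
    }

    public static void main(String[] args) {
        double[] uniform = {0.25, 0.25, 0.25, 0.25};   // completely uncertain DNA column
        double[] conserved = {1.0, 0.0, 0.0, 0.0};     // perfectly conserved DNA column
        System.out.println("uniform:   H = " + bitsOfEntropy(uniform)
                + ", I = " + bitsOfInformation(uniform));
        System.out.println("conserved: H = " + bitsOfEntropy(conserved)
                + ", I = " + bitsOfInformation(conserved));
    }
}
```

For a four-letter alphabet the uniform column has 2 bits of entropy and
no information, while the perfectly conserved column has no entropy and
2 bits of information. The second is what a sequence logo needs for its
column heights.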

- Mark


> -----Original Message-----
> From: Lachlan Coin [mailto:lc1 at sanger.ac.uk] 
> Sent: Wednesday, 4 June 2003 3:14 a.m.
> To: Francois Pepin
> Cc: biojava-dev at biojava.org; Schreiber, Mark
> Subject: RE: [Biojava-dev] bits of information
> 
> 
> Yeah sure - good idea to have both methods.  It seems like 
> calling either method 'information' would be confusing, so we 
> should rename the current method entropy, and add a new 
> method but not call it information either.
> 
> Lachlan
> 
> 
> On Tue, 3 Jun 2003, Francois Pepin wrote:
> 
> > I think that the name is misleading.
> >
> > It's obviously a measure of information, but it gives back the 
> > entropy.
> >
> > Just saying that something returns the information content is not 
> > quite correct in this case as it returns the entropy.
> >
> > The documentation should definitely be cleared up to make that
> > clear.
> >
> > I think that adding the method in question would be a good idea as 
> > well.
> >
> > Francois
> >
> > -----Original Message-----
> > From: biojava-dev-bounces at biojava.org 
> > [mailto:biojava-dev-bounces at biojava.org] On Behalf Of Lachlan Coin
> > Sent: 3 juin, 2003 10:40
> > To: Francois Pepin
> > Cc: biojava-dev at biojava.org; 'Schreiber, Mark'
> > Subject: RE: [Biojava-dev] bits of information
> >
> >
> > The definitions are formal, and we all agree with the definition of 
> > entropy.
> >
> > Shannon's first coding theorem tells us that the entropy of an
> > information source is equal to the minimum average number of bits per
> > symbol that must (and can, in the limit) be used to encode source
> > outputs.  So, if I try to communicate to you (using a binary uniquely
> > decipherable code) the outcome of sampling from a source X which has
> > entropy H(X), then I must use at least H(X) bits per symbol (if I am
> > not to lose any information), and in the limit of transmitting
> > N -> infinity symbols, I can achieve an average of H(X) bits per
> > symbol.
> >
> > Thus, H(X) - the entropy  - is a natural measure of the information 
> > content of a distribution.  This is what the method is returning at 
> > the moment.
> >
> > Lachlan
> >
> >
> 
> -------------------------------------------------------------
> Lachlan Coin
> Wellcome Trust Sanger Institute		Magdalene College
> Cambridge  CB10 1SA			Cambridge CB30AG
> Ph: +44 1223 494 820
> Fax: +44 1223 494 919
> ------------------------------------------------------------
> 
> 