[Biojava-dev] bits of information

Francois Pepin fpepin at cs.mcgill.ca
Tue Jun 3 10:31:44 EDT 2003


I disagree on that one. The definitions are pretty formal and not based
on intuition.

What you are calling information there is actually the definition of
entropy. Information is indeed the maximum possible entropy minus the
current entropy, i.e. log2(alphabet size) - entropy, as Mark suggested.

A distribution with 100% A has zero entropy and maximal information (you
always know what you're going to hit). A distribution with 25% for each
of A, C, G and T has maximal entropy and zero information, as you don't
know anything that would help you decide what the next symbol will be.
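
To make the two extremes concrete, here is a minimal sketch in plain
Java. It is not BioJava code: the class and method names are made up for
illustration, and it assumes the weights are given as a plain array that
sums to 1.

```java
public class EntropyExtremes {

    // Shannon entropy in bits: H = -sum(p * log2(p)), taking 0 * log2(0) as 0.
    static double entropyBits(double[] weights) {
        double h = 0.0;
        for (double p : weights) {
            if (p > 0.0) {
                h -= p * (Math.log(p) / Math.log(2.0));
            }
        }
        return h;
    }

    // Information in bits: maximum possible entropy minus the current entropy.
    static double informationBits(double[] weights) {
        double maxEntropy = Math.log(weights.length) / Math.log(2.0);
        return maxEntropy - entropyBits(weights);
    }

    public static void main(String[] args) {
        double[] allA = {1.0, 0.0, 0.0, 0.0};        // 100% A
        double[] uniform = {0.25, 0.25, 0.25, 0.25}; // 25% each of A, C, G, T

        System.out.println("100% A:  entropy=" + entropyBits(allA)
                + " information=" + informationBits(allA));
        System.out.println("uniform: entropy=" + entropyBits(uniform)
                + " information=" + informationBits(uniform));
    }
}
```

For a four-letter alphabet the maximum is log2(4) = 2 bits, so the 100% A
case gives 0 bits of entropy and 2 bits of information, and the uniform
case gives the reverse.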

The method below does seem to be returning bits of entropy rather than
information (although I haven't had the time to go through the code to
be sure).
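
As a quick sanity check that does not require reading the BioJava
source, the sketch below computes the Shannon entropy of Mark's
0.97/0.01/0.01/0.01 distribution in plain Java. The class and method
names are hypothetical, and this is not the DistributionTools
implementation, only a hand calculation to compare against.

```java
public class MarkExample {

    // Shannon entropy in bits: H = -sum(p * log2(p)), taking 0 * log2(0) as 0.
    static double entropyBits(double[] weights) {
        double h = 0.0;
        for (double p : weights) {
            if (p > 0.0) {
                h -= p * (Math.log(p) / Math.log(2.0));
            }
        }
        return h;
    }

    public static void main(String[] args) {
        // Weights from Mark's message: a = 0.97, c = g = t = 0.01.
        double[] weights = {0.97, 0.01, 0.01, 0.01};

        double entropy = entropyBits(weights);
        // log2(alphabet size) - entropy, the formula Mark proposed.
        double information = Math.log(weights.length) / Math.log(2.0) - entropy;

        System.out.println("entropy     = " + entropy + " bits");
        System.out.println("information = " + information + " bits");
    }
}
```

The entropy comes out at roughly 0.2419 bits, which agrees with the value
Mark reported, so that figure is consistent with entropy. The
corresponding information would be roughly 2 - 0.2419 = 1.758 bits.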

Francois

-----Original Message-----
From: biojava-dev-bounces at biojava.org
[mailto:biojava-dev-bounces at biojava.org] On Behalf Of Lachlan Coin
Sent: 3 juin, 2003 07:44
To: Schreiber, Mark
Cc: biojava-dev at biojava.org
Subject: Re: [Biojava-dev] bits of information


Hi,

I guess it all depends on your intuition about what information actually
means, but sticking to standard definitions, the low value for bits of
information reflects the fact that there is not much uncertainty in this
distribution. If the distribution were 100% A, then there would be no
uncertainty, and bits of information should return 0. On the other
hand, information (or uncertainty) is maximised with 25% A,C,G,T.

Lachlan


On Sun, 1 Jun 2003, Schreiber, Mark wrote:

> Hi -
>
> The bitsOfInformation() method from DistributionTools seems to be 
> returning only the average weighted entropy not the actual 
> information.
>
> Eg for a distribution made thus:
>
>       //set the weight of a to 0.97
>       dist.setWeight(DNATools.a(), 0.97);
>       //set the others to 0.01
>       dist.setWeight(DNATools.c(), 0.01);
>       dist.setWeight(DNATools.g(), 0.01);
>       dist.setWeight(DNATools.t(), 0.01);
>
> The bits of information is calculated to be: 0.24194073285321088 bits
>
> This strikes me as a bit low (excuse the pun). Possibly there should 
> be a method called totalEntropy and bits of information should return 
> log2(alpha size) - totalEntropy.
>
> - Mark
>
> _______________________________________________
> biojava-dev mailing list
> biojava-dev at biojava.org 
> http://biojava.org/mailman/listinfo/biojava-dev
>

-------------------------------------------------------------
Lachlan Coin
Wellcome Trust Sanger Institute		Magdalene College
Cambridge  CB10 1SA			Cambridge CB30AG
Ph: +44 1223 494 820
Fax: +44 1223 494 919
------------------------------------------------------------
