[Biojava-l] Distributions over infinite Alphabets

Schreiber, Mark mark.schreiber at agresearch.co.nz
Thu Apr 17 23:19:15 EDT 2003


Hi -
 
DoubleAlphabet doesn't have a method to make these kinds of ambiguities. It would be useful to be able to make a Distribution and specify, for example, that the ambiguity Symbol covering [0.0 to 1.0] has probability 0.3. It would also be useful to be able to train such a Distribution.
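
To make the interval idea concrete, here is a rough sketch in plain Java. It is not BioJava API: IntervalDistributionSketch and its methods setWeight, addCount and getWeight are names made up for illustration, and a half-open interval [lo, hi) stands in for the ambiguity Symbol. Weights are held per interval rather than per value, so training just counts observations that fall inside each interval.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: weights are stored per half-open interval [lo, hi), not per value. */
final class IntervalDistributionSketch {
    // Maps each interval's lower bound to {upper bound, raw weight or count}.
    private final TreeMap<Double, double[]> bins = new TreeMap<>();

    /** Declares the interval [lo, hi) with a raw weight (use 0.0 before training). */
    void setWeight(double lo, double hi, double weight) {
        bins.put(lo, new double[] {hi, weight});
    }

    /** Training step: adds one count to the interval containing x; values outside every interval are ignored. */
    void addCount(double x) {
        Map.Entry<Double, double[]> e = bins.floorEntry(x);
        if (e != null && x < e.getValue()[0]) {
            e.getValue()[1] += 1.0;
        }
    }

    /** Returns the weight of the interval starting at lo, normalized over all declared intervals. */
    double getWeight(double lo) {
        double total = 0.0;
        for (double[] bin : bins.values()) {
            total += bin[1];
        }
        double[] bin = bins.get(lo);
        return (bin == null || total == 0.0) ? 0.0 : bin[1] / total;
    }
}
```

The sketch assumes the declared intervals do not overlap; a real implementation would need to check that and decide what to do with values that fall outside every interval.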
 
Equally useful would be a way to construct, for example, a Gaussian distribution. This seems simpler, as you could just implement Distribution and provide a factory or constructor that takes the mean and standard deviation and returns a suitably drawn value when asked to sample a Symbol. Getting the weight of a Symbol sensibly would still need the ambiguity Symbol.
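
A minimal sketch of the Gaussian case, again in plain Java rather than against the BioJava Distribution interface. GaussianSketch, sample, density and weight are hypothetical names, the seed parameter is only there to make runs repeatable, and the interval [lo, hi] stands in for the ambiguity Symbol. The weight of an interval is taken to be the integral of the density over it, approximated here with composite Simpson's rule.

```java
import java.util.Random;

/** Hypothetical sketch of a Gaussian over the doubles; not part of BioJava. */
final class GaussianSketch {
    private final double mean;
    private final double stdDev;
    private final Random random;

    GaussianSketch(double mean, double stdDev, long seed) {
        if (stdDev <= 0.0) {
            throw new IllegalArgumentException("stdDev must be positive");
        }
        this.mean = mean;
        this.stdDev = stdDev;
        this.random = new Random(seed);
    }

    /** Draws one value; stands in for sampling a Symbol. */
    double sample() {
        return mean + stdDev * random.nextGaussian();
    }

    /** Probability density at x. This is not a probability: any single value has probability zero. */
    double density(double x) {
        double z = (x - mean) / stdDev;
        return Math.exp(-0.5 * z * z) / (stdDev * Math.sqrt(2.0 * Math.PI));
    }

    /** Approximate probability of a value in [lo, hi], by composite Simpson's rule. */
    double weight(double lo, double hi) {
        int n = 1000; // number of subintervals; must be even for Simpson's rule
        double h = (hi - lo) / n;
        double sum = density(lo) + density(hi);
        for (int i = 1; i < n; i++) {
            sum += density(lo + i * h) * (i % 2 == 0 ? 2.0 : 4.0);
        }
        return sum * h / 3.0;
    }
}
```

For a standard normal, weight(-1.0, 1.0) comes out at roughly 0.68, while weight(x, x) is zero for any single x, which is the point-probability problem raised in the original question.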
 
I'm not too clear on how you would make such a Symbol as it technically contains an infinite number of AtomicSymbols.
 
- Mark
 

	-----Original Message----- 
	From: Matthew Pocock [mailto:matthew_pocock at yahoo.co.uk] 
	Sent: Thu 17/04/2003 8:24 p.m. 
	To: Schreiber, Mark; biojava-l at biojava.org 
	Cc: 
	Subject: Re: [Biojava-l] Distributions over infinite Alphabets
	
	

	The Distribution interface is a bit of a misnomer - I
	guess what we wanted was the integral over a PDF, but
	because we nearly always used discrete alphabets,
	nobody cared.
	
	So - the short answer is that you should be getting the
	probability of an ambiguity symbol over [0.0 .. 1.0],
	and the distribution impl should be integrating the
	PDF over that range, e.g. a Gaussian or something.
	
	Does DoubleAlphabet have methods to make these kinds
	of ambiguities? If not we need to add it.
	
	Matthew
	
	 --- "Schreiber, Mark"
	<mark.schreiber at agresearch.co.nz> wrote:
	> Hi -
	> 
	> Currently you can make a Distribution over (for
	> example) the Double alphabet and you can train it or
	> assign a weight to a value (eg the probability of
	> the 2.0 Symbol could be set to 0.5).
	> 
	> Can anyone think of a way to represent a probability
	> density as a Distribution? For example you may want
	> to set the probability of seeing a value between 0
	> and 1.0 to be 0.8. This is a bit tricky as there are
	> an infinite number of values between 0 and 1.0 and
	> the probability of getting exactly 0.237865765 would
	> be infinitely small.
	> 
	> Would this best be represented using something other
	> than a Distribution?
	> 
	> - Mark
	> 
	>
	>
	=======================================================================
	> Attention: The information contained in this message
	> and/or attachments
	> from AgResearch Limited is intended only for the
	> persons or entities
	> to which it is addressed and may contain
	> confidential and/or privileged
	> material. Any review, retransmission, dissemination
	> or other use of, or
	> taking of any action in reliance upon, this
	> information by persons or
	> entities other than the intended recipients is
	> prohibited by AgResearch
	> Limited. If you have received this message in error,
	> please notify the
	> sender immediately.
	>
	=======================================================================
	>
	> _______________________________________________
	> Biojava-l mailing list  -  Biojava-l at biojava.org
	> http://biojava.org/mailman/listinfo/biojava-l
	
	




