Suppose there is a random variable with true distribution p. Then (as we will see) we could represent that random variable with a code that has average length H(p). However, because of incomplete information we do not know p; instead we assume that the distribution of the random variable is q. Then (as we will see) the code would need more bits to represent the random variable. The difference in the number of bits is denoted D(p||q). The quantity D(p||q) comes up often enough that it has a name: it is known as the relative entropy:
The relative entropy or Kullback-Leibler distance between two probability mass functions p(x) and q(x) is defined as D(p||q) = Sum_{x in X} p(x) log(p(x)/q(x))
Note that this quantity is not symmetric, and that q (the second argument) appears only in the denominator.
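The definition above can be sketched directly in plain Python (the function name and the choice of log base 2, i.e. measuring the difference in bits, are illustrative assumptions, not part of this module's API):

```python
import math

def relative_entropy(p, q):
    """D(p||q) for two pmfs given as equal-length lists of probabilities.

    Terms with p(x) == 0 contribute zero; a q(x) == 0 where p(x) > 0
    makes the divergence infinite.
    """
    total = 0.0
    for px, qx in zip(p, q):
        if px > 0:
            if qx == 0:
                return float("inf")
            total += px * math.log(px / qx, 2)  # log base 2 -> bits
    return total

p = [0.9, 0.1]
q = [0.5, 0.5]
print(relative_entropy(p, p))  # 0.0: D(p||p) is always zero
print(relative_entropy(p, q))  # differs from relative_entropy(q, p): not symmetric
```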
Classes  
NoConvergenceError  
NotMarkovError 
Functions  
Variables  
__package__ =

Imports: Num, mcmc, mcmc_helper, gpkavg, gpkmisc, math, random
Function Details 
Relative entropy or Kullback-Leibler distance between two frequency distributions p and q. Here we assume that both p and q are counts derived from multinomially distributed data; they are not normalized to one. 
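A minimal plain-Python sketch of such a computation (the function name is hypothetical; the module's own version may smooth zero counts or use its Num-based helpers, so plain normalization is assumed here purely for illustration):

```python
import math

def kl_from_counts(p_counts, q_counts):
    # Normalize raw counts to probability estimates, then compute
    # D(p||q) in bits.  Counts need not sum to one.
    np_, nq = float(sum(p_counts)), float(sum(q_counts))
    total = 0.0
    for cp, cq in zip(p_counts, q_counts):
        if cp > 0:
            if cq == 0:
                return float("inf")
            total += (cp / np_) * math.log((cp / np_) / (cq / nq), 2)
    return total

print(kl_from_counts([9, 1], [5, 5]))  # counts, not probabilities
```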
Given a matrix of P(i and j) as x[i,j], we compute P(j given i) and return it as y[i,j]. The result is a transition-probability matrix whose first index is the input state and whose second index marks the result state. 
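The conversion is just Bayes' rule applied row by row: P(j|i) = P(i and j) / P(i), where P(i) = Sum_j P(i and j). A sketch with nested lists standing in for the module's Num arrays (the function name is illustrative):

```python
def joint_to_conditional(x):
    # x[i][j] holds P(i and j); return y with y[i][j] = P(j | i)
    # = P(i and j) / P(i), where P(i) = sum over j of x[i][j].
    y = []
    for row in x:
        p_i = float(sum(row))
        # A zero row means state i never occurs; leave it unchanged.
        y.append([v / p_i for v in row] if p_i > 0 else list(row))
    return y

y = joint_to_conditional([[0.1, 0.3], [0.2, 0.4]])
# each row of y now sums to one
```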
KL distance, given matrices of nonzero transition probabilities. Each matrix indexes states as pp[from,to] and contains P(to given from) as a conditional probability, so that for any from, the Sum over to of pp[from,to] = 1. 
List of Kullback-Leibler distances between all combinations of pairs of matrices of bigram counts. It returns a list of matrices of all the distances. Each item on the list is a sample of the distance histogram. 
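Setting aside the MCMC sampling, the pairwise part could look like the following sketch (names are hypothetical; strictly positive counts are assumed so no zero-count smoothing is needed):

```python
import math

def kl_counts(p, q):
    # D(p||q) from two matrices of bigram counts (nested lists),
    # normalizing each over its grand total.  Assumes all q counts
    # paired with nonzero p counts are themselves nonzero.
    fp = [c for row in p for c in row]
    fq = [c for row in q for c in row]
    np_, nq = float(sum(fp)), float(sum(fq))
    return sum((cp / np_) * math.log((cp / np_) / (cq / nq), 2)
               for cp, cq in zip(fp, fq) if cp > 0)

def all_pair_distances(count_matrices):
    # One distance per ordered pair of distinct count matrices.
    return [(i, j, kl_counts(count_matrices[i], count_matrices[j]))
            for i in range(len(count_matrices))
            for j in range(len(count_matrices)) if i != j]
```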
Generated by Epydoc 3.0.1 on Thu Sep 22 04:25:02 2011  http://epydoc.sourceforge.net 