Module kl_dist

Suppose a random variable has true distribution p. It can then be represented with a code of average length H(p). If, because of incomplete information, we do not know p and instead assume that the distribution is q, the code needs more bits on average to represent the random variable. The difference in the number of bits is denoted D(p||q). This quantity comes up often enough that it has a name: the relative entropy:

       The relative entropy or Kullback-Leibler distance between two
       probability mass functions p(x) and q(x) is defined as
       D(p||q) = Sum{x in X} p(x) log(p(x)/q(x))

Note that D(p||q) is not symmetric in its arguments: q (the second argument) appears only in the denominator inside the logarithm.
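
The definition above can be sketched directly in Python. This is a minimal illustration and not the module's implementation: the name kl_divergence is hypothetical, the base-2 logarithm is an assumption chosen to match the "number of bits" reading in the text, and the handling of zero probabilities follows the usual convention rather than anything stated here.

```python
import math

def kl_divergence(p, q):
    """D(p||q) in bits for two normalized distributions given as sequences.

    Hypothetical helper for illustration; log base 2 is an assumption.
    """
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0.0:
            continue  # a term with p(x) = 0 contributes nothing, by convention
        if qx == 0.0:
            return math.inf  # p puts mass where q has none
        total += px * math.log2(px / qx)
    return total

p = [0.5, 0.5]
q = [0.25, 0.75]
d_pq = kl_divergence(p, q)  # D(p||q)
d_qp = kl_divergence(q, p)  # D(q||p), generally a different number
```

Comparing d_pq with d_qp for these two distributions shows the asymmetry noted above.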

Classes
  NoConvergenceError
  NotMarkovError
Functions
 
P(x)
 
multinomial_logp(x, cF)
 
multinomial_fixer(x, c)
 
kl_nonzero_probs(p, q)
Kullback-Leibler distance between two normalized, nonzero probability distributions.
 
kl_nonzero_prob_m(p, q)
 
kldist_vec(p, q, N=None, Fp=1.0, Fq=1.0, Clip=0.01)
Relative entropy or Kullback-Leibler distance between two frequency distributions p and q.
 
tr_from_obs(x)
Given a matrix of P(i and j) as x[i,j], we compute P(j given i) and return it as y[i,j].
 
estimate_tr_probs(counts, N, F=1.0)
 
solve_for_pi(p)
Given a transition probability matrix p, where the first index is the initial state and the second index is the resultant state, compute the steady-state probability distribution, assuming a Markov process.
 
kl_nonzero_tr_probs(pp, qq)
KL distance, given a matrix of nonzero transition probabilities.
 
kl_nonzero_tr_prob_m(pp, qq)
 
cross(a, b)
 
kldist_Markov(p, q, N=None)
Kullback-Leibler distance between two matrices of bigram counts.
 
kldist_Markov_m(p, q, N=None)
Kullback-Leibler distance between two matrices of bigram counts.
 
kldist_Markov_mm(*p)
List of Kullback-Leibler distances between all combinations of pairs of matrices of bigram counts.
Variables
  __package__ = 'gmisclib'

Imports: Num, mcmc, mcmc_helper, gpkavg, gpkmisc, math, random


Function Details

kldist_vec(p, q, N=None, Fp=1.0, Fq=1.0, Clip=0.01)

Relative entropy or Kullback-Leibler distance between two frequency distributions p and q. Here, we assume that both p and q are counts derived from multinomially distributed data; they are not normalized to one.
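
To illustrate what it means to work from unnormalized counts, here is a simple plug-in estimate with additive smoothing. It is only a sketch under stated assumptions and is not how kldist_vec itself computes its result: the function name kl_from_counts and the pseudo parameter are hypothetical, the natural logarithm is an assumption, and the roles of N, Fp, Fq and Clip are not documented here, so they are not modeled.

```python
import math

def kl_from_counts(p_counts, q_counts, pseudo=1.0):
    """Plug-in KL estimate (in nats) from two vectors of raw counts.

    Hypothetical illustration: `pseudo` is an assumed additive-smoothing
    constant that keeps every estimated probability nonzero.
    """
    if len(p_counts) != len(q_counts):
        raise ValueError("count vectors must have the same length")
    k = len(p_counts)
    p_total = sum(p_counts) + pseudo * k
    q_total = sum(q_counts) + pseudo * k
    total = 0.0
    for cp, cq in zip(p_counts, q_counts):
        p_est = (cp + pseudo) / p_total  # smoothed estimate of p(x)
        q_est = (cq + pseudo) / q_total  # smoothed estimate of q(x)
        total += p_est * math.log(p_est / q_est)
    return total

same = kl_from_counts([10, 20, 30], [10, 20, 30])
different = kl_from_counts([30, 0], [0, 30])
```

Identical count vectors give an estimate of zero, and the smoothing keeps the estimate finite even when a category has a zero count in q.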

tr_from_obs(x)

Given a matrix of P(i and j) as x[i,j], we compute P(j given i) and return it as y[i,j]. The result is a transition probability matrix where the first index is the input state, and the second index marks the result state.
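
The conversion from joint to conditional probabilities amounts to dividing each row by its sum, since P(j given i) = P(i and j) / P(i). A minimal sketch follows, using nested lists; the name conditional_from_joint is hypothetical and this is not the module's own code.

```python
def conditional_from_joint(x):
    """Return y with y[i][j] = P(j given i), given x[i][j] = P(i and j).

    Hypothetical helper for illustration; rows index the initial state.
    """
    y = []
    for i, row in enumerate(x):
        row_sum = sum(row)  # marginal probability P(i) of the initial state
        if row_sum <= 0.0:
            raise ValueError("state %d never occurs as an initial state" % i)
        y.append([v / row_sum for v in row])
    return y

joint = [[0.1, 0.3],
         [0.2, 0.4]]
tr = conditional_from_joint(joint)
```

Each row of tr sums to one, as required of a transition probability matrix.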

kl_nonzero_tr_probs(pp, qq)

KL distance, given a matrix of nonzero transition probabilities. Each matrix indexes states as pp[from,to] and contains the conditional probability P(to given from), so that for any from, the sum over to of pp[from,to] is 1.
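
One standard way to define a KL distance between two Markov chains is the relative entropy rate, which weights the per-row KL distances by the steady-state distribution of the first chain. The sketch below assumes that definition; whether kl_nonzero_tr_probs uses exactly this weighting is an assumption, and the names stationary and kl_rate, the power-iteration method, and the natural logarithm are all hypothetical choices made for illustration.

```python
import math

def stationary(pp, iters=1000, tol=1e-12):
    """Steady-state distribution of transition matrix pp[from][to].

    Uses power iteration, which converges when all entries are nonzero.
    """
    n = len(pp)
    pi = [1.0 / n] * n
    for _ in range(iters):
        new = [sum(pi[i] * pp[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    raise RuntimeError("power iteration did not converge")

def kl_rate(pp, qq):
    """Sum over from, to of pi[from] * pp[from][to] * log(pp/qq), in nats."""
    pi = stationary(pp)
    n = len(pp)
    return sum(pi[i] * pp[i][j] * math.log(pp[i][j] / qq[i][j])
               for i in range(n) for j in range(n))

pp = [[0.9, 0.1],
      [0.5, 0.5]]
qq = [[0.5, 0.5],
      [0.5, 0.5]]
rate = kl_rate(pp, qq)
```

For this pp the steady-state distribution is (5/6, 1/6), so only the first row contributes to the rate, because the second rows of pp and qq are identical.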

kldist_Markov_mm(*p)

List of Kullback-Leibler distances between all combinations of pairs of matrices of bigram counts. It returns a list of matrices of all the distances. Each item on the list is a sample of the distance histogram.