Suppose there is a random variable with true distribution p. Then (as we will see) we could represent that random variable with a code that has average length H(p). However, because of incomplete information we do not know p; instead we assume that the distribution of the random variable is q. Then (as we will see) the code would need more bits to represent the random variable. The difference in the number of bits is denoted D(p||q). The quantity D(p||q) comes up often enough that it has a name: it is known as the relative entropy:
The relative entropy or Kullback-Leibler distance between two probability mass functions p(x) and q(x) is defined as D(p||q) = Sum_{x in X} p(x) log(p(x)/q(x))
Note that this quantity is not symmetric, and that q (the second argument) appears only in the denominator.
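The definition above can be sketched directly in plain Python (the function name and the choice of log base 2, i.e. measuring the difference in bits, are illustrative assumptions, not part of this module's API):

```python
import math

def relative_entropy(p, q):
    """D(p||q) for two pmfs given as equal-length lists of probabilities.

    Terms with p(x) == 0 contribute zero; a q(x) == 0 where p(x) > 0
    makes the divergence infinite.
    """
    total = 0.0
    for px, qx in zip(p, q):
        if px > 0:
            if qx == 0:
                return float("inf")
            total += px * math.log(px / qx, 2)  # log base 2 -> bits
    return total

p = [0.9, 0.1]
q = [0.5, 0.5]
print(relative_entropy(p, p))  # 0.0: D(p||p) is always zero
print(relative_entropy(p, q))  # differs from relative_entropy(q, p): not symmetric
```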
Classes  
NoConvergenceError  
NotMarkovError 
Functions  
Variables  
__package__ =

Imports: Num, mcmc, mcmc_helper, gpkavg, gpkmisc, math, random
Function Details 
Relative entropy or Kullback-Leibler distance between two frequency distributions p and q. Here we assume that both p and q are counts derived from multinomially distributed data; they are not normalized to one. 
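A minimal plain-Python sketch of such a computation (the function name is hypothetical; the module's own version may smooth zero counts or use its Num-based helpers, so plain normalization is assumed here purely for illustration):

```python
import math

def kl_from_counts(p_counts, q_counts):
    # Normalize raw counts to probability estimates, then compute
    # D(p||q) in bits.  Counts need not sum to one.
    np_, nq = float(sum(p_counts)), float(sum(q_counts))
    total = 0.0
    for cp, cq in zip(p_counts, q_counts):
        if cp > 0:
            if cq == 0:
                return float("inf")
            total += (cp / np_) * math.log((cp / np_) / (cq / nq), 2)
    return total

print(kl_from_counts([9, 1], [5, 5]))  # counts, not probabilities
```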
Given a matrix of P(i and j) as x[i,j], we compute P(j given i) and return it as y[i,j]. The result is a transition-probability matrix whose first index is the input state and whose second index marks the result state. 
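The conversion is just Bayes' rule applied row by row: P(j|i) = P(i and j) / P(i), where P(i) = Sum_j P(i and j). A sketch with nested lists standing in for the module's Num arrays (the function name is illustrative):

```python
def joint_to_conditional(x):
    # x[i][j] holds P(i and j); return y with y[i][j] = P(j | i)
    # = P(i and j) / P(i), where P(i) = sum over j of x[i][j].
    y = []
    for row in x:
        p_i = float(sum(row))
        # A zero row means state i never occurs; leave it unchanged.
        y.append([v / p_i for v in row] if p_i > 0 else list(row))
    return y

y = joint_to_conditional([[0.1, 0.3], [0.2, 0.4]])
# each row of y now sums to one
```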
KL distance, given matrices of nonzero transition probabilities. Each matrix indexes states as pp[from,to] and contains P(to given from) as a conditional probability, so that for any from, the Sum over to of pp[from,to] = 1. 
List of Kullback-Leibler distances between all combinations of pairs of matrices of bigram counts. It returns a list of matrices of all the distances. Each item on the list is a sample of the distance histogram. 
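Setting aside the MCMC sampling, the pairwise part could look like the following sketch (names are hypothetical; strictly positive counts are assumed so no zero-count smoothing is needed):

```python
import math

def kl_counts(p, q):
    # D(p||q) from two matrices of bigram counts (nested lists),
    # normalizing each over its grand total.  Assumes all q counts
    # paired with nonzero p counts are themselves nonzero.
    fp = [c for row in p for c in row]
    fq = [c for row in q for c in row]
    np_, nq = float(sum(fp)), float(sum(fq))
    return sum((cp / np_) * math.log((cp / np_) / (cq / nq), 2)
               for cp, cq in zip(fp, fq) if cp > 0)

def all_pair_distances(count_matrices):
    # One distance per ordered pair of distinct count matrices.
    return [(i, j, kl_counts(count_matrices[i], count_matrices[j]))
            for i in range(len(count_matrices))
            for j in range(len(count_matrices)) if i != j]
```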
Generated by Epydoc 3.0.1 on Thu Sep 22 04:25:02 2011  http://epydoc.sourceforge.net 