Suppose there is a random variable with true distribution p. Then (as we will see) we could represent that random variable with a code that has average length H(p). However, because of incomplete information we do not know p; instead we assume that the distribution of the random variable is q. Then (as we will see) the code would need more bits to represent the random variable. The difference in the number of bits is denoted D(p||q). The quantity D(p||q) comes up often enough that it has a name: it is known as the relative entropy:
The relative entropy or Kullback-Leibler distance between two probability mass functions p(x) and q(x) is defined as D(p||q) = Sum_{x in X} p(x) log(p(x)/q(x))
Note that this quantity is not symmetric, and that q (the second argument) appears only in the denominator.
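The definition above can be sketched directly in plain Python (the function name and the choice of log base 2, i.e. measuring the difference in bits, are illustrative assumptions, not part of this module's API):

```python
import math

def relative_entropy(p, q):
    """D(p||q) for two pmfs given as equal-length lists of probabilities.

    Terms with p(x) == 0 contribute zero; a q(x) == 0 where p(x) > 0
    makes the divergence infinite.
    """
    total = 0.0
    for px, qx in zip(p, q):
        if px > 0:
            if qx == 0:
                return float("inf")
            total += px * math.log(px / qx, 2)  # log base 2 -> bits
    return total

p = [0.9, 0.1]
q = [0.5, 0.5]
print(relative_entropy(p, p))  # 0.0: D(p||p) is always zero
print(relative_entropy(p, q))  # differs from relative_entropy(q, p): not symmetric
```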
Classes  
NoConvergenceError  
NotMarkovError 
Functions  
Variables  
__package__ =

Imports: Num, mcmc, mcmc_helper, gpkavg, gpkmisc, math, random
Function Details 
Relative entropy or Kullback-Leibler distance between two frequency distributions p and q. Here we assume that both p and q are counts derived from multinomially distributed data; they are not normalized to one. 
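A minimal plain-Python sketch of such a computation (the function name is hypothetical; the module's own version may smooth zero counts or use its Num-based helpers, so plain normalization is assumed here purely for illustration):

```python
import math

def kl_from_counts(p_counts, q_counts):
    # Normalize raw counts to probability estimates, then compute
    # D(p||q) in bits.  Counts need not sum to one.
    np_, nq = float(sum(p_counts)), float(sum(q_counts))
    total = 0.0
    for cp, cq in zip(p_counts, q_counts):
        if cp > 0:
            if cq == 0:
                return float("inf")
            total += (cp / np_) * math.log((cp / np_) / (cq / nq), 2)
    return total

print(kl_from_counts([9, 1], [5, 5]))  # counts, not probabilities
```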
Given a matrix of P(i and j) as x[i,j], we compute P(j given i) and return it as y[i,j]. The result is a transition-probability matrix whose first index is the input state and whose second index marks the result state. 
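The conversion is just Bayes' rule applied row by row: P(j|i) = P(i and j) / P(i), where P(i) = Sum_j P(i and j). A sketch with nested lists standing in for the module's Num arrays (the function name is illustrative):

```python
def joint_to_conditional(x):
    # x[i][j] holds P(i and j); return y with y[i][j] = P(j | i)
    # = P(i and j) / P(i), where P(i) = sum over j of x[i][j].
    y = []
    for row in x:
        p_i = float(sum(row))
        # A zero row means state i never occurs; leave it unchanged.
        y.append([v / p_i for v in row] if p_i > 0 else list(row))
    return y

y = joint_to_conditional([[0.1, 0.3], [0.2, 0.4]])
# each row of y now sums to one
```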
KL distance, given matrices of nonzero transition probabilities. Each matrix indexes states as pp[from,to] and contains P(to given from) as a conditional probability, so that for any from, the Sum over to of pp[from,to] = 1. 
List of Kullback-Leibler distances between all combinations of pairs of matrices of bigram counts. It returns a list of matrices of all the distances. Each item on the list is a sample of the distance histogram. 
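Setting aside the MCMC sampling, the pairwise part could look like the following sketch (names are hypothetical; strictly positive counts are assumed so no zero-count smoothing is needed):

```python
import math

def kl_counts(p, q):
    # D(p||q) from two matrices of bigram counts (nested lists),
    # normalizing each over its grand total.  Assumes all q counts
    # paired with nonzero p counts are themselves nonzero.
    fp = [c for row in p for c in row]
    fq = [c for row in q for c in row]
    np_, nq = float(sum(fp)), float(sum(fq))
    return sum((cp / np_) * math.log((cp / np_) / (cq / nq), 2)
               for cp, cq in zip(fp, fq) if cp > 0)

def all_pair_distances(count_matrices):
    # One distance per ordered pair of distinct count matrices.
    return [(i, j, kl_counts(count_matrices[i], count_matrices[j]))
            for i in range(len(count_matrices))
            for j in range(len(count_matrices)) if i != j]
```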
Generated by Epydoc 3.0.1 on Thu Sep 22 04:25:02 2011  http://epydoc.sourceforge.net 