Information Theory
Motivation
Typical messages:
- a long IID message is almost surely typical
- typical messages are approximately uniformly distributed, each with probability \(\approx 2^{-nH}\)
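As an illustration (not part of the map): a minimal Python sketch of the typicality claim for a Bernoulli(0.3) source; the parameters are arbitrary.

```python
import math, random

def H(p):
    """Binary entropy in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p, n = 0.3, 10_000
random.seed(0)
x = [1 if random.random() < p else 0 for _ in range(n)]

# Normalised log-probability of the observed string.
logp = sum(math.log2(p) if xi else math.log2(1 - p) for xi in x)
print(f"-(1/n) log2 P(x) = {-logp / n:.4f}  vs  H = {H(p):.4f}")
# The two agree closely: a long IID string is almost surely typical,
# i.e. its probability is close to 2^{-nH}.
```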
Mutual Information
Definitions
\(I(X;Y)=H(X)-H(X|Y)\)
\(=H(X)+H(Y)-H(X,Y)\)
\(=H(Y)-H(Y|X)\)
\(=D(P_{X,Y}\|P_X P_Y)\)
- conditional version: \(I(X;Y|Z)=H(X|Z)-H(X|Y,Z)\)
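A quick numerical check of these identities on a small joint pmf (the pmf values here are illustrative):

```python
import math

def H(dist):
    """Shannon entropy in bits of {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# An illustrative joint pmf P_{X,Y} on {0,1}^2 (values chosen arbitrarily).
pxy = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}
px = {x: sum(p for (a, b), p in pxy.items() if a == x) for x in (0, 1)}
py = {y: sum(p for (a, b), p in pxy.items() if b == y) for y in (0, 1)}

I_sum = H(px) + H(py) - H(pxy)                  # H(X)+H(Y)-H(X,Y)
I_div = sum(p * math.log2(p / (px[a] * py[b]))  # D(P_XY || P_X P_Y)
            for (a, b), p in pxy.items() if p > 0)
print(I_sum, I_div)  # identical (up to float rounding)
```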
Properties
\(0\leq I(X;Y) \leq H(X)\)
- \(I(X;Y)=0\) iff \(X\) and \(Y\) are independent
Data processing inequalities
If \(X \to Y \to Z\) is a Markov chain (\(X\perp Z \mid Y\)):
- \(I(X;Z)\leq I(X;Y)\)
- \(I(X;Y|Z)\leq I(X;Y) \)
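A sketch verifying the first inequality on an assumed chain of two binary symmetric channels (the channels and their parameters are illustrative, not from the map):

```python
import math

def mi(pxy):
    """I(X;Y) in bits from a joint pmf {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in pxy.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in pxy.items() if p > 0)

def bsc(eps):
    """Binary symmetric channel as a transition pmf {(in, out): prob}."""
    return {(0, 0): 1 - eps, (0, 1): eps, (1, 0): eps, (1, 1): 1 - eps}

# Markov chain X -> Y -> Z: uniform X pushed through two BSCs.
px = {0: 0.5, 1: 0.5}
W1, W2 = bsc(0.1), bsc(0.2)
pxy = {(x, y): px[x] * W1[(x, y)] for x in (0, 1) for y in (0, 1)}
pxz = {(x, z): sum(px[x] * W1[(x, y)] * W2[(y, z)] for y in (0, 1))
       for x in (0, 1) for z in (0, 1)}
print(mi(pxz), "<=", mi(pxy))  # I(X;Z) <= I(X;Y), as the DPI demands
```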
Synergy, Redundancy
\(S(X;Y_1,Y_2)\)
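The map leaves \(S\) undefined; one common convention (an assumption here) is \(S(X;Y_1,Y_2)=I(X;Y_1,Y_2)-I(X;Y_1)-I(X;Y_2)\), positive for synergy and negative for redundancy. Under that convention, the XOR example exhibits pure synergy:

```python
import math
from collections import defaultdict

def mi(pxy):
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in pxy.items():
        px[x] += p
        py[y] += p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in pxy.items() if p > 0)

def marginal(joint, f):
    """Push a pmf through f and aggregate probabilities."""
    out = defaultdict(float)
    for k, p in joint.items():
        out[f(k)] += p
    return dict(out)

# Y1, Y2 iid fair bits, X = Y1 XOR Y2: each Yi alone says nothing about X,
# together they determine it -- pure synergy.
joint = {(y1 ^ y2, (y1, y2)): 0.25 for y1 in (0, 1) for y2 in (0, 1)}
S = (mi(joint)                                          # I(X; Y1,Y2) = 1
     - mi(marginal(joint, lambda k: (k[0], k[1][0])))   # I(X; Y1) = 0
     - mi(marginal(joint, lambda k: (k[0], k[1][1]))))  # I(X; Y2) = 0
print(S)  # 1.0 bit of synergy
```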
Types, Large Deviations
Types
Definitions
n-types
\(\mathcal{P}_n = \{ P\in\mathcal{P} : nP(a)\in\mathbb{Z},\ \forall a \in A\}\)
Probability simplex
\( \mathcal{P} = \{ P\in [0,1]^m : \sum_{a \in A} P(a) = 1\} \)
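A small sketch enumerating \(\mathcal{P}_n\) and computing the type of a string (function names are made up for illustration):

```python
from fractions import Fraction
from collections import Counter

def n_types(n, m):
    """All n-types on an alphabet of size m: pmfs with entries in (1/n)Z."""
    def counts(m_left, n_left):
        if m_left == 1:
            yield (n_left,)
            return
        for k in range(n_left + 1):
            for rest in counts(m_left - 1, n_left - k):
                yield (k,) + rest
    return [tuple(Fraction(k, n) for k in c) for c in counts(m, n)]

print(len(n_types(4, 3)))  # 15, and always <= (n+1)^m: polynomially many types

# The type (empirical pmf) of a particular string:
x = (0, 1, 1, 2)
c = Counter(x)
print({a: Fraction(c[a], len(x)) for a in range(3)})  # {0: 1/4, 1: 1/2, 2: 1/4}
```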
Results
Type class:
- under IID sampling from \(Q\), all strings of the same type \(P\) have the same probability, namely \(Q^n(x_1^n)=2^{-n(H(P)+D(P\|Q))}\)
Proposition 9.5: Probability of Type Class
Type \(P\in\mathcal{P}_n\), pmf \(Q\) on a finite alphabet \(A\) of size \(m\):
\[(n+1)^{-m}\,2^{-nD(P\|Q)} \leq Q^n(T(P)) \leq 2^{-nD(P\|Q)}\]
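A numerical check of the sandwich on a toy alphabet (the specific \(P\), \(Q\), \(n\) are illustrative):

```python
import math

def D(P, Q):
    """Relative entropy D(P||Q) in bits for probability vectors."""
    return sum(p * math.log2(p / q) for p, q in zip(P, Q) if p > 0)

n, m = 12, 3
P = (0.5, 0.25, 0.25)   # an n-type: n*P(a) is an integer for every a
Q = (0.6, 0.3, 0.1)

counts = [round(n * p) for p in P]
size = math.factorial(n)
for k in counts:
    size //= math.factorial(k)            # |T(P)|, the multinomial coefficient
prob = size * math.prod(q ** k for q, k in zip(Q, counts))  # Q^n(T(P)) exactly

lower = (n + 1) ** (-m) * 2 ** (-n * D(P, Q))
upper = 2 ** (-n * D(P, Q))
print(lower <= prob <= upper)  # True: the Proposition 9.5 sandwich
```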
Sanov's Theorem
Motivation + intro
1000 dice throws with a sample average of 5 (well above the mean 3.5, hence a rare event): what is the proportion of 6s, \(P(\#(6) \mid \bar{X}=5)\)?
\(\hat{P}_n\) is the empirical distribution (type) of an IID sequence \(X_1^n\) with pmf \(Q\) on \(A\).
For any \(E\subset \mathcal{P}\):
\[Q^n(\hat{P}_n \in E) \leq (n+1)^m 2^{-n \inf_{P\in E} D(P||Q) }\]
If \(E\) is equal to the closure of its interior:
\[ \lim -\frac{1}{n} \log Q^n (\hat{P}_n \in E) = \inf_{P\in E} D(P||Q) \]
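For the dice question above, Sanov (together with the conditional limit theorem) says the conditional type concentrates on the I-projection \(\arg\min_{P: E_P[X]=5} D(P\|Q)\), an exponentially tilted pmf \(P^*(a)\propto Q(a)e^{\theta a}\). A sketch solving for the tilt numerically, assuming this standard tilting form:

```python
import math

faces = range(1, 7)

def tilted(theta):
    """Exponential tilting of the fair-die pmf: P(a) proportional to exp(theta*a)."""
    w = [math.exp(theta * a) for a in faces]
    s = sum(w)
    return [wi / s for wi in w]

def mean(P):
    return sum(a * p for a, p in zip(faces, P))

# Bisection for the tilt theta that achieves mean 5; the tilted pmf is the
# I-projection argmin_{P: E_P[X]=5} D(P||Q) that Sanov singles out.
lo, hi = 0.0, 10.0
for _ in range(100):
    mid = (lo + hi) / 2
    if mean(tilted(mid)) < 5:
        lo = mid
    else:
        hi = mid
P_star = tilted(lo)
print(f"limiting proportion of 6s: P*(6) = {P_star[5]:.4f}")  # about 0.478
```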
Relative entropy
\(D(P\|Q)=\sum_{a\in A} P(a)\log\frac{P(a)}{Q(a)}\)
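A worked instance (illustrative): comparing a fair coin to a biased one,
\[D(\mathrm{Bern}(\tfrac12)\,\|\,\mathrm{Bern}(\tfrac14)) = \tfrac12\log_2\tfrac{1/2}{1/4} + \tfrac12\log_2\tfrac{1/2}{3/4} = 1 - \tfrac12\log_2 3 \approx 0.21\ \text{bits}.\]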
Definitions/Properties
Entropy Bounds
\(0\leq H(X) \leq \log|A|\)
- \(H(X)=0\) iff \(X\) is constant
- \(H(X)=\log|A|\) iff \(X\) is uniformly distributed on \(A\)
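A quick numeric check of both endpoints (alphabet size chosen arbitrarily):

```python
import math

def H(dist):
    """Entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

A = 4  # alphabet size
print(H([1, 0, 0, 0]))                 # 0.0: constant X
print(H([0.25] * A), math.log2(A))     # 2.0 = log|A|: uniform X
print(H([0.5, 0.3, 0.1, 0.1]))         # strictly between the bounds
```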
Joint Entropy
\(H(X,Y)=H(P_{X,Y})=E[-\log P_{X,Y}(X,Y)]\)
Independent \(X,Y\):
\(H(X,Y)=H(X)+H(Y)\)
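Worked instance (illustrative): two independent fair bits give
\[H(X,Y)=\log_2 4 = 2 = 1 + 1 = H(X)+H(Y).\]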
Poisson Approximation
Bernoulli-Poisson relation
V2:
\(S_n=\sum_{i=1}^n X_i\), a sum of \(n\) Bernoulli variables \(X_i\sim\mathrm{Bern}(p_i)\), possibly dependent
\(D_e(P_{S_n}\,\|\,\mathrm{Po}(\sum_i p_i)) \leq \sum_i p_i^2 + \sum_i H_e(X_i)-H_e(X_1^n)\) (the subscript \(e\) means nats)
- for independent \(X_i\) the entropy gap \(\sum_i H_e(X_i)-H_e(X_1^n)\) vanishes, leaving \(D_e\leq\sum_i p_i^2\)
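A sketch checking the independent case of this bound, where the entropy gap is zero (the \(p_i\) are illustrative; the exact pmf of \(S_n\) is built by convolution):

```python
import math

ps = [0.1, 0.05, 0.2, 0.15]  # illustrative Bernoulli parameters

# Exact pmf of S_n = sum of independent Bern(p_i), by iterated convolution.
pmf = [1.0]
for p in ps:
    pmf = [(pmf[k] if k < len(pmf) else 0.0) * (1 - p)
           + (pmf[k - 1] if k >= 1 else 0.0) * p
           for k in range(len(pmf) + 1)]

lam = sum(ps)
po = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(len(pmf))]

# D_e is relative entropy in nats; the sum runs over supp(P_{S_n}) = {0..n}.
De = sum(p * math.log(p / q) for p, q in zip(pmf, po) if p > 0)
print(De, "<=", sum(p * p for p in ps))  # independence kills the entropy gap
```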