Markov Chains and partial views

From GusWiki

Jump to: navigation, search

See Markov Chains and partial views/Coarsenings


Contents

[edit] Goal

Given two Markov Chains $M_1 = <A,X>$ and $M_2 = < B,Y>$ (i.e. "views"), and a "coupling parameter" (based on their -:mutual information $I(X;Y)$), construct a MC over $A \times B $ satisfying the coupling parameter.

[edit] mutual information

Forgetting the structure of the time steps (i.e. time steps are exchangeable), how much does knowing the state of $X$ tells us about the state of $Y$?

[edit] entropy of a MC

Let $X$ be a MC. Then

H(X) = \sum_{s\in states(X)} P(s) H(X^{t+1}|X^t=s) where $P(s)$ refers to steady-state probability of state $s$.


I(X; Y) \equiv H(X) + H(Y) - H(X,Y)


If we want time-steps to not be exchangeable (i.e. treat it like a time series), could we instead define mutual information in terms of higher-order MCs?

[edit] an example

Let $X$ and $Y$ be:


I've convinced myself that the only allowed value for $I(X;Y)$ is 0. This is because coupling them will compromise first-order Markovianness.


I wanted to generalize this to:

if every state is reachable from every state
and there is no weighted-graph isomorphism
then the only allowed value for $I(X;Y)$ is 0
i.e. no way to couple $X$ and $Y$ without losing Markovianness.


But this is wrong: coarsenings, for example, are a counterexample of the above.

Another counterexample is this pair:

They are coarser in different parts: X2 and X3 collapse into X7; X8 and X9 collapse into X4.


[edit] the lattice of coarsenings

The "coarsens" relation is partial order, but not just any partial order: it induces a lattice over the set of all Markov Chains (I think). This means that given any two MCs, there is a unique l.u.b. $X \vee Y$ (analogous to LCM) and a unique g.l.b. $X \wedge Y$ (analogous to GCD).

The least element of this lattice (written $0$) is the trivial 1-state Markov Chain. I think the greatest element could also be defined in a non-standard way as the infinite complete graph with uniform weights, in terms of convergence.

One way to try to compute $X \vee Y$ is to take their -:Kronecker product: this is analogous to integer multiplication, in that it works iff $X$ and $Y$ are coprime (for numbers, $GCD(a,b) * LCM(a,b) = ab$; jcreed argues that it cannot be true for MCs: the Kronecker product $A \otimes B$ has 6x6 dimensions while $(A \vee B) \otimes (A \wedge B)$ has 7x5). I don't know a general method.

Likewise, I don't know a general method for computing $X \wedge Y$.


Hypothesis: Let $X$ and $Y$ be MCs where every state is reachable from every state. Then $I(X;Y)$ can only be 0 $\iff$ the Markov Chain $glb(X,Y)$ has 1 state, i.e. in lattice terms: $X \wedge Y = 0$ (in number theory terms, $X$ and $Y$ are "relatively prime").

Since most pairs of MCs are coprime (here, "most" = negation of "measure 0"), then no degree of coupling is allowed. This means that we are not robust to noise in $X$ and $Y$...

Perhaps I should consider finding the smallest modification to $(X,Y)$ that make it possible to couple them to the given value for the mutual information.

[edit] Bibliography

[edit] to read

[1] is a logic paper(!), about Markov Chains coarsenings.

[2] discusses Kronecker products of Markov Chains, as a compact representation for very large MCs.


  1. Baier, C., Hermanns, H., Katoen, J., & Wolf, V. Bisimulation and Simulation Relations for Markov Chains. Bib
  2. Langville, A. N. ((2002). Preconditioning for Stochastic Automata Networks.). Bib
Personal tools