\(\def\PR{\mathbb P} \def\EXP{\mathbb E} \def\reals{\mathbb R}\)
Notes:
This assignment is based on the prereq material for this course.
Show that for a Markov process \(\{X_t\}_{t=1}^\infty\) and any \(t\), \(n\), and \(x_{1:t+n}\), \[\PR(X_{t+n} = x_{t+n} \mid X_{1:t} = x_{1:t}) = \PR(X_{t+n} = x_{t+n} \mid X_{t} = x_{t})\]
Consider a dynamical system in which the state evolves as follows:
\[X_{t+1} = f_t(X_t, U_t, W_t), \quad t = 1, 2, \dots\]
where \(\{W_t\}_{t=1}^{T}\) is an i.i.d. process that is independent of the initial state \(X_1\). The control inputs \(\{U_t\}_{t=1}^T\) are generated using a control strategy \(\mathbf g = \{g_t\}_{t=1}^T\) as follows:
\[U_t = g_t(X_{1:t}, U_{1:t-1}).\]
Show that for any control strategy \(\mathbf g\) and any \(t\), \(x_{1:t+1}\), and \(u_{1:t}\),
\[\PR^{\mathbf g}(X_{t+1} = x_{t+1} \mid X_{1:t} = x_{1:t}, U_{1:t} = u_{1:t}) = \PR^{\mathbf g}(X_{t+1} = x_{t+1} \mid X_{t} = x_{t}, U_{t} = u_{t}).\]
Furthermore, show that the right hand term does not depend on the choice of the control strategy \(\mathbf g\).
Let \(\mathcal X\), \(\mathcal Y\), and \(\mathcal U\) be finite sets and \(c \colon \mathcal Y \times \mathcal U \to \reals\). Let \(X \in \mathcal X\) and \(Y \in \mathcal Y\) be correlated random variables defined on a common probability space \((\Omega, \mathfrak F, P)\). Define
\[g^*(x) = \arg \min_{u \in \mathcal U} \EXP[ c(y, u) \mid X = x ].\]
Show that for any \(g \colon \mathcal X \to \mathcal U\),
\[\EXP[ c(Y, g(X)) ] \ge \EXP[ c(Y, g^*(X)) ].\]