ECSE 506: Stochastic Control and Decision Theory

Aditya Mahajan
Winter 2016



Assignment 5 (practice problems)

\(\def\PR{\mathbb P} \def\EXP{\mathbb E} \def\IND{\mathbb 1} \def\TRANS{\intercal} \def\reals{\mathbb R} \def\integers{\mathbb Z}\)

  1. Consider the following decentralized variation of sequential hypothesis testing.

    1. Suppose the strategy of sensor 2 is fixed. Show that the problem of finding the best response at sensor 1 is a POMDP. Identify the information state and the dynamic program.

    2. Use the dynamic program of the previous part to show that, for an arbitrary strategy of sensor 2, the best-response strategy of sensor 1 has a threshold property similar to the threshold property for the centralized sequential hypothesis testing problem.

  2. Consider the broadcast information structure with two users, indexed by \(i \in \{0, 1\}\). The states and actions of user \(i\) are Euclidean vectors. The dynamics are given by: \[\begin{align} X^0_{t+1} &= A^0 X^0_t + B^0 U^0_t + W^0_t \\ X^1_{t+1} &= A^{10} X^0_t + A^{11} X^1_t + B^1 U^1_t + W^1_t \end{align}\] where \(A^0\), \(A^{10}\), \(A^{11}\), \(B^0\), and \(B^1\) are matrices of appropriate dimensions. The primitive random variables \(\{X^0_1, X^1_1, W^0_{1:T}, W^1_{1:T}\}\) are independent.

    Simplify the dynamic program derived in class for the above LQG case. (Hint: This simplification is similar to that for the one-step delayed sharing information structure.)
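To build intuition for the dynamics in Problem 2, the following is a minimal simulation sketch, not a solution to the problem. It assumes scalar states, standard Gaussian primitive random variables, and hypothetical linear feedback laws `policy0` and `policy1`. It also assumes that user 0 acts on its own state while user 1 sees both states; the problem statement above does not spell out the information structure, so treat that as an illustrative choice. The function name `simulate` and all numerical coefficients are made up for illustration.

```python
import random


def simulate(T, a0, a10, a11, b0, b1, policy0, policy1, seed=0):
    """Simulate the scalar version of the two-user dynamics for T time steps.

    policy0(x0) -> u0 and policy1(x0, x1) -> u1 are hypothetical feedback
    laws chosen for illustration; they are NOT the optimal LQG controllers.
    """
    rng = random.Random(seed)
    # Independent primitive random variables (assumed standard Gaussian).
    x0 = rng.gauss(0.0, 1.0)
    x1 = rng.gauss(0.0, 1.0)
    traj0, traj1 = [x0], [x1]
    for _ in range(T - 1):
        w0 = rng.gauss(0.0, 1.0)
        w1 = rng.gauss(0.0, 1.0)
        u0 = policy0(x0)
        u1 = policy1(x0, x1)
        # Both updates use the time-t states, so compute them together.
        x0, x1 = (
            a0 * x0 + b0 * u0 + w0,
            a10 * x0 + a11 * x1 + b1 * u1 + w1,
        )
        traj0.append(x0)
        traj1.append(x1)
    return traj0, traj1


if __name__ == "__main__":
    # Hypothetical coefficients and gains, for illustration only.
    xs0, xs1 = simulate(
        T=5, a0=0.9, a10=0.5, a11=0.8, b0=1.0, b1=1.0,
        policy0=lambda x0: -0.3 * x0,
        policy1=lambda x0, x1: -0.2 * x0 - 0.4 * x1,
    )
    print(xs0)
    print(xs1)
```

Under these assumptions, the trajectory of \(X^0\) does not depend on the control law of user 1: the coupling goes one way, from user 0 to user 1 through \(A^{10}\).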