ECSE 506: Stochastic Control and Decision Theory

Aditya Mahajan
Winter 2016

About  |  Lectures  |  Coursework


Assignment 2 (due Feb 24, 2016)

\(\def\PR{\mathbb P} \def\EXP{\mathbb E} \def\reals{\mathbb R} \def\integers{\mathbb Z}\)

Note: You may write code in any programming language for the problems that involve numerical solution.

  1. There are \(n\) jobs to be processed by a single machine. Processing job \(i\) takes a random time \(S_i\). Let \(μ_i = \EXP[S_i] < ∞\). A cost \(c_i\) is incurred per unit time until job \(i\) is completed.

    For example, suppose there are two jobs, and completing job \(1\) and \(2\) took \(s_i\) and \(s_2\) time units respectively. If the processing order was \(\{1,2\}\), then the cost incurred is \(c_1 s_1 + c_2 (s_1 + s_2)\). If the processing order was \(\{2,1\}\), then the cost incurred is \(c_2 s_2 + c_1 (s_1 + s_2)\).

    In what order should the jobs be processed so as to minimize the total expected cost?

  2. A prisoner wishes to escape from a prison. There are \(n\) passages that leads out of the prison, some of them are dead-ends, others have guards, and remaining are unguarded are lead outside the prison. More precisely, passage \(i\) might be a dead-end with probability \((1-p_i-q_i)\), might have a guard with probability \(q_i\), and might be unguarded and lead outside the prison with probability \(p_i\). Thus, if the prisoner tries passage \(i\), he will find that the passage is a dead-end with probability \((1-p_i-q_i)\) and he can come back and try other passages; he will be captured by a guard with probability \(q_i\), and he will escape with probability \(p_i\). In what order should the prison test different passages if he wishes to maximize his probability of eventual escape? Does the optimal search order change if his objective is to minimize the probability of capture?

  3. Let \(Q(x,u) = \begin{bmatrix} x \\ u \end{bmatrix}^\intercal M \begin{bmatrix} x \\ u \end{bmatrix}\), where \(M = \begin{bmatrix} P & Q \\ Q^\intercal & R \end{bmatrix}\), \(M\) is symmetric (and, therefore, so are \(P\) and \(R\)) and \(M\) and \(R\) are positive-definite.

    Define \(V(x) = \min _ {u} Q(x,u)\). Show that \(V(x) = x^\intercal S x\), where \(S\) is symmetric positive semidefinite (find \(S\) explicitly).

  4. Consider a deterministic LQR system with system dynamics: \(x_1 = [0\ 0]^\intercal\) and

    \[ x _ {t+1} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} x _ t + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u _ t\]

    Let \(Q = \text{diag}(2, 1)\), \(R = 1\), and \(T = 20\). The objective is to minimize the following tracking cost:

    \[ \sum_{t=1}^{T-1} \big[ (x_t - x^0_t)^\intercal Q (x_t - x^0_t) + u_t \intercal R u_t \big] + (x_T - x^0_T)^\intercal Q (x_T - x^0_T)\]

    where \(x^0_t\) is a reference signal given by

    \[ x^0_t = \begin{cases} [1\ 0]^\intercal & t \le 7 \\ [0\ 1]^\intercal & 7 < t \le 14 \\ [1\ 0]^\intercal & 14 < t \le 20 \end{cases}\]

    1. Compute the optimal control strategy and plot \(\{x_t\}\) and \(\{u_t\}\) over time. (See notes for explicit formulas for optimal tracking problems).

    2. Now suppose the system is stochastic and the dynamics are given by

      \[ x _ {t+1} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} x _ t + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u _ t + w_t \]

      where \(w_t\) is zero-mean Guassian noise with covariance \(\Sigma = \begin{bmatrix} 0.3 & 0.1 \\ 0.1 & 0.2 \end{bmatrix}\). When the control is chosen optimally, plot \(\{X_t\}\) and \(\{U_t\}\) for two different realizations of the noise process.