Optimal Control and Planning
How can we make decisions if we know the dynamics of the environment?
Stochastic optimization
Stochastic optimization for open-loop planning:
We wish to choose \(a_1, \dots, a_T = \mathrm{argmax}_{a_1, \dots, a_T} J(a_1, \dots, a_T)\) for some objective \(J\).
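Here \(J\) scores an entire action sequence. With known deterministic dynamics it can be evaluated by simply rolling the plan forward; a minimal sketch, where the dynamics f, reward r, and initial state s0 are placeholder assumptions rather than anything defined above:

```python
def evaluate_plan(actions, s0, f, r):
    """J(a_1, ..., a_T): total reward of an open-loop plan under
    known deterministic dynamics s_{t+1} = f(s_t, a_t)."""
    s, total = s0, 0.0
    for a in actions:
        total += r(s, a)   # accumulate reward for taking a in state s
        s = f(s, a)        # step the known dynamics forward
    return total
```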
Guess and Check
An extremely simple method that is easy to parallelize (a sketch follows the list):
- pick \(A_1, \dots, A_N\) from some distribution
- choose the best sequence, \(A_i\) with \(i = \mathrm{argmax}_i J(A_i)\).
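A minimal sketch of this approach (often called random shooting), assuming a horizon T, a bounded one-dimensional continuous action space, and an objective J that scores a whole action sequence (e.g., evaluate_plan above):

```python
import numpy as np

def random_shooting(J, T, num_samples=1000, low=-1.0, high=1.0, rng=None):
    """Sample candidate action sequences at random and keep the best one."""
    rng = np.random.default_rng() if rng is None else rng
    candidates = rng.uniform(low, high, size=(num_samples, T))  # A_1, ..., A_N
    scores = np.array([J(A) for A in candidates])               # J(A_1), ..., J(A_N)
    return candidates[np.argmax(scores)]                        # best action sequence
```

Because each \(J(A_i)\) is evaluated independently, the loop over candidates is trivially parallelizable.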
Cross-entropy Method (CEM)
- pick \(A_1, \dots, A_N\) from some initial distribution \(p(A)\)
- evaluate \(J(A_1), \dots, J(A_N)\)
- pick the elites \(A_{i_1}, \dots, A_{i_M}\), the \(M\) samples with the highest values
- refit the distribution \(p(A)\) to the elites, and repeat
With continuous actions, a multivariate normal distribution is a common choice for \(p(A)\); in the discrete case, Monte Carlo Tree Search (MCTS) is typically used instead.
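A minimal CEM sketch along these lines, using a diagonal Gaussian over action sequences; the horizon, sample count, elite count, and iteration count are illustrative choices rather than prescribed values:

```python
import numpy as np

def cross_entropy_method(J, T, iters=10, num_samples=100, num_elites=10, rng=None):
    """Iteratively refit a Gaussian p(A) to the highest-scoring action sequences."""
    rng = np.random.default_rng() if rng is None else rng
    mean, std = np.zeros(T), np.ones(T)                          # initial p(A)
    for _ in range(iters):
        samples = rng.normal(mean, std, size=(num_samples, T))   # A_1, ..., A_N
        scores = np.array([J(A) for A in samples])               # J(A_1), ..., J(A_N)
        elites = samples[np.argsort(scores)[-num_elites:]]       # highest-value samples
        mean = elites.mean(axis=0)                               # refit p(A) to the elites
        std = elites.std(axis=0) + 1e-6                          # small floor avoids collapse
    return mean
```

The small constant added to the standard deviation is only there to keep the fitted distribution from collapsing to a point.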
Using Derivatives
- Differential Dynamic Programming (DDP)
- Linear Quadratic Regulator (LQR); a minimal sketch follows the list
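As a hedged illustration of the LQR case: for linear dynamics \(x_{t+1} = A x_t + B u_t\) and quadratic cost \(\sum_t (x_t^\top Q x_t + u_t^\top R u_t)\), the optimal controller is linear feedback \(u_t = -K_t x_t\), computed by a backward Riccati recursion. A minimal finite-horizon sketch (the matrices and horizon are assumed inputs):

```python
import numpy as np

def finite_horizon_lqr(A, B, Q, R, T):
    """Backward Riccati recursion: returns gains K_t so that u_t = -K_t x_t."""
    P = Q.copy()                                            # cost-to-go Hessian at the final step
    gains = []
    for _ in range(T):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # optimal gain for this step
        P = Q + A.T @ P @ (A - B @ K)                       # propagate cost-to-go backward
        gains.append(K)
    return gains[::-1]                                      # ordered from t = 0 to T - 1
```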