Jethro's Braindump

Optimal Control and Planning

How can we make decisions if we know the dynamics of the environment?

Stochastic optimization

Stochastic optimization for open-loop planning:

We wish to choose \(a_1, \dots a_T = \mathrm{argmax}_{a_1, \dots a_T} J(a_1, \dots, a_T)\) for some objective \(J\).

Guess and Check

An extremely simple method, that’s parallelizable:

  1. pick \(A_1, \dots A_N\) from some distribution
  2. choose \(A_i\) based on \(\mathrm{argmax} J(A_i)\).

Cross-entropy Method (CEM)

  1. pick \(A_1, \dots A_N\) from some initial distribution \(p(A)\)
  2. Evaluate \(J(A_1), \dots J(A_N)\)
  3. pick the elites \(A_{i1}, \dots A_{im}\) with the highest value
  4. fit distribution $P(A) to the elites

With continuous inputs, a multi-variate normal distribution is a common choice for \(p(A)\). In the discrete case, Monte Carlo Tree Search is typically used.

Using Derivatives

  • Differentiable Dynamic Programming (DDP)
  • LQR