Optimal Control and Planning

How can we make decisions if we know the dynamics of the environment?

Stochastic optimization

Stochastic optimization for open-loop planning:

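We wish to choose $a_1, \dots, a_T = \mathrm{argmax}_{a_1, \dots, a_T} J(a_1, \dots, a_T)$ for some objective $J$. Below, $A = (a_1, \dots, a_T)$ denotes a whole action sequence, so the problem is $\mathrm{argmax}_A J(A)$.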

Guess and Check

An extremely simple method that is easy to parallelize:

pick $A_1, \dots, A_N$ from some distribution
choose the $A_i$ with $i = \mathrm{argmax}_i J(A_i)$.
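
The two steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the uniform sampling distribution, the bounds `low`/`high`, the function name `random_shooting`, and the toy objective are all assumptions made for the example.

```python
import numpy as np

def random_shooting(J, horizon, action_dim, num_samples=1000,
                    low=-1.0, high=1.0, seed=0):
    """Guess and check: sample N action sequences, keep the best one under J."""
    rng = np.random.default_rng(seed)
    # Each candidate A_i is a full action sequence of shape (horizon, action_dim).
    # Assumption: sample uniformly from [low, high]; any distribution would do.
    candidates = rng.uniform(low, high, size=(num_samples, horizon, action_dim))
    # Evaluations are independent of each other, so this loop could run in parallel.
    scores = np.array([J(A) for A in candidates])
    best = int(np.argmax(scores))
    return candidates[best], scores[best]

# Hypothetical toy objective: prefer actions close to 0.5 at every time step.
def toy_objective(A):
    return -np.sum((A - 0.5) ** 2)

best_plan, best_score = random_shooting(toy_objective, horizon=5, action_dim=1)
```

Because every sample is drawn from the same fixed distribution, the quality of the result depends entirely on how many samples happen to land near a good plan.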

Cross-entropy Method (CEM)

pick $A_1, \dots, A_N$ from some initial distribution $p(A)$
evaluate $J(A_1), \dots, J(A_N)$
pick the elites $A_{i_1}, \dots, A_{i_m}$ with the highest value, where $m < N$
fit the distribution $p(A)$ to the elites, then repeat from the first step with the refitted $p(A)$

With continuous inputs, a multivariate normal distribution is a common choice for $p(A)$. In the discrete case, Monte Carlo Tree Search (MCTS) is typically used.
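
The CEM loop for continuous actions might look like the sketch below. It assumes a diagonal Gaussian for $p(A)$ (a simplification of the full multivariate normal), a fixed number of iterations, and a small constant added to the standard deviation to avoid collapse. The name `cem_plan`, the hyperparameters, and the toy objective are hypothetical.

```python
import numpy as np

def cem_plan(J, horizon, action_dim, num_samples=200, num_elites=20,
             num_iters=10, seed=0):
    """Cross-entropy method over flattened action sequences."""
    rng = np.random.default_rng(seed)
    dim = horizon * action_dim
    # Assumption: p(A) is a diagonal Gaussian, initialised to N(0, I).
    mean = np.zeros(dim)
    std = np.ones(dim)
    for _ in range(num_iters):
        # 1. Sample A_1, ..., A_N from the current p(A).
        samples = rng.normal(mean, std, size=(num_samples, dim))
        # 2. Evaluate J on each sampled action sequence.
        scores = np.array([J(A.reshape(horizon, action_dim)) for A in samples])
        # 3. Keep the m elites with the highest value.
        elites = samples[np.argsort(scores)[-num_elites:]]
        # 4. Refit p(A) to the elites (small epsilon keeps std positive).
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6
    # Return the mean of the final distribution as the plan.
    return mean.reshape(horizon, action_dim)

# Hypothetical toy objective: prefer actions close to 0.5 at every time step.
def toy_objective(A):
    return -np.sum((A - 0.5) ** 2)

plan = cem_plan(toy_objective, horizon=5, action_dim=1)
```

Unlike guess and check, each round of sampling concentrates around the region where the previous round found high values, so fewer samples are wasted on poor plans.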

Using Derivatives

Differential Dynamic Programming (DDP)
LQR
