Optimal Control and Planning

How can we make decisions if we know the dynamics of the environment?

Stochastic optimization

Stochastic optimization for open-loop planning:

We wish to choose $$a_1, \dots a_T = \mathrm{argmax}_{a_1, \dots a_T} J(a_1, \dots, a_T)$$ for some objective $$J$$.

Guess and Check

An extremely simple method, that’s parallelizable:

1. pick $$A_1, \dots A_N$$ from some distribution
2. choose $$A_i$$ based on $$\mathrm{argmax} J(A_i)$$.

Cross-entropy Method (CEM)

1. pick $$A_1, \dots A_N$$ from some initial distribution $$p(A)$$
2. Evaluate $$J(A_1), \dots J(A_N)$$
3. pick the elites $$A_{i1}, \dots A_{im}$$ with the highest value
4. fit distribution \$P(A) to the elites

With continuous inputs, a multi-variate normal distribution is a common choice for $$p(A)$$. In the discrete Case, Monte-Carlo tree search (§mcts) is typically used.

Using Derivatives

• Differentiable Dynamic Programming (DDP)
• LQR

Icon by Laymik from The Noun Project. Website built with ♥ with Org-mode, Hugo, and Netlify.