Optimal Control and Planning
How can we make decisions if we know the dynamics of the environment?
Stochastic optimization
Stochastic optimization for open-loop planning:
We wish to choose \(a_1, \dots, a_T = \mathrm{argmax}_{a_1, \dots, a_T} J(a_1, \dots, a_T)\) for some objective \(J\).
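Here \(J\) scores an entire action sequence. With known deterministic dynamics it can be evaluated by simply rolling the plan forward; a minimal sketch, where the dynamics f, reward r, and initial state s0 are placeholder assumptions rather than anything defined above:

```python
def evaluate_plan(actions, s0, f, r):
    """J(a_1, ..., a_T): total reward of an open-loop plan under
    known deterministic dynamics s_{t+1} = f(s_t, a_t)."""
    s, total = s0, 0.0
    for a in actions:
        total += r(s, a)   # accumulate reward for taking a in state s
        s = f(s, a)        # step the known dynamics forward
    return total
```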
Guess and Check
An extremely simple method that is easy to parallelize (a sketch follows the list):
- pick \(A_1, \dots, A_N\) from some distribution
- choose the best sequence, \(A_i\) with \(i = \mathrm{argmax}_i J(A_i)\).
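A minimal sketch of this approach (often called random shooting), assuming a horizon T, a bounded one-dimensional continuous action space, and an objective J that scores a whole action sequence (e.g., evaluate_plan above):

```python
import numpy as np

def random_shooting(J, T, num_samples=1000, low=-1.0, high=1.0, rng=None):
    """Sample candidate action sequences at random and keep the best one."""
    rng = np.random.default_rng() if rng is None else rng
    candidates = rng.uniform(low, high, size=(num_samples, T))  # A_1, ..., A_N
    scores = np.array([J(A) for A in candidates])               # J(A_1), ..., J(A_N)
    return candidates[np.argmax(scores)]                        # best action sequence
```

Because each \(J(A_i)\) is evaluated independently, the loop over candidates is trivially parallelizable.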
Cross-entropy Method (CEM)
- pick \(A_1, \dots, A_N\) from some initial distribution \(p(A)\)
- evaluate \(J(A_1), \dots, J(A_N)\)
- pick the elites \(A_{i_1}, \dots, A_{i_M}\), the \(M\) samples with the highest values
- refit the distribution \(p(A)\) to the elites, and repeat
With continuous actions, a multivariate normal distribution is a common choice for \(p(A)\); in the discrete case, Monte Carlo Tree Search (MCTS) is typically used instead.
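A minimal CEM sketch along these lines, using a diagonal Gaussian over action sequences; the horizon, sample count, elite count, and iteration count are illustrative choices rather than prescribed values:

```python
import numpy as np

def cross_entropy_method(J, T, iters=10, num_samples=100, num_elites=10, rng=None):
    """Iteratively refit a Gaussian p(A) to the highest-scoring action sequences."""
    rng = np.random.default_rng() if rng is None else rng
    mean, std = np.zeros(T), np.ones(T)                          # initial p(A)
    for _ in range(iters):
        samples = rng.normal(mean, std, size=(num_samples, T))   # A_1, ..., A_N
        scores = np.array([J(A) for A in samples])               # J(A_1), ..., J(A_N)
        elites = samples[np.argsort(scores)[-num_elites:]]       # highest-value samples
        mean = elites.mean(axis=0)                               # refit p(A) to the elites
        std = elites.std(axis=0) + 1e-6                          # small floor avoids collapse
    return mean
```

The small constant added to the standard deviation is only there to keep the fitted distribution from collapsing to a point.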
Using Derivatives
- Differential Dynamic Programming (DDP)
- Linear Quadratic Regulator (LQR); a minimal sketch follows the list
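As a hedged illustration of the LQR case: for linear dynamics \(x_{t+1} = A x_t + B u_t\) and quadratic cost \(\sum_t (x_t^\top Q x_t + u_t^\top R u_t)\), the optimal controller is linear feedback \(u_t = -K_t x_t\), computed by a backward Riccati recursion. A minimal finite-horizon sketch (the matrices and horizon are assumed inputs):

```python
import numpy as np

def finite_horizon_lqr(A, B, Q, R, T):
    """Backward Riccati recursion: returns gains K_t so that u_t = -K_t x_t."""
    P = Q.copy()                                            # cost-to-go Hessian at the final step
    gains = []
    for _ in range(T):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # optimal gain for this step
        P = Q + A.T @ P @ (A - B @ K)                       # propagate cost-to-go backward
        gains.append(K)
    return gains[::-1]                                      # ordered from t = 0 to T - 1
```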