Jethro's Braindump

Optimal Control and Planning

How can we make decisions if we know the dynamics of the environment?

Stochastic optimization

Stochastic optimization for open-loop planning:

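We wish to choose $a_1, \dots, a_T = \arg\max_{a_1, \dots, a_T} J(a_1, \dots, a_T)$ for some objective $J$. In what follows, $A$ denotes a whole action sequence $(a_1, \dots, a_T)$.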

Guess and Check

An extremely simple method that is easy to parallelize:

  1. pick $A_1, \dots, A_N$ from some distribution
  2. choose $A_i$ with $i = \arg\max_i J(A_i)$.
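
The two steps above can be sketched in a few lines of NumPy. This is only an illustration under assumptions that are not in the notes: the sampling distribution is uniform on `[low, high]`, and `toy_objective` is a made-up stand-in for $J$ that rewards actions close to 0.5. The names `guess_and_check`, `low`, `high`, and `num_samples` are hypothetical.

```python
import numpy as np


def guess_and_check(J, horizon, action_dim, num_samples=1000,
                    low=-1.0, high=1.0, rng=None):
    """Sample N action sequences and return the one with the highest J."""
    rng = np.random.default_rng() if rng is None else rng
    # Step 1: draw A_1, ..., A_N; each candidate has shape (horizon, action_dim).
    # Assumption: uniform sampling; the notes only say "some distribution".
    candidates = rng.uniform(low, high, size=(num_samples, horizon, action_dim))
    # Step 2: score every candidate and keep the argmax.
    scores = np.array([J(A) for A in candidates])
    best = int(np.argmax(scores))
    return candidates[best], float(scores[best])


def toy_objective(A):
    """Hypothetical objective: highest (zero) when every action equals 0.5."""
    return -float(np.sum((A - 0.5) ** 2))


best_plan, best_score = guess_and_check(
    toy_objective, horizon=5, action_dim=2,
    num_samples=2000, rng=np.random.default_rng(0),
)
```

Each candidate is scored independently, so the loop over `candidates` is the part that could be distributed across workers.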

Cross-entropy Method (CEM)

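  1. pick $A_1, \dots, A_N$ from $p(A)$ (starting from some initial distribution)
  2. evaluate $J(A_1), \dots, J(A_N)$
  3. pick the elites $A_{i_1}, \dots, A_{i_M}$ with the highest value, where $M < N$
  4. fit the distribution $p(A)$ to the elites, then repeat from step 1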

With continuous inputs, a multivariate normal distribution is a common choice for $p(A)$. In the discrete case, Monte Carlo Tree Search is typically used.
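
A minimal sketch of the CEM loop for continuous actions follows. It makes assumptions beyond the notes: $p(A)$ is a Gaussian with diagonal covariance (a simplification of the general multivariate normal), the initial distribution has zero mean, and the final mean is returned as the plan. The names `cem_plan`, `num_elites`, `num_iters`, and `init_std`, as well as the toy objective, are hypothetical.

```python
import numpy as np


def cem_plan(J, horizon, action_dim, num_samples=500, num_elites=50,
             num_iters=10, init_std=1.0, rng=None):
    """Cross-entropy method with a diagonal Gaussian over action sequences."""
    rng = np.random.default_rng() if rng is None else rng
    dim = horizon * action_dim
    mean = np.zeros(dim)            # assumed initial distribution: N(0, init_std^2 I)
    std = np.full(dim, init_std)
    for _ in range(num_iters):
        # Step 1: sample A_1, ..., A_N from the current p(A).
        samples = mean + std * rng.standard_normal((num_samples, dim))
        # Step 2: evaluate J on every sample.
        scores = np.array([J(s.reshape(horizon, action_dim)) for s in samples])
        # Step 3: keep the M samples with the highest value.
        elites = samples[np.argsort(scores)[-num_elites:]]
        # Step 4: refit p(A) to the elites (small floor keeps std positive).
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6
    return mean.reshape(horizon, action_dim)


def toy_objective(A):
    """Hypothetical objective: highest (zero) when every action equals 0.5."""
    return -float(np.sum((A - 0.5) ** 2))


plan = cem_plan(toy_objective, horizon=5, action_dim=2,
                rng=np.random.default_rng(0))
```

Compared with guess and check, the refit in step 4 concentrates later samples around the regions that scored well, so the same sample budget is spent more selectively.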

Using Derivatives

  • Differential Dynamic Programming (DDP)
  • Linear Quadratic Regulator (LQR)