Options Framework

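An option is a temporally extended action, defined as a tuple containing:

An initiation function (precondition): the states in which the option may be started
An internal policy (behaviour): how actions are chosen while the option is executing
A termination function (post-condition): the probability that the option ends in each state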
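
As a minimal sketch, the three components above can be written as a small data structure plus a loop that runs the option to termination. All names here (Option, can_initiate, termination_prob, run_option, corridor_step) are hypothetical and chosen only for illustration; the environment is an assumed six-state corridor, not something taken from the cited paper.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = int
Action = int


@dataclass(frozen=True)
class Option:
    can_initiate: Callable[[State], bool]        # initiation function (precondition)
    policy: Callable[[State], Action]            # internal policy (behaviour)
    termination_prob: Callable[[State], float]   # termination function (post-condition)


def run_option(
    option: Option,
    state: State,
    step: Callable[[State, Action], Tuple[State, float]],
    rng: random.Random,
) -> Tuple[State, List[float]]:
    """Follow the option's policy until it terminates.

    Returns the state in which the option terminated and the rewards collected.
    """
    if not option.can_initiate(state):
        raise ValueError(f"option cannot be initiated in state {state}")
    rewards: List[float] = []
    while True:
        action = option.policy(state)
        state, reward = step(state, action)
        rewards.append(reward)
        # Terminate stochastically according to the termination function.
        if rng.random() < option.termination_prob(state):
            return state, rewards


# Assumed toy environment: corridor with states 0..5, action +1 moves right.
def corridor_step(state: State, action: Action) -> Tuple[State, float]:
    return min(state + action, 5), -1.0


# Hypothetical option: "walk right until the end of the corridor".
walk_right = Option(
    can_initiate=lambda s: s < 5,
    policy=lambda s: 1,
    termination_prob=lambda s: 1.0 if s == 5 else 0.0,
)

final_state, collected = run_option(walk_right, 2, corridor_step, random.Random(0))
```

From the agent's point of view, calling run_option is a single decision that happens to last several time steps, which is what allows options to be used wherever a primitive action would be.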

Treating options as single, temporally extended actions helps put learning and planning algorithms at the same level of abstraction. (Stolle and Precup, n.d.)

Models vs Actions

Models of actions consist of the immediate reward and the transition probabilities to the next state.
Models of options consist of the expected reward accumulated until termination, and the (discounted) transition probabilities to the state in which the option terminates.
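
Written out, the two kinds of model have the same shape. The notation below is an assumption (it follows the convention commonly used for options, with discount factor \gamma and \mathcal{E}(o, s, t) denoting the event that option o is initiated in state s at time t); it does not appear in the original note.

```latex
% One-step model of a primitive action a
r_s^a = \mathbb{E}\left[ R_{t+1} \mid S_t = s, A_t = a \right]
\qquad
p_{ss'}^a = \Pr\left( S_{t+1} = s' \mid S_t = s, A_t = a \right)

% Multi-step model of an option o that terminates after k steps
r_s^o = \mathbb{E}\left[ R_{t+1} + \gamma R_{t+2} + \dots + \gamma^{k-1} R_{t+k} \mid \mathcal{E}(o, s, t) \right]
\qquad
p_{ss'}^o = \sum_{k=1}^{\infty} \gamma^{k} \Pr\left( o \text{ terminates in } s' \text{ after } k \text{ steps} \mid \mathcal{E}(o, s, t) \right)
```

Because the discount is folded into p_{ss'}^o, its entries generally sum to less than one over s'; this is the sense in which the transition part of an option model is "discounted".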

Option models look a lot like value functions: they satisfy Bellman-like equations, so the TD error can be used to train them (Temporal Difference Learning).
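
To illustrate the link to TD learning, here is a hedged sketch of a one-step TD-style update for an option model. It assumes a deterministic six-state corridor and a single option that walks right until the last state; the names (beta, option_step, learn_option_model) and the constants are hypothetical. The update rule is the standard intra-option form, in which the bootstrap term is weighted by the probability that the option continues.

```python
GAMMA = 0.9     # discount factor (assumed)
ALPHA = 0.5     # step size (assumed)
N_STATES = 6
GOAL = 5        # the option terminates here


def beta(state):
    """Termination function: probability of the option ending in `state`."""
    return 1.0 if state == GOAL else 0.0


def option_step(state):
    """One step of the option's internal policy (always move right), reward -1."""
    return min(state + 1, GOAL), -1.0


def learn_option_model(n_sweeps=200):
    # reward_model[s]: expected discounted reward until termination from s
    # arrival_model[s]: discounted probability of terminating in GOAL from s
    reward_model = [0.0] * N_STATES
    arrival_model = [0.0] * N_STATES
    for _ in range(n_sweeps):
        for s in range(GOAL):
            s_next, r = option_step(s)
            cont = 1.0 - beta(s_next)  # probability the option keeps running

            # TD error for the reward part of the model
            delta_r = r + GAMMA * cont * reward_model[s_next] - reward_model[s]
            reward_model[s] += ALPHA * delta_r

            # TD error for the (discounted) transition part of the model
            target_p = GAMMA * (beta(s_next) + cont * arrival_model[s_next])
            arrival_model[s] += ALPHA * (target_p - arrival_model[s])
    return reward_model, arrival_model


reward_model, arrival_model = learn_option_model()
```

Starting three steps from the goal (state 2), the learned values should approach -(1 + 0.9 + 0.81) = -2.71 for the reward part and 0.9^3 = 0.729 for the transition part, matching the definitions of an option model directly.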

Generalized Value Functions

Bibliography

Stolle, Martin, and Doina Precup. n.d. “Learning Options in Reinforcement Learning.” In International Symposium on Abstraction, Reformulation, and Approximation, 212–23. Springer.
