Information-Theoretic Reinforcement Learning
Can we learn without any reward function at all?
Identities
- entropy
- mutual information
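The two identities above can be checked numerically on a small discrete example. The following is a minimal sketch using only the standard library; the function names and the 2x2 joint table are illustrative assumptions, not something taken from these notes. It computes entropy and mutual information in nats and compares the direct definition of I(X;Y) against the identity I(X;Y) = H(Y) - H(Y|X).

```python
import math

def entropy(p):
    """Shannon entropy H(X) = -sum_x p(x) log p(x), in nats."""
    return -sum(px * math.log(px) for px in p if px > 0)

def mutual_information(joint):
    """I(X;Y) = sum_{x,y} p(x,y) log[p(x,y) / (p(x) p(y))] for a table joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal over rows
    py = [sum(col) for col in zip(*joint)]    # marginal over columns
    return sum(
        pxy * math.log(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

def conditional_entropy(joint):
    """H(Y|X) = H(X,Y) - H(X), using the chain rule for entropy."""
    px = [sum(row) for row in joint]
    flat = [pxy for row in joint for pxy in row]
    return entropy(flat) - entropy(px)

# Hypothetical joint distribution over two correlated binary variables.
joint = [[0.4, 0.1], [0.1, 0.4]]
py = [sum(col) for col in zip(*joint)]

mi = mutual_information(joint)
# Identity: I(X;Y) = H(Y) - H(Y|X)
identity_rhs = entropy(py) - conditional_entropy(joint)
```

Here `mi` and `identity_rhs` should agree up to floating-point error, and an independent joint table (every entry equal to the product of its marginals) should give zero mutual information.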
Information theoretic quantities in RL
- state marginal distribution of policy
- state marginal entropy of policy
- empowerment
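For reference, the three quantities listed above are commonly written as follows. The notation is an assumption (finite horizon T, states s_t, actions a_t, policy π) and may differ from the source lecture. Empowerment is sketched here as the mutual information between the current action and the next state; more formal treatments define it as the maximum of this quantity over action distributions.

```latex
% Notation is an assumption: finite horizon T, states s_t, actions a_t, policy \pi.
\begin{align*}
p_\pi(s) &= \frac{1}{T} \sum_{t=1}^{T} p(s_t = s \mid \pi)
  && \text{(state marginal distribution)} \\
\mathcal{H}\big(p_\pi(s)\big) &= -\mathbb{E}_{s \sim p_\pi(s)}\big[\log p_\pi(s)\big]
  && \text{(state marginal entropy)} \\
\mathcal{I}(s_{t+1}; a_t) &= \mathcal{H}(s_{t+1}) - \mathcal{H}(s_{t+1} \mid a_t)
  && \text{(empowerment)}
\end{align*}
```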
Papers
- Skew-Fit (Pong et al., n.d.)
- Diversity Is All You Need (Eysenbach et al., n.d.)
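As a rough illustration of how these two papers use the quantities above: Skew-Fit reweights sampled states by a negative power of their estimated density to push the goal distribution toward higher entropy, and DIAYN rewards a skill z for visiting states from which a discriminator can identify it. The sketch below is a simplification under stated assumptions: the function names are hypothetical, the density estimates and discriminator log-probabilities are passed in as plain numbers rather than produced by learned models, and the skill prior p(z) is taken to be uniform.

```python
import math

def skew_fit_weights(densities, alpha=-1.0):
    """Normalized weights w_i proportional to p(s_i)**alpha, with alpha in [-1, 0).

    Rarely visited states (low density) receive larger weight.
    """
    raw = [p ** alpha for p in densities]
    total = sum(raw)
    return [w / total for w in raw]

def diayn_reward(log_q_z_given_s, num_skills):
    """Pseudo-reward log q(z|s) - log p(z), assuming a uniform prior over skills."""
    log_p_z = math.log(1.0 / num_skills)
    return log_q_z_given_s - log_p_z

# Hypothetical density estimates for three sampled states.
weights = skew_fit_weights([0.5, 0.25, 0.25], alpha=-1.0)

# A discriminator at chance level (q = 1/4 with 4 skills) yields zero reward.
chance_reward = diayn_reward(math.log(0.25), num_skills=4)
# A confident, correct discriminator yields a positive reward.
confident_reward = diayn_reward(math.log(0.9), num_skills=4)
```

With `alpha=-1.0` the two less-visited states each receive twice the weight of the most-visited one, and the DIAYN reward is positive only when the discriminator does better than chance at identifying the skill.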
Bibliography
Eysenbach, Benjamin, Abhishek Gupta, Julian Ibarz, and Sergey Levine. n.d. “Diversity Is All You Need: Learning Skills without a Reward Function.” http://arxiv.org/abs/1802.06070v6.
Pong, Vitchyr H., Murtaza Dalal, Steven Lin, Ashvin Nair, Shikhar Bahl, and Sergey Levine. n.d. “Skew-Fit: State-Covering Self-Supervised Reinforcement Learning.” http://arxiv.org/abs/1903.03698v2.