Algorithmic Balancing of Familiarity, Similarity, & Discovery in Music Recommendations
GPT-2 Preference Learning for Music Generation § Bradley-Terry Preference Learning
Statistical Notes § Dealing With All-Or-Nothing Unreliability of Data
Computational mechanisms of curiosity and goal-directed exploration
Long-Term Value of Exploration: Measurements, Findings and Algorithms
(More) Efficient Reinforcement Learning via Posterior Sampling