“Generating and Adapting to Diverse Ad-Hoc Cooperation Agents in Hanabi”, 2020-04-28:
Hanabi is a cooperative game that brings the problem of modeling other players to the forefront. In this game, coordinated groups of players can leverage pre-established conventions to great effect, but playing in an ad-hoc setting requires an agent to adapt to its partner's strategies with no previous coordination. Evaluating an agent in this setting requires a diverse population of potential partners, but so far, the behavioral diversity of agents has not been considered in a systematic way.
This paper proposes Quality Diversity algorithms as a promising class of algorithms to generate diverse populations for this purpose, and generates a population of diverse Hanabi agents using MAP-Elites. We also postulate that agents can benefit from a diverse population during training and implement a simple “meta-strategy” for adapting to an agent's perceived behavioral niche.
We show that this meta-strategy can outperform generalist strategies, even outside the population it was trained with, provided its partner's behavioral niche can be correctly inferred. In practice, however, a partner's behavior both depends on and interferes with the meta-agent's own behavior, which suggests an avenue for future research: characterizing another agent's behavior during gameplay.
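To make the two ideas in the abstract concrete, the following is a minimal sketch of a MAP-Elites loop and a niche-based lookup, written for a toy problem rather than Hanabi. Everything in it is an assumption made for illustration: the two-number genome, the `evaluate` function standing in for actual game play, the two-dimensional behavior descriptor, the grid size, and the names `niche_of`, `map_elites`, and `meta_strategy`. In particular, the lookup simply returns the elite stored in the partner's cell as a placeholder for whatever response the paper trains for each niche.

```python
import random


def evaluate(genome):
    """Toy stand-in for playing games: returns (fitness, behavior descriptor)."""
    x, y = genome
    fitness = 1.0 - ((x - 0.5) ** 2 + (y - 0.5) ** 2)
    behavior = (x, y)  # hypothetical 2-D descriptor, each value in [0, 1]
    return fitness, behavior


def niche_of(behavior, bins):
    """Map a behavior descriptor in [0, 1]^d to a discrete grid cell."""
    return tuple(min(int(b * bins), bins - 1) for b in behavior)


def mutate(genome, rng, sigma=0.1):
    """Gaussian perturbation, clamped back into [0, 1]."""
    return tuple(min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in genome)


def map_elites(iterations=2000, bins=5, n_init=50, seed=0):
    """Keep the best-scoring genome (the elite) found in each behavior cell."""
    rng = random.Random(seed)
    archive = {}  # cell -> (fitness, genome)
    for i in range(iterations):
        if i < n_init or not archive:
            genome = (rng.random(), rng.random())  # random bootstrap
        else:
            _, parent = rng.choice(list(archive.values()))
            genome = mutate(parent, rng)
        fitness, behavior = evaluate(genome)
        cell = niche_of(behavior, bins)
        # Replace the occupant only if the cell is empty or the newcomer is better.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, genome)
    return archive


def meta_strategy(partner_behavior, archive, bins):
    """Placeholder lookup: choose a response from the partner's inferred niche."""
    cell = niche_of(partner_behavior, bins)
    if cell in archive:
        return archive[cell][1]
    # Generalist fallback: the single best elite across all cells.
    return max(archive.values(), key=lambda entry: entry[0])[1]


if __name__ == "__main__":
    archive = map_elites()
    print(len(archive), "cells filled")
    print(meta_strategy((0.9, 0.1), archive, bins=5))
```

The lookup is only as good as the descriptor passed to it, which is the difficulty the abstract points to: in real play the partner's niche has to be estimated from observed behavior, and that behavior is itself shaped by what the meta-agent does.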