“Hierarchical Decision Making by Generating and Following Natural Language Instructions”, 2019-06-03:
We explore using latent natural language instructions as an expressive and compositional representation of complex actions for hierarchical decision making.
Rather than directly selecting micro-actions, our agent first generates a latent plan in natural language, which is then executed by a separate model. We introduce a challenging real-time strategy game environment in which the actions of a large number of units must be coordinated across long time scales. We gather a dataset of 76 thousand pairs of instructions and executions from human play, and train instructor and executor models.
Experiments show that models using natural language as a latent variable outperform models that directly imitate human actions. The compositional structure of language proves crucial to its effectiveness for action representation.
We also release our code, models and data.