“T0: Multitask Prompted Training Enables Zero-Shot Task Generalization”, 2021-10-15:
Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks. It has been hypothesized that this is a consequence of implicit multitask learning in language model training. Can zero-shot generalization instead be directly induced by explicit multitask learning?
To test this question at scale, we develop a system for easily mapping general natural language tasks into a human-readable prompted form. We convert a large set of supervised datasets, each with multiple prompts using varying natural language. These prompted datasets allow for benchmarking the ability of a model to perform completely unseen tasks specified in natural language. We fine-tune a pretrained encoder-decoder T5 model on this multitask mixture covering a wide variety of tasks.
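As a rough illustration of what mapping a supervised dataset into prompted form might look like, the sketch below applies two differently worded templates to a single natural language inference example, yielding one (input, target) text pair per template. The `PromptTemplate` class, the `to_prompted` helper, the template wordings, and the label words are all hypothetical and chosen for illustration; they are not the authors' tooling or prompts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptTemplate:
    """Hypothetical template: renders one labeled example as (input, target) text."""
    name: str
    input_pattern: str
    target_pattern: str

    def apply(self, fields: dict) -> tuple[str, str]:
        # str.format ignores unused keyword arguments, so extra fields are harmless.
        return (
            self.input_pattern.format(**fields),
            self.target_pattern.format(**fields),
        )


# Assumed verbalization of integer class labels (illustrative only).
LABEL_WORDS = ["yes", "maybe", "no"]


def to_prompted(example: dict, templates: list[PromptTemplate]) -> list[tuple[str, str]]:
    """Expand one supervised example into one text-to-text pair per template."""
    fields = dict(example, answer=LABEL_WORDS[example["label"]])
    return [t.apply(fields) for t in templates]


# Two prompts with different wording for the same underlying task.
templates = [
    PromptTemplate(
        name="does_this_imply",
        input_pattern='{premise} Question: does this imply that "{hypothesis}"? Yes, no, or maybe?',
        target_pattern="{answer}",
    ),
    PromptTemplate(
        name="given_can_we_conclude",
        input_pattern='Given that "{premise}", can we conclude that "{hypothesis}"?',
        target_pattern="{answer}",
    ),
]

example = {
    "premise": "A dog is running in the park.",
    "hypothesis": "An animal is outside.",
    "label": 0,
}

pairs = to_prompted(example, templates)
for prompt_text, target_text in pairs:
    print(prompt_text, "->", target_text)
```

Because every task is reduced to plain text in and plain text out, pairs like these from many datasets can be mixed into a single fine-tuning set for an encoder-decoder model, and a held-out task can be posed at test time simply by writing a new prompt.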
This T0 model attains strong zero-shot performance on several standard datasets, often outperforming models up to 16× its size. Further, our approach attains strong performance on a subset of tasks from the BIG-Bench benchmark, outperforming models up to 6× its size.
All prompts and trained models are available on GitHub.