“GPT-1: Improving Language Understanding With Unsupervised Learning”, OpenAI, 2018-06-11:

We’ve obtained state-of-the-art results on a suite of diverse language tasks with a scalable, task-agnostic system, which we’re also releasing.

Our approach is a combination of two existing ideas: transformers and unsupervised pre-training.

These results provide a convincing example that pairing supervised learning methods with unsupervised pre-training works very well; many have explored this idea in the past, and we hope our result motivates further research into applying it to larger and more diverse datasets.
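The two-stage recipe described above — unsupervised pre-training on unlabeled sequences, then supervised fine-tuning that reuses the pre-trained representations — can be sketched in miniature. This is a toy numpy illustration under stated assumptions, not the post's transformer: the bigram next-token objective, the synthetic corpus and labels, and all variable names are illustrative stand-ins for the real model and data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "corpus": 200 sequences of 5 token ids over a 10-word vocabulary.
vocab_size, dim = 10, 8
corpus = rng.integers(0, vocab_size, size=(200, 5))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Stage 1: unsupervised pre-training with a next-token (here: bigram) objective.
# E maps tokens to vectors; W_out maps vectors to next-token logits.
E = rng.normal(scale=0.1, size=(vocab_size, dim))
W_out = rng.normal(scale=0.1, size=(dim, vocab_size))
lr = 0.1
for _ in range(50):
    for seq in corpus:
        for t in range(len(seq) - 1):
            x, y = seq[t], seq[t + 1]
            h = E[x]                       # representation of the context token
            p = softmax(h @ W_out)         # predicted next-token distribution
            g = p.copy(); g[y] -= 1.0      # cross-entropy gradient w.r.t. logits
            W_out -= lr * np.outer(h, g)
            E[x]  -= lr * (W_out @ g)

# Stage 2: supervised fine-tuning — reuse the pre-trained embeddings E as
# features for a downstream binary classification task with a labeled set.
labels = corpus.sum(axis=1) % 2            # synthetic labels for illustration
feats = E[corpus].mean(axis=1)             # mean-pooled pre-trained embeddings
w, b = np.zeros(dim), 0.0
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))   # logistic classifier head
    g = p - labels
    w -= 0.1 * feats.T @ g / len(labels)
    b -= 0.1 * g.mean()

acc = ((feats @ w + b > 0) == labels).mean()
print(f"fine-tuned accuracy: {acc:.2f}")
```

The point of the sketch is the division of labor: stage 1 never sees labels, stage 2 never trains from scratch — it inherits the representations stage 1 learned from raw sequences, which is the pairing the post argues works so well.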