From a text-generation perspective, the included demos were very impressive: the text is coherent over a long horizon, and grammar and punctuation are near-perfect.
[Sample of GPT-2-1.5b output from original OA announcement: the “anti-recycling argument” text sample.]
At the same time, OpenAI open-sourced on GitHub the Python code that allowed anyone to download the model (albeit only smaller versions, out of concern that the full model could be abused to mass-generate fake news), along with the TensorFlow code to load the downloaded model and generate predictions.
Neil Sheppard created a fork of OpenAI’s repo that adds code for finetuning the existing OpenAI model on custom datasets. Soon after, a notebook was created which can be copied into Google Colaboratory and clones Sheppard’s repo to finetune GPT-2 backed by a free GPU. From there, the proliferation of GPT-2-generated text took off: researchers such as Gwern Branwen made GPT-2 Poetry and Janelle Shane made GPT-2 Dungeons and Dragons character bios.
I waited to see if anyone would make a tool to help streamline this finetuning and text generation workflow, à la textgenrnn, which I had made for recurrent neural network-based text generation. Months later, no one did. So I did it myself. Enter gpt-2-simple, a Python package which wraps Sheppard’s finetuning code in a functional interface and adds many utilities for model management and generation control.
Thanks to gpt-2-simple and this Colaboratory Notebook, you can easily finetune GPT-2 on your own dataset with a simple function, and generate text to your own specifications!
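For reference, here is a minimal sketch of what that workflow looks like with gpt-2-simple; exact argument names and defaults may vary between package versions, and `shakespeare.txt` is a hypothetical stand-in for your own plain-text dataset.

```python
import gpt_2_simple as gpt2

# Download a small pretrained GPT-2 model.
gpt2.download_gpt2(model_name="124M")

# Finetune on your own plain-text file with a single function call.
sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              "shakespeare.txt",   # hypothetical training dataset
              model_name="124M",
              steps=1000)          # more steps = longer training

# Generate text to your own specifications.
gpt2.generate(sess,
              length=250,          # tokens per sample
              temperature=0.7,     # lower = more conservative text
              prefix="ROMEO:",     # seed the generation with a prompt
              nsamples=5)
```

The same session-based interface also handles saving and reloading finetuned checkpoints, so you can train once and generate later without repeating the finetuning step.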