Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
A Multi-Level Attention Model for Evidence-Based Fact Checking
FEVER: a large-scale dataset for Fact Extraction and VERification
BPEs: Neural Machine Translation of Rare Words with Subword Units
GPT-1: Improving Language Understanding by Generative Pre-Training § Model specifications