Building a Large Annotated Corpus of English: The Penn Treebank
https://blog.salesforceairesearch.com/the-wikitext-long-term-dependency-language-modeling-dataset/
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Recurrent Neural Network Based Language Model § Dynamic Evaluation