Bibliography (275):

  1. RNN Metadata for Mimicking Author Style

  2. Poems

  3. GPT-3 Creative Fiction

  4. This Waifu Does Not Exist § Text

  5. A Very Unlikely Chess Game

  6. Update: Upgrading to 1.5B GPT-2, and adding 22 new subreddit-bots

  7. GPT-3: Language Models are Few-Shot Learners

  8. Better Language Models and Their Implications

  9. https://research.google/blog/transformer-a-novel-neural-network-architecture-for-language-understanding/

  10. The Illustrated Transformer

  11. The Illustrated GPT-2 (Visualizing Transformer Language Models)

  12. The Transformer—Attention Is All You Need.

  13. https://blog.floydhub.com/the-transformer-in-pytorch/

  14. https://e2eml.school/transformers.html

  15. Attention Is All You Need

  16. The Annotated Transformer

  17. Self-Attention with Relative Position Representations

  18. Character-Level Language Modeling with Deeper Self-Attention

  19. Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

  20. Transformer-XL—Combining Transformers and RNNs Into a State-Of-The-Art Language Model

  21. Understanding BERT Transformer: Attention Isn’t All You Need

  22. Transformers are a very exciting family of machine learning architectures

  23. https://amaarora.github.io/2020/02/18/annotatedGPT2.html

  24. The Transformer Family

  25. The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives

  26. Karpathy/minGPT: A Minimal PyTorch Re-Implementation of the OpenAI GPT (Generative Pretrained Transformer) Training

  27. RASP: Thinking Like Transformers

  28. Efficient Attention: Breaking The Quadratic Transformer Bottleneck

  29. ‘MLP NN’ directory

  30. GPT-1: Improving Language Understanding with Unsupervised Learning

  31. Language Modeling State-of-the-art leaderboards

  32. Language Models are Unsupervised Multitask Learners

  33. Humans Who Are Not Concentrating Are Not General Intelligences

  34. Gpt-2-Samples

  35. LM Explorer (alpha)

  36. GPT-2: 6-Month Follow-Up

  37. GPT-2: 1.5B Release

  38. OpenGPT-2: We Replicated GPT-2-1.5b Because You Can Too

  39. https://colab.research.google.com/drive/1BXry0kcm869-RVHHiY6NZmY9uBzbkf1Q

  40. GROVER: Defending Against Neural Fake News

  41. XLNet: Generalized Autoregressive Pretraining for Language Understanding

  42. MegatronLM: Training Billion+ Parameter Language Models Using GPU Model Parallelism

  43. T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

  44. https://colab.research.google.com/drive/1-ROO7L09EupLFLQM-TWgDHa5-FIOdLLh

  45. DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation

  46. This Waifu Does Not Exist

  47. Talking to Myself or How I Trained GPT-2-1.5b for Rubber Ducking Using My Facebook Chat Data: Using Only Google Colab

  48. https://www.thisstorydoesnotexist.com/

  49. https://web.archive.org/web/20200110223938/https://stackroboflow.com/

  50. Howl

  51. An Eternal Howl

  52. https://www.reddit.com/r/slatestarcodex/comments/as8ke7/an_eternal_howl/

  53. GPT-2 Howl

  54. GPT-2 Writes a Shelley Poem

  55. GPT-2 As Step Toward General Intelligence

  56. First line of famous poems continued by GPT-2

  57. gpt-2-poetry

  58. Ask GPT-2

  59. Ask GPT-2

  60. FridAI: ‘Water, water, everywhere’, as read by Artificial Intelligence

  61. The Poetry Machine

  62. GPT-based Generation for Classical Chinese Poetry

  63. Three More GPT-2 Poems

  64. https://www.reddit.com/r/MachineLearning/comments/coc09l/p_these_lyrics_do_not_exist/

  65. Testing The Limits of GROVER The Neural Fake News Detector. Can It Write Fiction? Can It Write Riddles?

  66. https://www.reddit.com/r/SubSimulatorGPT2Meta/comments/ccvspt/update_experimenting_with_generating_hybrid/

  67. CTRL: A Conditional Transformer Language Model For Controllable Generation

  68. Conditional Transformer Language Model for Controllable Generation

  69. https://papergains.co/pdfs/Transformer_Poetry-978-1-7341647-0-1.pdf#page=3

  70. 345M-GPT-2 After James Wright: Can AI Generate Convincing Contemporary Poetry?

  71. GPT-2 AI Poetry Generation: Writing like Donne

  72. Writing the Next American Hit: Using GPT-2 to Explore the Possibility of Creating Successful AI-Generated Song Lyrics

  73. How to Train It

  74. Nshepperd/gpt-2: Code for the Paper "Language Models Are Unsupervised Multitask Learners"

  75. ConnorJL/GPT2: An Implementation of Training for GPT-2, Supports TPUs

  76. Replicating GPT-2-1.5B

  77. Addendum: Evaluation of My Model

  78. A Corpus of Poetry from Project Gutenberg

  79. Dataset Search

  80. Poems from Poetryfoundation.org

  81. A Small Module Meant for Use in Text Generators That Lets You Filter Strings for Bad Words

  82. Success

  83. Unhandled Arguments Checked After Execution, Not Before

  84. The Curious Case of Neural Text Degeneration

  85. https://www.trentonbricken.com/Tail-Free-Sampling/

  86. The Unreasonable Effectiveness of Recurrent Neural Networks

  87. 2019-03-06-gwern-gpt2-poetry-projectgutenberg-network-519407.tar.xz

  88. 2019-03-06-gpt2-poetry-1000samples.txt

  89. https://x.com/theshawwn

  90. Kaggle: Your Home for Data Science

  91. RNN Metadata for Mimicking Author Style § Inline Metadata Trick

  92. 2019-10-18-Poetryfoundation-Formatted.txt

  93. 2019-10-17-117m-poetry-cleanprojectgutenberg-samples.txt

  94. 2019-10-19-117m-poetryfoundation-samples.txt

  95. 2019-10-19-gwern-gpt2-poetry-pgclean-117m.tar.xz

  96. 2019-03-06-gwern-gpt2-poetry-prefix-projectgutenberg-network-224474.tar.xz

  97. 2019-03-06-gpt2-poetry-prefix-1000samples.txt

  98. To a Skylark by Percy Bysshe Shelley

  99. Gwern’s AI-Generated Poetry

  100. Overview for Starspawn0

  101. Semantic projection: recovering human knowledge of multiple, distinct object features from word embeddings

  102. Distributional Vectors Encode Referential Attributes

  103. Dynamic Word Embeddings for Evolving Semantic Discovery

  104. Verb Physics: Relative Physical Knowledge of Actions and Objects

  105. Language Models Represent Space and Time

  106. Language Encodes Geographical Information

  107. Grounding the Ungrounded: Estimating Locations of Unknown Place Names from Linguistic Associations and Grounded Representations

  108. Books by Pope, Alexander (Sorted by Popularity)

  109. 2019-03-16-gpt2-poetry-prefix-jabberwocky-100samples.txt

  110. The Jingle Book by Carolyn Wells

  111. https://openai.com/index/better-language-models/#update

  112. UniLM: Unified Language Model Pre-training for Natural Language Understanding and Generation

  113. Fitting Larger Networks into Memory: TL;DR: We Release the Python/TensorFlow Package openai/gradient-checkpointing, That Lets You Fit 10× Larger Neural Nets into Memory at the Cost of an Additional 20% Computation Time

  114. Generating Long Sequences with Sparse Transformers

  115. Training Deep Nets with Sublinear Memory Cost

  116. Memory-Efficient Backpropagation through Time

  117. MuseNet: a deep neural network that can generate 4-minute musical compositions with 10 different instruments, and can combine styles from country to Mozart to the Beatles

  118. Why Momentum Really Works

  119. Cyclical Learning Rates for Training Neural Networks

  120. SGDR: Stochastic Gradient Descent with Warm Restarts

  121. Averaging Weights Leads to Wider Optima and Better Generalization

  122. 2019-05-13-gwern-gpt2-poetry-345m.tar.xz

  123. 2019-05-13-gpt2-poetry-345m-5000samples.txt

  124. Reassuring

  125. This Is a Python Script As Described in XKCD #1263: ‘Reassuring’. It Generates Thousands of Reassuring Parables about Things Humans Are Better Than Computers at Every Second.

  126. 2019-05-24-gpt2-poetry-yeatssecondcoming-500completions.txt

  127. https://www.awanderingmind.blog/posts/2024-01-14-tao-te-ching-by-an-llm.html

  128. https://x.com/HW

  129. https://web.archive.org/web/20200209040154/https://decaut.org/situ/index.php/ttc-compilation/

  130. 2019-07-19-taotehching-ch1-1ksamples.txt

  131. 2019-07-21-gwern-gpt2-345m-taotehching-all.tar.xz

  132. 2019-07-21-taotehching-all-1ksamples.txt

  133. 2019-07-22-gpt2-345m-taotehching-all-ch181.tar.xz

  134. Release Strategies and the Social Impacts of Language Models

  135. Swarm Training: We Demonstrate a New Technique to Train ML Models Using Dozens of Independent TPUs.

  136. Shawwn/gpt-2: Code for the Paper "Language Models Are Unsupervised Multitask Learners"

  137. Pricing

  138. Danbooru2019: A Large-Scale Anime Image Database With 3.69m+ Images Annotated With 108m+ Tags; It Can Be Useful for Machine Learning Purposes Such as Image Recognition and Generation

  139. The Google SRE Handbook: Chapter 4—Service Level Objectives

  140. HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent

  141. TensorFlow Research Cloud (TRC): Accelerate your cutting-edge machine learning research with free Cloud TPUs

  142. GPT-1: Improving Language Understanding by Generative Pre-Training § Model specifications

  143. ftfy: fixes text for you

  144. Adafactor: Adaptive Learning Rates with Sublinear Memory Cost

  145. U B U W E B :: Racter

  146. Language Models Are Unsupervised Multitask Learners § Experiments

  147. 2019-12-13-gpt21.5b-poetry-samples-topp090.txt

  148. 2019-12-15-gpt21.5b-poetry-samples-topp090.txt

  149. 2019-12-16-gpt21.5b-poetry-samples-topp080.txt

  150. 2019-12-18-gpt21.5b-poetry-samples-topp080.txt

  151. Greg Brockman: OpenAI and AGI

  152. Figure F.1: Four Uncurated Completions from a Context Suggesting the Model Compose a Poem in the Style of Wallace Stevens With the Title ‘Shadows on the Way’

  153. https://github.com/karpathy/char-rnn/issues/138

  154. https://news.ycombinator.com/item?id=21335120

  155. true_poetry: Poetry generator by GPT-2 with meter and rhyme constraints

  156. MuZero: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

  157. Neural Text Generation with Unlikelihood Training

  158. Do Massively Pretrained Language Models Make Better Storytellers?

  159. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

  160. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

  161. Why Tool AIs Want to Be Agent AIs

  162. Deep reinforcement learning from human preferences

  163. CAN: Creative Adversarial Networks, Generating "Art" by Learning About Styles and Deviating from Style Norms

  164. AlphaStar: Mastering the Real-Time Strategy Game StarCraft II

  165. https://www.reddit.com/r/slatestarcodex/comments/b1b47h/gwerns_aigenerated_poetry/

  166. https://news.ycombinator.com/item?id=19399467

  167. https://news.ycombinator.com/item?id=21456403

  168. Some Pretty Impressive Machine-Learning Generated Poetry Courtesy of GPT-2

  169. Hark! from Those Shadowy Depths Thy Voice / Mournfully Echoes

  170. On the Significance of Gwern’s Poem Generator

  171. OpenAI’s New Language AI Is Available to Try Yourself

  172. Generates Rhyming Poetry Using Huggingface GPT-2

  173. A Hundred Visions and Revisions

  174. RoBERTa: A Robustly Optimized BERT Pretraining Approach

  175. Simonepri/lm-Scorer: 📃Language Model Based Sentences Scoring Library

  176. How to Fine-Tune GPT-2 on Podcast Transcripts

  177. These WWDC Boxed Lunches Aren't Real

  178. https://web.archive.org/web/20220526054159/http://bkkaggle.github.io/blog/algpt2/2020/06/22/ALGPT2-part-1

  179. https://web.archive.org/web/20210131134147/https://bkkaggle.github.io/blog/algpt2/2020/07/17/ALGPT2-part-2.html

  180. The Average Fourth Grader Is a Better Poet Than…

  181. The First Sally (A), Or, Trurl’s Electronic Bard

  182. Seduced, Shaggy Samson Snored: The Fictional Machine That Generated Poems, and the Real People Who Had to Translate Them

  183. Ramon Lull’s Thinking Machine

  184. How to Build a State-Of-The-Art Conversational AI With Transfer Learning by Thomas Wolf

  185. Computer Generated Foundation

  186. https://www.reddit.com/r/SubSimulatorGPT2/comments/btfhks/what_is_rsubsimulatorgpt2/

  187. A Chinese Room Writes a Sequel to Blindsight

  188. How To Make Custom AI-Generated Text With GPT-2

  189. Minimaxir/gpt-2-Keyword-Generation: Method to Encode Text for GPT-2 to Generate Text Based on Provided Keywords

  190. Evaluation Metrics for Language Modeling

  191. Lessons Learned from Building an AI Writing App

  192. Excavate

  193. Introducing Aspects of Creativity in Automatic Poetry Generation

  194. Smart Vet: Autocompleting Sentences in Veterinary Medical Records

  195. Deepfake Bot Submissions to Federal Public Comment Websites Cannot Be Distinguished from Human Submissions

  196. This Word Does Not Exist [Github]

  197. https://towardsdatascience.com/how-to-fine-tune-gpt-2-so-you-can-generate-long-form-creative-writing-7a5ae1314a61

  198. This AI Poet Mastered Rhythm, Rhyme, and Natural Language to Write Like Shakespeare

  199. Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme

  200. Progressive Generation of Long Text

  201. AdapterHub - 625 Adapters for 71 Text Tasks and 97 Languages

  202. AdapterHub: A Framework for Adapting Transformers

  203. Collaborative Storytelling with Large-scale Neural Language Models

  204. Controllable Neural Text Generation

  205. This Article Provides an Overview of Recent Methods to Fine-Tune Large Pre-Trained Language Models

  206. Making Pre-trained Language Models Better Few-shot Learners

  207. Prefix-Tuning: Optimizing Continuous Prompts for Generation

  208. GPT Understands, Too

  209. The Power of Scale for Parameter-Efficient Prompt Tuning

  210. Entailment as Few-Shot Learner

  211. Controllable Generation from Pre-trained Language Models via Inverse Prompting

  212. https://gaotianyu.xyz/prompting/

  213. Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models

  214. DART: Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

  215. PPT: Pre-trained Prompt Tuning for Few-shot Learning

  216. Towards a Unified View of Parameter-Efficient Transfer Learning

  217. 2019-12-18-skylion-archiveofourown-fanfics-textscrape.tar.xz

  218. https://archive.org/details/@entropy11235813

  219. 2020-01-14-gpt2-1558m-archiveofourownao3.tar.xz

  220. AI Dungeon 2

  221. 2020-02-03-gpt21.5b-archiveofourownao3-model-510427-samples-topp090.txt

  222. https://x.com/astraliteheart

  223. Expanding the Frontiers of AI Creativity

  224. 😇 A PyTorch Implementation of the DeepMoji Model: State-Of-The-Art Deep Learning Model for Analyzing Sentiment, Emotion, Sarcasm, Etc.

  225. This Pony Does Not Exist

  226. Yzhou359/MakeItTalk

  227. 2021-05-05-astraliteheart-purplesmartai-mylittleponygpt215b-twilightsparkledialogue.png

  228. 2021-05-05-astraliteheart-purplesmartai-mylittleponygpt215b-twilightsparkledialogue-torchmojiemotionalvoicecontrol.jpg

  229. 2021-05-05-astraliteheart-purplesmartai-mylittleponygpt215b-twilightsparkledialogue-voicedialogue.png

  230. 2021-05-05-astraliteheart-purplesmartai-mylittleponygpt215b-gpuloadgraph.png

  231. End to End Agent Conversation Demo

  232. My Little Pony: Friendship Is Magic Fanfiction

  233. Library Genesis

  234. Best Science Fiction (3506 Books)

  235. The Best Fantasy Books

  236. https://x.com/me_irl/status/1217818112957014022

  237. 2020-02-03-gpt21.5b-videogamewalkthrough-model-174925-samples-topp090.txt

  238. OpenAI Text Generator GPT-2 Creates Video Game Walkthrough for ‘Most Tedious Game in History’

  239. https://x.com/theshawwn/status/1212156603140648961

  240. 2025-03-08-2019-12-18-shawnpresser-gpt-2-117m-rdota2.tar.xz