Bibliography (280):

  1. RNN Metadata for Mimicking Author Style

  2. Poems

  3. GPT-3 Creative Fiction

  4. twdne#text

  5. A Very Unlikely Chess Game

  6. Update: Upgrading to 1.5B GPT-2, and adding 22 new subreddit-bots

  7. GPT-3: Language Models are Few-Shot Learners

  8. Better Language Models and Their Implications

  9. https://research.google/blog/transformer-a-novel-neural-network-architecture-for-language-understanding/

  10. The Illustrated Transformer

  11. The Illustrated GPT-2 (Visualizing Transformer Language Models)

  12. The Transformer—Attention Is All You Need

  13. https://blog.floydhub.com/the-transformer-in-pytorch/

  14. https://e2eml.school/transformers.html

  15. Attention Is All You Need

  16. The Annotated Transformer

  17. Self-Attention with Relative Position Representations

  18. Character-Level Language Modeling with Deeper Self-Attention

  19. Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

  20. Transformer-XL—Combining Transformers and RNNs Into a State-Of-The-Art Language Model

  21. Understanding BERT Transformer: Attention Isn’t All You Need

  22. Transformers are a very exciting family of machine learning architectures

  23. https://amaarora.github.io/2020/02/18/annotatedGPT2.html

  24. The Transformer Family

  25. The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives

  26. Karpathy/minGPT: A Minimal PyTorch Re-Implementation of the OpenAI GPT (Generative Pretrained Transformer) Training

  27. RASP: Thinking Like Transformers

  28. ‘self-attention’ directory

  29. ‘MLP NN’ directory

  30. GPT-1: Improving Language Understanding with Unsupervised Learning

  31. Language Modeling State-of-the-art leaderboards

  32. Language Models are Unsupervised Multitask Learners

  33. Humans Who Are Not Concentrating Are Not General Intelligences

  34. Gpt-2-Samples

  35. LM Explorer (alpha)

  36. GPT-2: 6-Month Follow-Up

  37. GPT-2: 1.5B Release

  38. OpenGPT-2: We Replicated GPT-2-1.5b Because You Can Too

  39. https://colab.research.google.com/drive/1BXry0kcm869-RVHHiY6NZmY9uBzbkf1Q

  40. GROVER: Defending Against Neural Fake News

  41. XLNet: Generalized Autoregressive Pretraining for Language Understanding

  42. MegatronLM: Training Billion+ Parameter Language Models Using GPU Model Parallelism

  43. T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

  44. https://colab.research.google.com/drive/1-ROO7L09EupLFLQM-TWgDHa5-FIOdLLh

  45. DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation

  46. This Waifu Does Not Exist

  47. Talking to Myself or How I Trained GPT-2-1.5b for Rubber Ducking Using My Facebook Chat Data: Using Only Google Colab

  48. https://www.thisstorydoesnotexist.com/

  49. https://web.archive.org/web/20200110223938/https://stackroboflow.com/

  50. Howl

  51. An Eternal Howl

  52. https://www.reddit.com/r/slatestarcodex/comments/as8ke7/an_eternal_howl/

  53. GPT-2 Howl

  54. GPT-2 Writes a Shelley Poem

  55. GPT-2 As Step Toward General Intelligence

  56. First line of famous poems continued by GPT-2

  57. gpt-2-poetry

  58. Ask GPT-2

  59. Ask GPT-2

  60. FridAI: ‘Water, water, everywhere’, as read by Artificial Intelligence

  61. The Poetry Machine

  62. GPT-based Generation for Classical Chinese Poetry

  63. Three More GPT-2 Poems

  64. https://www.reddit.com/r/MachineLearning/comments/coc09l/p_these_lyrics_do_not_exist/

  65. Testing The Limits of GROVER The Neural Fake News Detector. Can It Write Fiction? Can It Write Riddles?

  66. https://www.reddit.com/r/SubSimulatorGPT2Meta/comments/ccvspt/update_experimenting_with_generating_hybrid/

  67. CTRL: A Conditional Transformer Language Model For Controllable Generation

  68. Conditional Transformer Language Model for Controllable Generation

  69. https://papergains.co/pdfs/Transformer_Poetry-978-1-7341647-0-1.pdf#page=3

  70. 345M-GPT-2 After James Wright: Can AI Generate Convincing Contemporary Poetry?

  71. GPT-2 AI Poetry Generation: Writing like Donne

  72. Writing the Next American Hit: Using GPT-2 to Explore the Possibility of Creating Successful AI-Generated Song Lyrics

  73. How to Train It

  74. Nshepperd/gpt-2: Code for the Paper "Language Models Are Unsupervised Multitask Learners"

  75. ConnorJL/GPT2: An Implementation of Training for GPT-2, Supports TPUs

  76. Replicating GPT-2-1.5B

  77. Addendum: Evaluation of My Model

  78. A Corpus of Poetry from Project Gutenberg

  79. Dataset Search

  80. Poems from Poetryfoundation.org

  81. A Small Module Meant for Use in Text Generators That Lets You Filter Strings for Bad Words

  82. Success

  83. Unhandled Arguments Checked After Execution, Not Before

  84. The Curious Case of Neural Text Degeneration

  85. https://www.trentonbricken.com/Tail-Free-Sampling/

  86. The Unreasonable Effectiveness of Recurrent Neural Networks

  87. 2019-03-06-gwern-gpt2-poetry-projectgutenberg-network-519407.tar.xz

  88. 2019-03-06-gpt2-poetry-1000samples.txt

  89. https://x.com/theshawwn

  90. Kaggle: Your Home for Data Science

  91. rnn-metadata#inline-metadata-trick

  92. 2019-10-18-Poetryfoundation-Formatted.txt

  93. 2019-10-17-117m-poetry-cleanprojectgutenberg-samples.txt

  94. 2019-10-19-117m-poetryfoundation-samples.txt

  95. 2019-10-19-gwern-gpt2-poetry-pgclean-117m.tar.xz

  96. 2019-03-06-gwern-gpt2-poetry-prefix-projectgutenberg-network-224474.tar.xz

  97. 2019-03-06-gpt2-poetry-prefix-1000samples.txt

  98. To a Skylark by Percy Bysshe Shelley

  99. Gwern’s AI-Generated Poetry

  100. Overview for Starspawn0

  101. Semantic projection: recovering human knowledge of multiple, distinct object features from word embeddings

  102. Distributional Vectors Encode Referential Attributes

  103. Dynamic Word Embeddings for Evolving Semantic Discovery

  104. Verb Physics: Relative Physical Knowledge of Actions and Objects

  105. Language Models Represent Space and Time

  106. Language Encodes Geographical Information

  107. Grounding the Ungrounded: Estimating Locations of Unknown Place Names from Linguistic Associations and Grounded Representations

  108. Books by Pope, Alexander (Sorted by Popularity)

  109. 2019-03-16-gpt2-poetry-prefix-jabberwocky-100samples.txt

  110. The Jingle Book by Carolyn Wells

  111. https://openai.com/index/better-language-models/#update

  112. UniLM: Unified Language Model Pre-training for Natural Language Understanding and Generation

  113. Fitting Larger Networks into Memory: TLDR; We Release the Python/Tensorflow Package Openai/gradient-Checkpointing, That Lets You Fit 10× Larger Neural Nets into Memory at the Cost of an Additional 20% Computation Time

  114. Generating Long Sequences with Sparse Transformers

  115. Training Deep Nets with Sublinear Memory Cost

  116. Memory-Efficient Backpropagation through Time

  117. MuseNet: a deep neural network that can generate 4-minute musical compositions with 10 different instruments, and can combine styles from country to Mozart to the Beatles

  118. Why Momentum Really Works

  119. Cyclical Learning Rates for Training Neural Networks

  120. SGDR: Stochastic Gradient Descent with Warm Restarts

  121. Averaging Weights Leads to Wider Optima and Better Generalization

  122. 2019-05-13-gwern-gpt2-poetry-345m.tar.xz

  123. 2019-05-13-gpt2-poetry-345m-5000samples.txt

  124. Reassuring

  125. This Is a Python Script As Described in XKCD #1263: ‘Reassuring’. It Generates Thousands of Reassuring Parables about Things Humans Are Better Than Computers at Every Second.

  126. 2019-05-24-gpt2-poetry-yeatssecondcoming-500completions.txt

  127. https://www.awanderingmind.blog/posts/2024-01-14-tao-te-ching-by-an-llm.html

  128. https://x.com/HW

  129. https://web.archive.org/web/20200209040154/https://decaut.org/situ/index.php/ttc-compilation/

  130. 2019-07-19-taotehching-ch1-1ksamples.txt

  131. 2019-07-21-gwern-gpt2-345m-taotehching-all.tar.xz

  132. 2019-07-21-taotehching-all-1ksamples.txt

  133. 2019-07-22-gpt2-345m-taotehching-all-ch181.tar.xz

  134. Release Strategies and the Social Impacts of Language Models

  135. Swarm Training: We Demonstrate a New Technique to Train ML Models Using Dozens of Independent TPUs.

  136. 2020-02-09-gpt21.5b-poetry-model-500522-1msamples.txt

  137. 2019-12-13-gwern-gpt-2-1.5b-poetry-model-500522.tar.xz

  138. Shawwn/gpt-2: Code for the Paper "Language Models Are Unsupervised Multitask Learners"

  139. Pricing

  140. Danbooru2019 Is a Large-Scale Anime Image Database With 3.69m+ Images Annotated With 108m+ Tags; It Can Be Useful for Machine Learning Purposes such as Image Recognition and Generation.

  141. The Google SRE Handbook: Chapter 4—Service Level Objectives

  142. HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent

  143. TensorFlow Research Cloud (TRC): Accelerate your cutting-edge machine learning research with free Cloud TPUs

  144. GPT-1: Improving Language Understanding by Generative Pre-Training § Model specifications

  145. ftfy: fixes text for you

  146. Adafactor: Adaptive Learning Rates with Sublinear Memory Cost

  147. U B U W E B :: Racter

  148. Language Models Are Unsupervised Multitask Learners § Experiments

  149. 2019-12-13-gpt21.5b-poetry-samples-topp090.txt

  150. 2019-12-15-gpt21.5b-poetry-samples-topp090.txt

  151. 2019-12-16-gpt21.5b-poetry-samples-topp080.txt

  152. 2019-12-18-gpt21.5b-poetry-samples-topp080.txt

  153. Greg Brockman: OpenAI and AGI

  154. Figure F.1: Four Uncurated Completions from a Context Suggesting the Model Compose a Poem in the Style of Wallace Stevens With the Title ‘Shadows on the Way’

  155. https://github.com/karpathy/char-rnn/issues/138

  156. https://news.ycombinator.com/item?id=21335120

  157. true_poetry: Poetry generator by GPT-2 with meter and rhyme constraints

  158. MuZero: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

  159. Neural Text Generation with Unlikelihood Training

  160. Do Massively Pretrained Language Models Make Better Storytellers?

  161. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

  162. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

  163. Why Tool AIs Want to Be Agent AIs

  164. Deep reinforcement learning from human preferences

  165. CAN: Creative Adversarial Networks, Generating "Art" by Learning About Styles and Deviating from Style Norms

  166. AlphaStar: Mastering the Real-Time Strategy Game StarCraft II

  167. https://www.reddit.com/r/slatestarcodex/comments/b1b47h/gwerns_aigenerated_poetry/

  168. https://news.ycombinator.com/item?id=19399467

  169. https://news.ycombinator.com/item?id=21456403

  170. Some Pretty Impressive Machine-Learning Generated Poetry Courtesy of GPT-2

  171. Hark! from Those Shadowy Depths Thy Voice / Mournfully Echoes

  172. On the Significance of Gwern’s Poem Generator

  173. OpenAI’s New Language AI Is Available to Try Yourself

  174. Generates Rhyming Poetry Using Huggingface GPT-2

  175. A Hundred Visions and Revisions

  176. RoBERTa: A Robustly Optimized BERT Pretraining Approach

  177. Simonepri/lm-Scorer: 📃Language Model Based Sentences Scoring Library

  178. How to Fine-Tune GPT-2 on Podcast Transcripts

  179. These WWDC Boxed Lunches Aren't Real

  180. https://web.archive.org/web/20220526054159/http://bkkaggle.github.io/blog/algpt2/2020/06/22/ALGPT2-part-1

  181. https://web.archive.org/web/20210131134147/https://bkkaggle.github.io/blog/algpt2/2020/07/17/ALGPT2-part-2.html

  182. The Average Fourth Grader Is a Better Poet Than…

  183. The First Sally (A), Or, Trurl’s Electronic Bard

  184. Seduced, Shaggy Samson Snored: The Fictional Machine That Generated Poems, and the Real People Who Had to Translate Them

  185. Ramon Lull’s Thinking Machine

  186. How to Build a State-Of-The-Art Conversational AI With Transfer Learning by Thomas Wolf

  187. Computer Generated Foundation

  188. https://www.reddit.com/r/SubSimulatorGPT2/comments/btfhks/what_is_rsubsimulatorgpt2/

  189. A Chinese Room Writes a Sequel to Blindsight

  190. How To Make Custom AI-Generated Text With GPT-2

  191. Minimaxir/gpt-2-Keyword-Generation: Method to Encode Text for GPT-2 to Generate Text Based on Provided Keywords

  192. Evaluation Metrics for Language Modeling

  193. Lessons Learned from Building an AI Writing App

  194. Excavate

  195. Introducing Aspects of Creativity in Automatic Poetry Generation

  196. Smart Vet: Autocompleting Sentences in Veterinary Medical Records

  197. Deepfake Bot Submissions to Federal Public Comment Websites Cannot Be Distinguished from Human Submissions

  198. This Word Does Not Exist [Github]

  199. https://towardsdatascience.com/how-to-fine-tune-gpt-2-so-you-can-generate-long-form-creative-writing-7a5ae1314a61

  200. This AI Poet Mastered Rhythm, Rhyme, and Natural Language to Write Like Shakespeare

  201. Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme

  202. Progressive Generation of Long Text

  203. AdapterHub - 625 Adapters for 71 Text Tasks and 97 Languages

  204. AdapterHub: A Framework for Adapting Transformers

  205. Collaborative Storytelling with Large-scale Neural Language Models

  206. Controllable Neural Text Generation

  207. This Article Provides an Overview of Recent Methods to Fine-Tune Large Pre-Trained Language Models

  208. Making Pre-trained Language Models Better Few-shot Learners

  209. Prefix-Tuning: Optimizing Continuous Prompts for Generation

  210. GPT Understands, Too

  211. The Power of Scale for Parameter-Efficient Prompt Tuning

  212. Entailment as Few-Shot Learner

  213. Controllable Generation from Pre-trained Language Models via Inverse Prompting

  214. https://gaotianyu.xyz/prompting/

  215. Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models

  216. DART: Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

  217. PPT: Pre-trained Prompt Tuning for Few-shot Learning

  218. Towards a Unified View of Parameter-Efficient Transfer Learning

  219. 2019-12-18-skylion-archiveofourown-fanfics-textscrape.tar.xz

  220. https://archive.org/details/@entropy11235813

  221. 2020-01-14-gpt2-1558m-archiveofourownao3.tar.xz

  222. AI Dungeon 2

  223. 2020-02-03-gpt21.5b-archiveofourownao3-model-510427-samples-topp090.txt

  224. https://x.com/astraliteheart

  225. Expanding the Frontiers of AI Creativity

  226. 😇A PyTorch Implementation of the DeepMoji Model: State-Of-The-Art Deep Learning Model for Analyzing Sentiment, Emotion, Sarcasm Etc

  227. This Pony Does Not Exist

  228. Yzhou359/MakeItTalk

  229. 2021-05-05-astraliteheart-purplesmartai-mylittleponygpt215b-twilightsparkledialogue.png

  230. 2021-05-05-astraliteheart-purplesmartai-mylittleponygpt215b-twilightsparkledialogue-torchmojiemotionalvoicecontrol.jpg

  231. 2021-05-05-astraliteheart-purplesmartai-mylittleponygpt215b-twilightsparkledialogue-voicedialogue.png

  232. 2021-05-05-astraliteheart-purplesmartai-mylittleponygpt215b-gpuloadgraph.png

  233. End to End Agent Conversation Demo

  234. My Little Pony: Friendship Is Magic Fanfiction

  235. Library Genesis

  236. Best Science Fiction (3506 Books)

  237. The Best Fantasy Books

  238. 2020-08-20-astraliteheart-gpt215b-sffuberset.tar.xz

  239. 2021-03-14-astraliteheart-tts-mlp.tar.xz

  240. https://x.com/me_irl/status/1217818112957014022

  241. 2020-02-03-gpt21.5b-videogamewalkthrough-model-174925-samples-topp090.txt

  242. OpenAI Text Generator GPT-2 Creates Video Game Walkthrough for ‘Most Tedious Game in History’

  243. 2020-01-16-gpt-2-1558m-shawnpresser-videogamewalkthrough.tar.xz

  244. https://x.com/theshawwn/status/1212156603140648961

  245. 2025-03-08-2019-12-18-shawnpresser-gpt-2-117m-rdota2.tar.xz