See Also

Links

- “Int-4 LLaMa Is Not Enough—Int-3 and Beyond: More Compression, Easier to Build Apps on LLMs That Run Locally”, Nolano.org 2023 (2023-03-13)
- “Beyond the Pass Mark: the Accuracy of ChatGPT and Bing in the National Medical Licensure Examination in Japan”, 2023 (2023-03-10)
- “Rewarding Chatbots for Real-World Engagement With Millions of Users”, Et Al 2023 (2023-03-10)
- “BiLD: Big Little Transformer Decoder”, Et Al 2023 (2023-02-15)
- “MarioGPT: Open-Ended Text2Level Generation through Large Language Models”, Et Al 2023 (2023-02-12)
- “Is ChatGPT a General-Purpose Natural Language Processing Task Solver?”, Et Al 2023 (2023-02-08)
- “Use GPT-3 Incorrectly: Reduce Costs 40× and Increase Speed by 5×”, 2023 (2023-02-06)
- “OpenAI’s Sam Altman Talks ChatGPT And How Artificial General Intelligence Can ‘Break Capitalism’”, 2023 (2023-02-03)
- “Co-Writing With Opinionated Language Models Affects Users’ Views”, Et Al 2023 (2023-02-01)
- “In-Context Retrieval-Augmented Language Models”, Et Al 2023 (2023-01-31)
- “Crawling the Internal Knowledge-Base of Language Models”, Et Al 2023 (2023-01-30)
- “Big Tech Was Moving Cautiously on AI. Then Came ChatGPT. Google, Facebook and Microsoft Helped Build the Scaffolding of AI. Smaller Companies Are Taking It to the Masses, Forcing Big Tech to React.”, Et Al 2023 (2023-01-27)
- “The inside Story of ChatGPT: How OpenAI Founder Sam Altman Built the World’s Hottest Technology With Billions from Microsoft”, 2023 (2023-01-25)
- “Rock Guitar Tablature Generation via Natural Language Processing”, Casco-Rodriguez 2023 (2023-01-12)
- “GPT-3 As Knowledge Worker: A Zero-Shot Evaluation of (AI)CPA Capabilities”, Et Al 2023 (2023-01-11)
- “InPars-Light: Cost-Effective Unsupervised Training of Efficient Rankers”, Et Al 2023 (2023-01-08)
- “GPT-3 Takes the Bar Exam”, Bommarito II & Katz 2022 (2022-12-29)
- “A New Chat Bot Is A ‘Code Red’ For Google’s Search Business: A New Wave of Chat Bots like ChatGPT Use Artificial Intelligence That Could Reinvent or Even Replace the Traditional Internet Search Engine”, 2022 (2022-12-21)
- “Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent As Meta-Optimizers”, Et Al 2022 (2022-12-20)
- “Precise Zero-Shot Dense Retrieval without Relevance Labels”, Et Al 2022 (2022-12-20)
- “Emergent Analogical Reasoning in Large Language Models”, Et Al 2022 (2022-12-19)
- “Harvey, Which Uses AI to Answer Legal Questions, Lands Cash from OpenAI”, 2022 (2022-11-23)
- “Interpreting Neural Networks through the Polytope Lens”, Et Al 2022 (2022-11-22)
- “SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models”, Et Al 2022 (2022-11-18)
- “InstructPix2Pix: Learning to Follow Image Editing Instructions”, Et Al 2022 (2022-11-17)
- “Galactica: A Large Language Model for Science”, Et Al 2022 (2022-11-16)
- “LMentry: A Language Model Benchmark of Elementary Language Tasks”, Et Al 2022 (2022-11-03)
- “GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers”, Et Al 2022 (2022-10-31)
- “Contrastive Search Is What You Need For Neural Text Generation”, 2022 (2022-10-25)
- “Evaluating Parameter Efficient Learning for Generation”, Et Al 2022 (2022-10-25)
- “BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining”, Et Al 2022 (2022-10-19)
- “Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models”, Et Al 2022 (2022-10-18)
- “MTEB: Massive Text Embedding Benchmark”, Et Al 2022 (2022-10-13)
- “Foundation Transformers”, Et Al 2022 (2022-10-12)
- “Ask Me Anything (AMA): A Simple Strategy for Prompting Language Models”, Et Al 2022 (2022-10-05)
- “Deep Language Algorithms Predict Semantic Comprehension from Brain Activity”, Et Al 2022 (2022-09-29)
- “Semantic Reconstruction of Continuous Language from Non-invasive Brain Recordings”, Et Al 2022 (2022-09-29)
- “Generate rather than Retrieve (GenRead): Large Language Models Are Strong Context Generators”, Et Al 2022 (2022-09-21)
- “Out of One, Many: Using Language Models to Simulate Human Samples”, Et Al 2022 (2022-09-14)
- “Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest”, Et Al 2022 (2022-09-13)
- “FP8 Formats for Deep Learning”, Et Al 2022 (2022-09-12)
- “What Does a Platypus Look Like? Generating Customized Prompts for Zero-shot Image Classification (CuPL)”, Et Al 2022 (2022-09-07)
- “Petals: Collaborative Inference and Fine-tuning of Large Models”, Et Al 2022 (2022-09-02)
- “Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned”, Et Al 2022 (2022-08-25)
- “Using Large Language Models to Simulate Multiple Humans”, Et Al 2022 (2022-08-18)
- “Effidit: Your AI Writing Assistant”, Et Al 2022 (2022-08-03)
- “What Can Transformers Learn In-Context? A Case Study of Simple Function Classes”, Et Al 2022 (2022-08-01)
- “Correspondence between the Layered Structure of Deep Language Models and Temporal Structure of Natural Language Processing in the Human Brain”, Et Al 2022 (2022-07-25)
- “Language Models Show Human-like Content Effects on Reasoning”, Et Al 2022 (2022-07-14)
- “LM-Nav: Robotic Navigation With Large Pre-Trained Models of Language, Vision, and Action”, Et Al 2022 (2022-07-10)
- “GODEL: Large-Scale Pre-Training for Goal-Directed Dialog”, Et Al 2022 (2022-06-22)
- “Can Foundation Models Talk Causality?”, Et Al 2022 (2022-06-14)
- “NOAH: Neural Prompt Search”, Et Al 2022 (2022-06-09)
- “ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers”, Et Al 2022 (2022-06-04)
- “FlashAttention: Fast and Memory-Efficient Exact Attention With IO-Awareness”, Et Al 2022 (2022-05-27)
- “Quark: Controllable Text Generation With Reinforced Unlearning”, Et Al 2022 (2022-05-26)
- “NaturalProver: Grounded Mathematical Proof Generation With Language Models”, Et Al 2022 (2022-05-25)
- “Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models”, Et Al 2022 (2022-05-22)
- “RankGen: Improving Text Generation With Large Ranking Models”, Et Al 2022 (2022-05-19)
- “OPT: Open Pre-trained Transformer Language Models”, Et Al 2022 (2022-05-02)
- “Opal: Multimodal Image Generation for News Illustration”, Et Al 2022 (2022-04-19)
- “What Language Model to Train If You Have One Million GPU Hours?”, Et Al 2022 (2022-04-11)
- “WAVPROMPT: Towards Few-Shot Spoken Language Understanding With Frozen Language Models”, Et Al 2022 (2022-03-29)
- “Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space”, Et Al 2022 (2022-03-28)
- “Time Control: Language Modeling via Stochastic Processes”, Et Al 2022 (2022-03-21)
- “Shared Computational Principles for Language Processing in Humans and Deep Language Models”, Et Al 2022 (2022-03-07)
- “InstructGPT: Training Language Models to Follow Instructions With Human Feedback”, Et Al 2022 (2022-03-04)
- “Vector-quantized Image Modeling With Improved VQGAN”, Et Al 2022 (2022-03-01)
- “Quantifying and Alleviating Political Bias in Language Models”, Et Al 2022 (2022-03-01)
- “Controllable Natural Language Generation With Contrastive Prefixes”, Et Al 2022 (2022-02-27)
- “Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?”, Et Al 2022 (2022-02-25)
- “Brains and Algorithms Partially Converge in Natural Language Processing”, 2022 (2022-02-16)
- “A Contrastive Framework for Neural Text Generation”, Et Al 2022 (2022-02-13)
- “ROME: Locating and Editing Factual Associations in GPT”, Et Al 2022 (2022-02-10)
- “InPars: Data Augmentation for Information Retrieval Using Large Language Models”, Et Al 2022 (2022-02-10)
- “AdaPrompt: Adaptive Model Training for Prompt-based NLP”, Et Al 2022 (2022-02-10)
- “Cedille: A Large Autoregressive French Language Model”, 2022 (2022-02-07)
- “Data Scaling Laws in NMT: The Effect of Noise and Architecture”, Et Al 2022 (2022-02-04)
- “LID: Pre-Trained Language Models for Interactive Decision-Making”, Et Al 2022 (2022-02-03)
- “PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts”, Et Al 2022 (2022-02-02)
- “Typical Decoding for Natural Language Generation”, Et Al 2022 (2022-02-01)
- “Contracts in the Age of Smart Readers”, 2022 (2022-02)
- “Can Wikipedia Help Offline Reinforcement Learning?”, Et Al 2022 (2022-01-28)
- “Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model”, Et Al 2022 (2022-01-28)
- “Language Models As Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents”, Et Al 2022 (2022-01-18)
- “WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation”, Et Al 2022 (2022-01-16)
- “Memory-assisted Prompt Editing to Improve GPT-3 After Deployment”, Et Al 2022 (2022-01-16)
- “A Survey of Controllable Text Generation Using Transformer-based Pre-trained Language Models”, Et Al 2022 (2022-01-14)
- “CommonsenseQA 2.0: Exposing the Limits of AI through Gamification”, Et Al 2022 (2022-01-14)
- “The Defeat of the Winograd Schema Challenge”, Et Al 2022 (2022-01-07)
- “Limits of Using Artificial Intelligence and GPT-3 in Patent Prosecution”, Et Al 2022 (2022-01)
- “Amortized Noisy Channel Neural Machine Translation”, Et Al 2021 (2021-12-16)
- “Learning to Prompt for Continual Learning”, Et Al 2021 (2021-12-16)
- “Learning To Retrieve Prompts for In-Context Learning”, Et Al 2021 (2021-12-16)
- “PROMPT WAYWARDNESS: The Curious Case of Discretized Interpretation of Continuous Prompts”, Et Al 2021 (2021-12-15)
- “Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases”, Et Al 2021 (2021-12-15)
- “LMTurk: Few-Shot Learners As Crowdsourcing Workers”, Et Al 2021 (2021-12-14)
- “Improving Language Models by Retrieving from Trillions of Tokens”, Et Al 2021 (2021-12-08)
- “A General Language Assistant As a Laboratory for Alignment”, Et Al 2021 (2021-12-01)
- “Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic”, Et Al 2021 (2021-11-29)
- “Long-range and Hierarchical Language Predictions in Brains and Algorithms”, Et Al 2021 (2021-11-28)
- “True Few-Shot Learning With Prompts—A Real-World Perspective”, Schick & Schütze 2021 (2021-11-26)
- “Few-shot Named Entity Recognition With Cloze Questions”, Et Al 2021 (2021-11-24)
- “Mapping Language Models to Grounded Conceptual Spaces”, 2021 (2021-11-18)
- “ClipCap: CLIP Prefix for Image Captioning”, Et Al 2021 (2021-11-18)
- “M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining”, Et Al 2021 (2021-11-17)
- “Evaluating Distributional Distortion in Neural Language Modeling”, 2021 (2021-11-16)
- “On Transferability of Prompt Tuning for Natural Language Understanding”, Et Al 2021 (2021-11-12)
- “Attention Approximates Sparse Distributed Memory”, 2021 (2021-11-10)
- “What Can a Generative Language Model Answer About a Passage?”, Summers-Stay Et Al 2021 (2021-11-10)
- “CLUES: Few-Shot Learning Evaluation in Natural Language Understanding”, Et Al 2021 (2021-11-04)
- “An Explanation of In-context Learning As Implicit Bayesian Inference”, Et Al 2021 (2021-11-03)
- “Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey”, Et Al 2021 (2021-11-01)
- “Fast Model Editing at Scale”, Et Al 2021 (2021-10-21)
- “Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning”, Et Al 2021 (2021-10-10)
- “A Few More Examples May Be Worth Billions of Parameters”, Et Al 2021 (2021-10-08)
- “Towards a Unified View of Parameter-Efficient Transfer Learning”, Et Al 2021 (2021-10-08)
- “Scaling Laws for Neural Machine Translation”, Et Al 2021 (2021-09-16)
- “Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color”, Et Al 2021 (2021-09-13)
- “What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers”, Et Al 2021 (2021-09-10)
- “Medically Aware GPT-3 As a Data Generator for Medical Dialogue Summarization”, Et Al 2021 (2021-09-09)
- “TruthfulQA: Measuring How Models Mimic Human Falsehoods”, Et Al 2021 (2021-09-08)
- “General-Purpose Question-Answering With Macaw”, 2021 (2021-09-06)
- “An Empirical Exploration in Quality Filtering of Text Data”, 2021 (2021-09-02)
- “Want To Reduce Labeling Cost? GPT-3 Can Help”, Et Al 2021 (2021-08-30)
- “Scarecrow: A Framework for Scrutinizing Machine Text”, Et Al 2021 (2021-07-02)
- “Multimodal Few-Shot Learning With Frozen Language Models”, Et Al 2021 (2021-06-25)
- “Cutting Down on Prompts and Parameters: Simple Few-Shot Learning With Language Models”, Logan IV Et Al 2021 (2021-06-24)
- “LoRA: Low-Rank Adaptation of Large Language Models”, Et Al 2021 (2021-06-17)
- “Let the Algorithm Speak: How to Use Neural Networks for Automatic Item Generation in Psychological Scale Development”, Et Al 2021 (2021-06-15)
- “RASP: Thinking Like Transformers”, Et Al 2021 (2021-06-13)
- “GPT-J-6B: 6B JAX-Based Transformer”, EleutherAI 2021 (2021-06-08)
- “LHOPT: A Generalizable Approach to Learning Optimizers”, Et Al 2021 (2021-06-02)
- “Anthropic Raises $124 Million to Build More Reliable, General AI Systems”, 2021 (2021-05-28)
- “ByT5: Towards a Token-free Future With Pre-trained Byte-to-byte Models”, Et Al 2021 (2021-05-28)
- “A Hierarchy of Linguistic Predictions during Natural Language Comprehension”, Et Al 2021 (2021-05-27)
- “Naver Unveils First ‘Hyperscale’ AI Platform”, 2021 (2021-05-25)
- “Machine Learning Scaling”, 2021 (2021-04-24)
- “Scaling Laws for Language Transfer Learning”, 2021 (2021-04-11)
- “GPT Understands, Too”, Et Al 2021 (2021-03-18)
- “How Many Data Points Is a Prompt Worth?”, 2021 (2021-03-15)
- “Pretrained Transformers As Universal Computation Engines”, Et Al 2021 (2021-03-09)
- “Language Models Have a Moral Dimension”, Et Al 2021 (2021-03-08)
- “Learning Chess Blindfolded: Evaluating Language Models on State Tracking”, Et Al 2021 (2021-02-26)
- “Investigating the Limitations of the Transformers With Simple Arithmetic Tasks”, Et Al 2021 (2021-02-25)
- “Proof Artifact Co-training for Theorem Proving With Language Models”, Et Al 2021 (2021-02-11)
- “MAUVE: Measuring the Gap Between Neural Text and Human Text Using Divergence Frontiers”, Et Al 2021 (2021-02-02)
- “Scaling Laws for Transfer”, Et Al 2021 (2021-02-02)
- “Apparently ‘What Ho’ Is a Corruption Of…”, 2021 (2021-01-14)
- “Prefix-Tuning: Optimizing Continuous Prompts for Generation”, 2021
- “The Pile: An 800GB Dataset of Diverse Text for Language Modeling”, Et Al 2021
- “Process for Adapting Language Models to Society (PALMS) With Values-Targeted Datasets”, 2021
- “Bot-Adversarial Dialogue for Safe Conversational Agents”, Et Al 2021
- “Making Pre-trained Language Models Better Few-shot Learners”, Et Al 2020 (2020-12-31)
- “Extracting Training Data from Large Language Models”, Et Al 2020 (2020-12-14)
- “Thinking Ahead: Prediction in Context As a Keystone of Language in Humans and Machines”, Et Al 2020 (2020-12-03)
- “CPM: A Large-scale Generative Chinese Pre-trained Language Model”, Et Al 2020 (2020-12-01)
- “Scaling Laws for Autoregressive Generative Modeling”, Et Al 2020 (2020-10-28)
- “L2L: Training Large Neural Networks With Constant Memory Using a New Execution Algorithm”, Et Al 2020 (2020-10-16)
- “Interacting With GPT-2 to Generate Controlled and Believable Musical Sequences in ABC Notation”, Geerlings & Meroño-Peñuela 2020 (2020-10-16)
- “Summarize, Outline, and Elaborate: Long-Text Generation via Hierarchical Supervision from Extractive Summaries”, Et Al 2020 (2020-10-14)
- “The Neural Architecture of Language: Integrative Reverse-engineering Converges on a Model for Predictive Processing”, Et Al 2020 (2020-10-09)
- “RoFT: A Tool for Evaluating Human Detection of Machine-Generated Text”, Et Al 2020 (2020-10-06)
- “GPT-3: Its Nature, Scope, Limits, and Consequences”, 2020 (2020-10-01)
- “A Systematic Characterization of Sampling Algorithms for Open-ended Language Generation”, Et Al 2020 (2020-09-15)
- “GeDi: Generative Discriminator Guided Sequence Generation”, Et Al 2020 (2020-09-14)
- “Generative Language Modeling for Automated Theorem Proving”, 2020 (2020-09-07)
- “MMLU: Measuring Massive Multitask Language Understanding”, Et Al 2020 (2020-09-07)
- “Learning to Summarize from Human Feedback”, Et Al 2020 (2020-09-02)
- “Generative Models Are Unsupervised Predictors of Page Quality: A Colossal-Scale Study”, Et Al 2020 (2020-08-17)
- “Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size”, Et Al 2020 (2020-08-16)
- “Aligning AI With Shared Human Values”, Et Al 2020 (2020-08-05)
- “The Chess Transformer: Mastering Play Using Generative Language Models”, Et Al 2020 (2020-08-02)
- “Mirostat: A Neural Text Decoding Algorithm That Directly Controls Perplexity”, Et Al 2020 (2020-07-29)
- “Efficient Attention: Breaking The Quadratic Transformer Bottleneck”, 2020 (2020-07-25)
- “Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data”, 2020 (2020-07)
- “Transformers Are RNNs: Fast Autoregressive Transformers With Linear Attention”, Et Al 2020 (2020-06-29)
- “OpenAI API Beta Homepage”, OpenAI 2020 (2020-06-11)
- “GPT-3: Language Models Are Few-Shot Learners”, Et Al 2020 (2020-05-28)
- “The Scaling Hypothesis”, 2020 (2020-05-28)
- “True_poetry: Poetry Generator by GPT-2 With Meter and Rhyme Constraints”, Summers-Stay 2020 (2020-05-08)
- “Scaling Laws from the Data Manifold Dimension”, 2020
- “Trading Off Diversity and Quality in Natural Language Generation”, Et Al 2020 (2020-04-22)
- “Unigram LM: Byte Pair Encoding Is Suboptimal for Language Model Pretraining”, 2020 (2020-04-07)
- “OpenAI Text Generator GPT-2 Creates Video Game Walkthrough For ‘Most Tedious Game in History’”, 2020 (2020-02-20)
- “Direct Fit to Nature: An Evolutionary Perspective on Biological and Artificial Neural Networks”, Et Al 2020 (2020-02-05)
- “Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions”, 2020 (2020-02-01)
- “Reducing Non-Normative Text Generation from Language Models”, Et Al 2020 (2020-01-23)
- “Scaling Laws for Neural Language Models”, Et Al 2020 (2020-01-23)
- “What Does BERT Dream Of? A Visual Investigation of Nightmares in Sesame Street”, 2020 (2020-01-13)
- “Reformer: The Efficient Transformer”, Et Al 2020 (2020-01-13)
- “Generative Language Modeling for Automated Theorem Proving § Experiments”, 2020 (page 11)
- “Writing the Next American Hit: Using GPT-2 to Explore the Possibility of Creating Successful AI-Generated Song Lyrics”, 2020
- “Controlling Text Generation With Plug and Play Language Models”, Et Al 2019 (2019-12-05)
- “Plug and Play Language Models: A Simple Approach to Controlled Text Generation”, Et Al 2019 (2019-12-04)
- “AI Dungeon 2”, 2019 (2019-12)
- “How Can We Know What Language Models Know?”, Et Al 2019 (2019-11-28)
- “GPT-2: 1.5B Release”, Et Al 2019 (2019-11-05)
- “Release Strategies and the Social Impacts of Language Models”, Et Al 2019 (2019-11-05)
- “DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation”, Et Al 2019 (2019-11-01)
- “GPT-2 Folk Music”, 2019 (2019-11-01)
- “Fine-Tuning GPT-2 from Human Preferences § Bugs Can Optimize for Bad Behavior”, Et Al 2019 (2019-09-19)
- “Fine-Tuning GPT-2 from Human Preferences”, Et Al 2019 (2019-09-19)
- “Fine-Tuning Language Models from Human Preferences”, Et Al 2019 (2019-09-18)
- “Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism”, Et Al 2019 (2019-09-17)
- “Lm-human-preferences”, Et Al 2019 (2019-09-14)
- “CTRL: A Conditional Transformer Language Model For Controllable Generation”, Et Al 2019 (2019-09-11)
- “How To Make Custom AI-Generated Text With GPT-2”, 2019 (2019-09-04)
- “Language Modelling State-of-the-art Leaderboards”, Paperswithcode.com 2019 (2019-08-28)
- “Smaller, Faster, Cheaper, Lighter: Introducing DistilGPT, a Distilled Version of GPT”, 2019 (2019-08-28)
- “OpenGPT-2: We Replicated GPT-2-1.5b Because You Can Too”, 2019 (2019-08-22)
- “GPT-2: 6-Month Follow-Up”, OpenAI 2019 (2019-08-20)
- “Universal Adversarial Triggers for Attacking and Analyzing NLP”, Et Al 2019 (2019-08-20)
- “MegatronLM: Training Billion+ Parameter Language Models Using GPU Model Parallelism”, ADLR 2019
- “Neural Text Generation With Unlikelihood Training”, Et Al 2019
- “Addendum: Evaluation of My Model”, 2019
- “Replicating GPT-2-1.5B”, 2019
- “GROVER: Defending Against Neural Fake News”, Et Al 2019
- “MuseNet: a Deep Neural Network That Can Generate 4-minute Musical Compositions With 10 Different Instruments, and Can Combine Styles from Country to Mozart to the Beatles”, 2019
- “Generative Modeling With Sparse Transformers: We’ve Developed the Sparse Transformer, a Deep Neural Network Which Sets New Records at Predicting What Comes next in a Sequence—whether Text, Images, or Sound. It Uses an Algorithmic Improvement of The Attention Mechanism to Extract Patterns from Sequences 30× Longer Than Possible Previously”, 2019
- “The Curious Case of Neural Text Degeneration”, Et Al 2019
- “Smart Vet: Autocompleting Sentences in Veterinary Medical Records”, 2019
- “LM Explorer (alpha)”, 2019
- “GPT-2 As Step Toward General Intelligence”, 2019
- “Better Language Models and Their Implications”, Et Al 2019
- “Language Models Are Unsupervised Multitask Learners”, Et Al 2019
- “Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context”, Et Al 2019
- “Talk To Transformer”, 2019
- “Music Transformer: Generating Music With Long-Term Structure”, Et Al 2018
- “Universal Transformers”, Et Al 2018
- “Adversarial Reprogramming of Neural Networks”, Et Al 2018
- “GPT-1: Improving Language Understanding With Unsupervised Learning”, OpenAI 2018
- “GPT-1: Improving Language Understanding by Generative Pre-Training § Model Specifications”, Et Al 2018 (page 5)
- “GPT-1: Improving Language Understanding by Generative Pre-Training”, Et Al 2018
- “Deep Reinforcement Learning from Human Preferences § Appendix A.2: Atari”, Et Al 2017 (page 15)
- “Learning to Generate Reviews and Discovering Sentiment”, Et Al 2017
- “https://medium.com/artists-and-machine-intelligence/adventures-in-narrated-reality-part-ii-dc585af054cb”
- “https://medium.com/artists-and-machine-intelligence/adventures-in-narrated-reality-6516ff395ba3”
- “Computing Machinery And Intelligence”, 1950
- “Math: OpenAI API Can Do Some Math out of the Gate, but Most Math It Seems It Has to Learn. Many Times, the Numbers That It Spits out Are Just Random. However, including Different Priming Prompts Can Result in Decent Results.”
- “Design a Role-playing Game Using 200 Words or Less.”
- “AI Dungeon: Dragon Model Upgrade—You Can Now Play AI Dungeon With One of the Most Powerful AI Models in the World.”
- “Controlling GPT-3 With Logit Bias: How AI Dungeon Uses Logit Bias to Help Control GPT-3”
- “Introducing AI Dungeon Translate: AI Dungeon Players Can Now Translate Their Stories into Emojis by Just Clicking a Button. [ 🤔 💯 🤷♂️ 🤔 🤔 🤔 💯]”
- “Looking for Grammar in All the Right Places”
- “The AI Channels Project”
- “OpenAI API Alchemy: Summarization”
- “OpenAI API Alchemy: Emoji Storytelling 🤖”
- “OpenAI API Alchemy: Turn a Script into a Novel (and vice Versa)”
- “AI Am I? (The New Aesthetic)”
- “GPT-3: An AI That’s Eerily Good at Writing Almost Anything”
- “Elon Musk By Dr. Seuss (GPT-3)”
- “Teaching GPT-3 to Identify Nonsense”
- “Transformers As Variational Autoencoders”
- “Deep Learning for Assisting the Process of Music Composition (part 3)”
- “Using GPT-3 to Explain Jokes”
- “I’ve Been Testing the Largest of @OpenAI’s Models With AI Dungeon and Been Constantly Impressed at How Interesting and Dynamic the Characters Are, like This Queen, Long Thought to Be Dead, Hiding from Enemies and Not Happy about Me Prying into Her Personal Life.”
- “Homepage of Paul F. Christiano”, 2023
- “TensorFlow Research Cloud (TRC): Accelerate Your Cutting-edge Machine Learning Research With Free Cloud TPUs”, TRC 2023
- “Meditations on Moloch”
- “Humans Who Are Not Concentrating Are Not General Intelligences”
- “This Is the OpenAI API. It Makes Spookily Good Twitter Bots. 13⁄10 Would Retweet”
- “AlphaStar: Mastering the Real-Time Strategy Game StarCraft II”
- “Interpreting GPT: the Logit Lens”
- “A Robot Wrote This Entire Article. Are You Scared Yet, Human? We Asked GPT-3, OpenAI’s Powerful New Language Generator, to Write an Essay for Us from Scratch. The Assignment? To Convince Us Robots Come in Peace | For More about GPT-3 and How This Essay Was Written and Edited, Please Read Our Editor’s Note Below”
See Also
Links
“Int-4 LLaMa Is Not Enough—Int-3 and Beyond: More Compression, Easier to Build Apps on LLMs That Run Locally”, Nolano.org 2023
“Int-4 LLaMa is not enough—Int-3 and beyond: More compression, easier to build apps on LLMs that run locally”, 2023-03-13 ( ; similar; bibliography)
“Beyond the Pass Mark: the Accuracy of ChatGPT and Bing in the National Medical Licensure Examination in Japan”, 2023
“Beyond the Pass Mark: the Accuracy of ChatGPT and Bing in the National Medical Licensure Examination in Japan”, 2023-03-10 ( ; similar; bibliography)
“Rewarding Chatbots for Real-World Engagement With Millions of Users”, Et Al 2023
“Rewarding Chatbots for Real-World Engagement with Millions of Users”, 2023-03-10 ( ; similar)
“BiLD: Big Little Transformer Decoder”, Et Al 2023
“BiLD: Big Little Transformer Decoder”, 2023-02-15 ( ; similar)
“MarioGPT: Open-Ended Text2Level Generation through Large Language Models”, Et Al 2023
“MarioGPT: Open-Ended Text2Level Generation through Large Language Models”, 2023-02-12 ( ; similar; bibliography)
“Is ChatGPT a General-Purpose Natural Language Processing Task Solver?”, Et Al 2023
“Is ChatGPT a General-Purpose Natural Language Processing Task Solver?”, 2023-02-08 (similar; bibliography)
“Use GPT-3 Incorrectly: Reduce Costs 40× and Increase Speed by 5×”, 2023
“Use GPT-3 incorrectly: reduce costs 40× and increase speed by 5×”, 2023-02-06 ( ; backlinks; similar)
“OpenAI’s Sam Altman Talks ChatGPT And How Artificial General Intelligence Can ‘Break Capitalism’”, 2023
“OpenAI’s Sam Altman Talks ChatGPT And How Artificial General Intelligence Can ‘Break Capitalism’”, 2023-02-03 ( ; similar; bibliography)
“Co-Writing With Opinionated Language Models Affects Users’ Views”, Et Al 2023
“Co-Writing with Opinionated Language Models Affects Users’ Views”, 2023-02-01 ( ; backlinks; similar; bibliography)
“In-Context Retrieval-Augmented Language Models”, Et Al 2023
“In-Context Retrieval-Augmented Language Models”, 2023-01-31 ( ; similar)
“Crawling the Internal Knowledge-Base of Language Models”, Et Al 2023
“Crawling the Internal Knowledge-Base of Language Models”, 2023-01-30 ( ; similar)
“Big Tech Was Moving Cautiously on AI. Then Came ChatGPT. Google, Facebook and Microsoft Helped Build the Scaffolding of AI. Smaller Companies Are Taking It to the Masses, Forcing Big Tech to React.”, Et Al 2023
“Big Tech was moving cautiously on AI. Then came ChatGPT. Google, Facebook and Microsoft helped build the scaffolding of AI. Smaller companies are taking it to the masses, forcing Big Tech to react.”, 2023-01-27 ( ; similar)
“The inside Story of ChatGPT: How OpenAI Founder Sam Altman Built the World’s Hottest Technology With Billions from Microsoft”, 2023
“The inside story of ChatGPT: How OpenAI founder Sam Altman built the world’s hottest technology with billions from Microsoft”, 2023-01-25 ( ; backlinks; similar)
“Rock Guitar Tablature Generation via Natural Language Processing”, Casco-2023
“Rock Guitar Tablature Generation via Natural Language Processing”, 2023-01-12 ( ; similar)
“GPT-3 As Knowledge Worker: A Zero-Shot Evaluation of (AI)CPA Capabilities”, Et Al 2023
“GPT-3 as Knowledge Worker: A Zero-Shot Evaluation of (AI)CPA Capabilities”, 2023-01-11 ( ; similar; bibliography)
“InPars-Light: Cost-Effective Unsupervised Training of Efficient Rankers”, Et Al 2023
“InPars-Light: Cost-Effective Unsupervised Training of Efficient Rankers”, 2023-01-08 ( ; similar)
“GPT-3 Takes the Bar Exam”, II & 2022
“GPT-3 Takes the Bar Exam”, 2022-12-29 ( ; backlinks; similar; bibliography)
“A New Chat Bot Is A ‘Code Red’ For Google’s Search Business: A New Wave of Chat Bots like ChatGPT Use Artificial Intelligence That Could Reinvent or Even Replace the Traditional Internet Search Engine”, 2022
“A New Chat Bot Is a ‘Code Red’ for Google’s Search Business: A new wave of chat bots like ChatGPT use artificial intelligence that could reinvent or even replace the traditional internet search engine”, 2022-12-21 ( ; backlinks; similar; bibliography)
“Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent As Meta-Optimizers”, Et Al 2022
“Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers”, 2022-12-20 ( ; similar)
“Precise Zero-Shot Dense Retrieval without Relevance Labels”, Et Al 2022
“Precise Zero-Shot Dense Retrieval without Relevance Labels”, 2022-12-20 ( ; similar; bibliography)
“Emergent Analogical Reasoning in Large Language Models”, Et Al 2022
“Emergent Analogical Reasoning in Large Language Models”, 2022-12-19 ( ; similar)
“Harvey, Which Uses AI to Answer Legal Questions, Lands Cash from OpenAI”, 2022
“Harvey, which uses AI to answer legal questions, lands cash from OpenAI”, 2022-11-23 ( ; backlinks; similar; bibliography)
“Interpreting Neural Networks through the Polytope Lens”, Et Al 2022
“Interpreting Neural Networks through the Polytope Lens”, 2022-11-22 ( ; similar)
“SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models”, Et Al 2022
“SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models”, 2022-11-18 ( ; similar; bibliography)
“InstructPix2Pix: Learning to Follow Image Editing Instructions”, Et Al 2022
“InstructPix2Pix: Learning to Follow Image Editing Instructions”, 2022-11-17 ( ; similar; bibliography)
“Galactica: A Large Language Model for Science”, Et Al 2022
“Galactica: A Large Language Model for Science”, 2022-11-16 ( ; similar)
“LMentry: A Language Model Benchmark of Elementary Language Tasks”, Et Al 2022
“LMentry: A Language Model Benchmark of Elementary Language Tasks”, 2022-11-03 ( ; backlinks; similar)
“GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers”, Et Al 2022
“GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers”, 2022-10-31 ( ; backlinks; similar; bibliography)
“Contrastive Search Is What You Need For Neural Text Generation”, 2022
“Contrastive Search Is What You Need For Neural Text Generation”, 2022-10-25 ( ; similar; bibliography)
“Evaluating Parameter Efficient Learning for Generation”, Et Al 2022
“Evaluating Parameter Efficient Learning for Generation”, 2022-10-25 ( ; similar; bibliography)
“BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining”, Et Al 2022
“BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining”, 2022-10-19 (similar; bibliography)
“Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models”, Et Al 2022
“Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models”, 2022-10-18 ( ; similar; bibliography)
“MTEB: Massive Text Embedding Benchmark”, Et Al 2022
“MTEB: Massive Text Embedding Benchmark”, 2022-10-13 ( ; similar)
“Foundation Transformers”, Et Al 2022
“Foundation Transformers”, 2022-10-12 ( ; similar; bibliography)
“Ask Me Anything (AMA): A Simple Strategy for Prompting Language Models”, Et Al 2022
“Ask Me Anything (AMA): A simple strategy for prompting language models”, 2022-10-05 ( ; similar; bibliography)
“Deep Language Algorithms Predict Semantic Comprehension from Brain Activity”, Et Al 2022
“Deep language algorithms predict semantic comprehension from brain activity”, 2022-09-29 ( ; similar; bibliography)
“Semantic Reconstruction of Continuous Language from Non-invasive Brain Recordings”, Et Al 2022
“Semantic reconstruction of continuous language from non-invasive brain recordings”, 2022-09-29 ( ; similar)
“Generate rather than Retrieve (GenRead): Large Language Models Are Strong Context Generators”, Et Al 2022
“Generate rather than Retrieve (GenRead): Large Language Models are Strong Context Generators”, 2022-09-21 ( ; similar)
“Out of One, Many: Using Language Models to Simulate Human Samples”, Et Al 2022
“Out of One, Many: Using Language Models to Simulate Human Samples”, 2022-09-14 ( ; similar)
“Do Androids Laugh at Electric Sheep? Humor”Understanding” Benchmarks from The New Yorker Caption Contest”, Et Al 2022
“Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest”, 2022-09-13 ( ; similar)
“FP8 Formats for Deep Learning”, Et Al 2022
“FP8 Formats for Deep Learning”, 2022-09-12 ( ; similar)
“What Does a Platypus Look Like? Generating Customized Prompts for Zero-shot Image Classification (CuPL)”, Et Al 2022
“What does a platypus look like? Generating customized prompts for zero-shot image classification (CuPL)”, 2022-09-07 ( ; similar; bibliography)
“Petals: Collaborative Inference and Fine-tuning of Large Models”, Et Al 2022
“Petals: Collaborative Inference and Fine-tuning of Large Models”, 2022-09-02 ( ; similar)
“Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned”, Et Al 2022
“Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned”, 2022-08-25 ( ; similar; bibliography)
“Using Large Language Models to Simulate Multiple Humans”, Et Al 2022
“Using Large Language Models to Simulate Multiple Humans”, 2022-08-18 ( ; similar)
“Effidit: Your AI Writing Assistant”, Et Al 2022
“Effidit: Your AI Writing Assistant”, 2022-08-03 ( ; similar)
“What Can Transformers Learn In-Context? A Case Study of Simple Function Classes”, Et Al 2022
“What Can Transformers Learn In-Context? A Case Study of Simple Function Classes”, 2022-08-01 ( ; backlinks; similar; bibliography)
“Correspondence between the Layered Structure of Deep Language Models and Temporal Structure of Natural Language Processing in the Human Brain”, Et Al 2022
“Correspondence between the layered structure of deep language models and temporal structure of natural language processing in the human brain”, 2022-07-25 ( ; similar)
“Language Models Show Human-like Content Effects on Reasoning”, Et Al 2022
“Language models show human-like content effects on reasoning”, 2022-07-14 ( ; similar)
“LM-Nav: Robotic Navigation With Large Pre-Trained Models of Language, Vision, and Action”, Et Al 2022
“LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action”, 2022-07-10 ( ; backlinks; similar; bibliography)
“GODEL: Large-Scale Pre-Training for Goal-Directed Dialog”, Et Al 2022
“GODEL: Large-Scale Pre-Training for Goal-Directed Dialog”, 2022-06-22 (similar)
“Can Foundation Models Talk Causality?”, Et Al 2022
“Can Foundation Models Talk Causality?”, 2022-06-14 ( ; similar)
“NOAH: Neural Prompt Search”, Et Al 2022
“NOAH: Neural Prompt Search”, 2022-06-09 ( ; similar)
“ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers”, Et Al 2022
“ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers”, 2022-06-04 ( ; similar; bibliography)
“FlashAttention: Fast and Memory-Efficient Exact Attention With IO-Awareness”, Et Al 2022
“FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness”, 2022-05-27 ( ; backlinks; similar; bibliography)
“Quark: Controllable Text Generation With Reinforced Unlearning”, Et Al 2022
“Quark: Controllable Text Generation with Reinforced Unlearning”, 2022-05-26 ( ; similar)
“NaturalProver: Grounded Mathematical Proof Generation With Language Models”, Et Al 2022
“NaturalProver: Grounded Mathematical Proof Generation with Language Models”, 2022-05-25 ( ; similar; bibliography)
“Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models”, Et Al 2022
“Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models”, 2022-05-22 ( ; backlinks; similar)
“RankGen: Improving Text Generation With Large Ranking Models”, Et Al 2022
“RankGen: Improving Text Generation with Large Ranking Models”, 2022-05-19 ( ; similar)
“OPT: Open Pre-trained Transformer Language Models”, Et Al 2022
“OPT: Open Pre-trained Transformer Language Models”, 2022-05-02 (similar)
“Opal: Multimodal Image Generation for News Illustration”, Et Al 2022
“Opal: Multimodal Image Generation for News Illustration”, 2022-04-19 ( ; similar)
“What Language Model to Train If You Have One Million GPU Hours?”, Et Al 2022
“What Language Model to Train if You Have One Million GPU Hours?”, 2022-04-11 ( ; similar)
“WAVPROMPT: Towards Few-Shot Spoken Language Understanding With Frozen Language Models”, Et Al 2022
“WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models”, 2022-03-29 (similar)
“Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space”, Et Al 2022
“Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space”, 2022-03-28 (similar)
“Time Control: Language Modeling via Stochastic Processes”, Et Al 2022
“Time Control: Language modeling via stochastic processes”, 2022-03-21 ( ; similar)
“Shared Computational Principles for Language Processing in Humans and Deep Language Models”, Et Al 2022
“Shared computational principles for language processing in humans and deep language models”, 2022-03-07 ( ; similar; bibliography)
“InstructGPT: Training Language Models to Follow Instructions With Human Feedback”, Et Al 2022
“InstructGPT: Training language models to follow instructions with human feedback”, 2022-03-04 ( ; similar)
“Vector-quantized Image Modeling With Improved VQGAN”, Et Al 2022
“Vector-quantized Image Modeling with Improved VQGAN”, 2022-03-01 ( ; similar; bibliography)
“Quantifying and Alleviating Political Bias in Language Models”, Et Al 2022
“Quantifying and alleviating political bias in language models”, 2022-03-01 ( ; similar; bibliography)
“Controllable Natural Language Generation With Contrastive Prefixes”, Et Al 2022
“Controllable Natural Language Generation with Contrastive Prefixes”, 2022-02-27 ( ; similar)
“Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?”, Et Al 2022
“Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?”, 2022-02-25 ( ; similar; bibliography)
“Brains and Algorithms Partially Converge in Natural Language Processing”, 2022
“Brains and algorithms partially converge in natural language processing”, 2022-02-16 ( ; similar; bibliography)
“A Contrastive Framework for Neural Text Generation”, Et Al 2022
“A Contrastive Framework for Neural Text Generation”, 2022-02-13 ( ; backlinks; similar)
“ROME: Locating and Editing Factual Associations in GPT”, Et Al 2022
“ROME: Locating and Editing Factual Associations in GPT”, 2022-02-10 ( ; similar)
“InPars: Data Augmentation for Information Retrieval Using Large Language Models”, Et Al 2022
“InPars: Data Augmentation for Information Retrieval using Large Language Models”, 2022-02-10 ( ; backlinks; similar)
“AdaPrompt: Adaptive Model Training for Prompt-based NLP”, Et Al 2022
“AdaPrompt: Adaptive Model Training for Prompt-based NLP”, 2022-02-10 (similar)
“Cedille: A Large Autoregressive French Language Model”, 2022
“Cedille: A large autoregressive French language model”, 2022-02-07 (similar)
“Data Scaling Laws in NMT: The Effect of Noise and Architecture”, Et Al 2022
“Data Scaling Laws in NMT: The Effect of Noise and Architecture”, 2022-02-04 ( ; similar)
“LID: Pre-Trained Language Models for Interactive Decision-Making”, Et Al 2022
“LID: Pre-Trained Language Models for Interactive Decision-Making”, 2022-02-03 ( ; backlinks; similar)
“PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts”, Et Al 2022
“PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts”, 2022-02-02 ( ; similar)
“Typical Decoding for Natural Language Generation”, Et Al 2022
“Typical Decoding for Natural Language Generation”, 2022-02-01 ( ; backlinks; similar)
“Contracts in the Age of Smart Readers”, 2022
“Contracts in the Age of Smart Readers”, 2022-02 ( ; backlinks; similar)
“Can Wikipedia Help Offline Reinforcement Learning?”, Et Al 2022
“Can Wikipedia Help Offline Reinforcement Learning?”, 2022-01-28 ( ; similar)
“Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model”, Et Al 2022
“Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model”, 2022-01-28 ( ; similar; bibliography)
“Language Models As Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents”, Et Al 2022
“Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents”, 2022-01-18 ( ; similar)
“WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation”, Et Al 2022
“WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation”, 2022-01-16 ( ; similar; bibliography)
“Memory-assisted Prompt Editing to Improve GPT-3 After Deployment”, Et Al 2022
“Memory-assisted prompt editing to improve GPT-3 after deployment”, 2022-01-16 ( ; similar)
“A Survey of Controllable Text Generation Using Transformer-based Pre-trained Language Models”, Et Al 2022
“A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models”, 2022-01-14 ( ; similar)
“CommonsenseQA 2.0: Exposing the Limits of AI through Gamification”, Et Al 2022
“CommonsenseQA 2.0: Exposing the Limits of AI through Gamification”, 2022-01-14 ( ; similar; bibliography)
“The Defeat of the Winograd Schema Challenge”, Et Al 2022
“The Defeat of the Winograd Schema Challenge”, 2022-01-07 ( ; backlinks)
“Limits of Using Artificial Intelligence and GPT-3 in Patent Prosecution”, Et Al 2022
“Limits of Using Artificial Intelligence and GPT-3 in Patent Prosecution”, 2022-01 ( ; similar; bibliography)
“Amortized Noisy Channel Neural Machine Translation”, Et Al 2021
“Amortized Noisy Channel Neural Machine Translation”, 2021-12-16 ( ; similar)
“Learning to Prompt for Continual Learning”, Et Al 2021
“Learning to Prompt for Continual Learning”, 2021-12-16 ( ; similar)
“Learning To Retrieve Prompts for In-Context Learning”, Et Al 2021
“Learning To Retrieve Prompts for In-Context Learning”, 2021-12-16 ( ; similar)
“PROMPT WAYWARDNESS: The Curious Case of Discretized Interpretation of Continuous Prompts”, Et Al 2021
“PROMPT WAYWARDNESS: The Curious Case of Discretized Interpretation of Continuous Prompts”, 2021-12-15 ( ; similar)
“Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases”, Et Al 2021
“Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases”, 2021-12-15 ( ; similar)
“LMTurk: Few-Shot Learners As Crowdsourcing Workers”, Et Al 2021
“LMTurk: Few-Shot Learners as Crowdsourcing Workers”, 2021-12-14 (similar)
“Improving Language Models by Retrieving from Trillions of Tokens”, Et Al 2021
“Improving language models by retrieving from trillions of tokens”, 2021-12-08 ( ; similar; bibliography)
“A General Language Assistant As a Laboratory for Alignment”, Et Al 2021
“A General Language Assistant as a Laboratory for Alignment”, 2021-12-01 ( ; similar; bibliography)
“Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic”, Et Al 2021
“Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic”, 2021-11-29 ( ; similar)
“Long-range and Hierarchical Language Predictions in Brains and Algorithms”, Et Al 2021
“Long-range and hierarchical language predictions in brains and algorithms”, 2021-11-28 ( ; similar)
“True Few-Shot Learning With Prompts—A Real-World Perspective”, Schick & 2021
“True Few-Shot Learning with Prompts—A Real-World Perspective”, 2021-11-26 (similar)
“Few-shot Named Entity Recognition With Cloze Questions”, Et Al 2021
“Few-shot Named Entity Recognition with Cloze Questions”, 2021-11-24 (similar)
“Mapping Language Models to Grounded Conceptual Spaces”, 2021
“Mapping Language Models to Grounded Conceptual Spaces”, 2021-11-18 ( ; backlinks; similar; bibliography)
“ClipCap: CLIP Prefix for Image Captioning”, Et Al 2021
“ClipCap: CLIP Prefix for Image Captioning”, 2021-11-18 ( ; similar; bibliography)
“M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining”, Et Al 2021
“M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining”, 2021-11-17 ( ; similar)
“Evaluating Distributional Distortion in Neural Language Modeling”, 2021
“Evaluating Distributional Distortion in Neural Language Modeling”, 2021-11-16 ( ; similar)
“On Transferability of Prompt Tuning for Natural Language Understanding”, Et Al 2021
“On Transferability of Prompt Tuning for Natural Language Understanding”, 2021-11-12 (similar)
“Attention Approximates Sparse Distributed Memory”, 2021
“Attention Approximates Sparse Distributed Memory”, 2021-11-10 ( ; similar)
“What Can a Generative Language Model Answer About a Passage?”, Summers-Et Al 2021
“What Can a Generative Language Model Answer About a Passage?”, 2021-11-10 (similar; bibliography)
“CLUES: Few-Shot Learning Evaluation in Natural Language Understanding”, Et Al 2021
“CLUES: Few-Shot Learning Evaluation in Natural Language Understanding”, 2021-11-04 (similar; bibliography)
“An Explanation of In-context Learning As Implicit Bayesian Inference”, Et Al 2021
“An Explanation of In-context Learning as Implicit Bayesian Inference”, 2021-11-03 ( ; backlinks; similar)
“Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey”, Et Al 2021
“Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey”, 2021-11-01 (similar)
“Fast Model Editing at Scale”, Et Al 2021
“Fast Model Editing at Scale”, 2021-10-21 ( ; similar; bibliography)
“Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning”, Et Al 2021
“Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning”, 2021-10-10 ( ; similar)
“A Few More Examples May Be Worth Billions of Parameters”, Et Al 2021
“A Few More Examples May Be Worth Billions of Parameters”, 2021-10-08 (similar)
“Towards a Unified View of Parameter-Efficient Transfer Learning”, Et Al 2021
“Towards a Unified View of Parameter-Efficient Transfer Learning”, 2021-10-08 (backlinks; similar)
“Scaling Laws for Neural Machine Translation”, Et Al 2021
“Scaling Laws for Neural Machine Translation”, 2021-09-16 ( ; similar)
“Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color”, Et Al 2021
“Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color”, 2021-09-13 ( ; backlinks; similar)
“What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers”, Et Al 2021
“What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers”, 2021-09-10 ( ; similar)
“Medically Aware GPT-3 As a Data Generator for Medical Dialogue Summarization”, Et Al 2021
“Medically Aware GPT-3 as a Data Generator for Medical Dialogue Summarization”, 2021-09-09 (similar)
“TruthfulQA: Measuring How Models Mimic Human Falsehoods”, Et Al 2021
“TruthfulQA: Measuring How Models Mimic Human Falsehoods”, 2021-09-08 ( ; backlinks; similar; bibliography)
“General-Purpose Question-Answering With Macaw”, 2021
“General-Purpose Question-Answering with Macaw”, 2021-09-06 ( ; similar; bibliography)
“An Empirical Exploration in Quality Filtering of Text Data”, 2021
“An Empirical Exploration in Quality Filtering of Text Data”, 2021-09-02 ( ; similar)
“Want To Reduce Labeling Cost? GPT-3 Can Help”, Et Al 2021
“Want To Reduce Labeling Cost? GPT-3 Can Help”, 2021-08-30 ( ; similar)
“Scarecrow: A Framework for Scrutinizing Machine Text”, Et Al 2021
“Scarecrow: A Framework for Scrutinizing Machine Text”, 2021-07-02 ( ; similar; bibliography)
“Multimodal Few-Shot Learning With Frozen Language Models”, Et Al 2021
“Multimodal Few-Shot Learning with Frozen Language Models”, 2021-06-25 ( ; similar)
“Cutting Down on Prompts and Parameters: Simple Few-Shot Learning With Language Models”, IV Et Al 2021
“Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models”, 2021-06-24 (backlinks; similar)
“LoRA: Low-Rank Adaptation of Large Language Models”, Et Al 2021
“LoRA: Low-Rank Adaptation of Large Language Models”, 2021-06-17 ( ; similar; bibliography)
“Let the Algorithm Speak: How to Use Neural Networks for Automatic Item Generation in Psychological Scale Development”, Et Al 2021
“Let the Algorithm Speak: How to Use Neural Networks for Automatic Item Generation in Psychological Scale Development”, 2021-06-15 ( ; similar; bibliography)
“RASP: Thinking Like Transformers”, Et Al 2021
“RASP: Thinking Like Transformers”, 2021-06-13 ( ; backlinks; similar; bibliography)
“GPT-J-6B: 6B JAX-Based Transformer”, EleutherAI 2021
“GPT-J-6B: 6B JAX-Based Transformer”, 2021-06-08 (backlinks; similar; bibliography)
“LHOPT: A Generalizable Approach to Learning Optimizers”, Et Al 2021
“LHOPT: A Generalizable Approach to Learning Optimizers”, 2021-06-02 ( ; similar; bibliography)
“Anthropic Raises $124 Million to Build More Reliable, General AI Systems”, 2021
“Anthropic raises $124 million to build more reliable, general AI systems”, 2021-05-28 ( ; similar)
“ByT5: Towards a Token-free Future With Pre-trained Byte-to-byte Models”, Et Al 2021
“ByT5: Towards a token-free future with pre-trained byte-to-byte models”, 2021-05-28 ( ; similar; bibliography)
“A Hierarchy of Linguistic Predictions during Natural Language Comprehension”, Et Al 2021
“A hierarchy of linguistic predictions during natural language comprehension”, 2021-05-27 ( ; similar)
“Naver Unveils First ‘Hyperscale’ AI Platform”, 2021
“Naver unveils first ‘hyperscale’ AI platform”, 2021-05-25 ( ; similar; bibliography)
“Machine Learning Scaling”, 2021
“Machine Learning Scaling”, 2021-04-24 ( ; backlinks; bibliography)
“Scaling Laws for Language Transfer Learning”, 2021
“Scaling Laws for Language Transfer Learning”, 2021-04-11 ( ; similar)
“GPT Understands, Too”, Et Al 2021
“GPT Understands, Too”, 2021-03-18 (backlinks; similar)
“How Many Data Points Is a Prompt Worth?”, 2021
“How Many Data Points is a Prompt Worth?”, 2021-03-15 (backlinks; similar)
“Pretrained Transformers As Universal Computation Engines”, Et Al 2021
“Pretrained Transformers as Universal Computation Engines”, 2021-03-09 ( ; backlinks; similar)
“Language Models Have a Moral Dimension”, Et Al 2021
“Language Models have a Moral Dimension”, 2021-03-08 ( ; backlinks; similar)
“Learning Chess Blindfolded: Evaluating Language Models on State Tracking”, Et Al 2021
“Learning Chess Blindfolded: Evaluating Language Models on State Tracking”, 2021-02-26 ( ; backlinks; similar)
“Investigating the Limitations of the Transformers With Simple Arithmetic Tasks”, Et Al 2021
“Investigating the Limitations of the Transformers with Simple Arithmetic Tasks”, 2021-02-25 ( ; backlinks; similar; bibliography)
“Proof Artifact Co-training for Theorem Proving With Language Models”, Et Al 2021
“Proof Artifact Co-training for Theorem Proving with Language Models”, 2021-02-11 ( ; backlinks; similar)
“MAUVE: Measuring the Gap Between Neural Text and Human Text Using Divergence Frontiers”, Et Al 2021
“MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers”, 2021-02-02 ( ; similar)
“Scaling Laws for Transfer”, Et Al 2021
“Scaling Laws for Transfer”, 2021-02-02 ( ; similar)
“Apparently ‘What Ho’ Is a Corruption Of…”, 2021
“Apparently ‘what ho’ is a corruption of…”, 2021-01-14 (backlinks)
“Prefix-Tuning: Optimizing Continuous Prompts for Generation”, 2021
“Prefix-Tuning: Optimizing Continuous Prompts for Generation”, 2021 ( ; backlinks; similar; bibliography)
“The Pile: An 800GB Dataset of Diverse Text for Language Modeling”, Et Al 2021
“The Pile: An 800GB Dataset of Diverse Text for Language Modeling”, 2021 ( ; similar; bibliography)
“Process for Adapting Language Models to Society (PALMS) With Values-Targeted Datasets”, 2021
“Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets”, 2021 ( ; similar)
“Bot-Adversarial Dialogue for Safe Conversational Agents”, Et Al 2021
“Bot-Adversarial Dialogue for Safe Conversational Agents”, 2021 ( ; similar; bibliography)
“Making Pre-trained Language Models Better Few-shot Learners”, Et Al 2020
“Making Pre-trained Language Models Better Few-shot Learners”, 2020-12-31 (backlinks; similar)
“Extracting Training Data from Large Language Models”, Et Al 2020
“Extracting Training Data from Large Language Models”, 2020-12-14 (backlinks; similar)
“Thinking Ahead: Prediction in Context As a Keystone of Language in Humans and Machines”, Et Al 2020
“Thinking ahead: prediction in context as a keystone of language in humans and machines”, 2020-12-03 ( ; similar)
“CPM: A Large-scale Generative Chinese Pre-trained Language Model”, Et Al 2020
“CPM: A Large-scale Generative Chinese Pre-trained Language Model”, 2020-12-01 ( ; backlinks; similar)
“Scaling Laws for Autoregressive Generative Modeling”, Et Al 2020
“Scaling Laws for Autoregressive Generative Modeling”, 2020-10-28 ( ; similar; bibliography)
“L2L: Training Large Neural Networks With Constant Memory Using a New Execution Algorithm”, Et Al 2020
“L2L: Training Large Neural Networks with Constant Memory using a New Execution Algorithm”, 2020-10-16 ( ; similar)
“Interacting With GPT-2 to Generate Controlled and Believable Musical Sequences in ABC Notation”, Geerlings & Meroño-2020
“Interacting with GPT-2 to Generate Controlled and Believable Musical Sequences in ABC Notation”, 2020-10-16 ( ; backlinks; similar)
“Summarize, Outline, and Elaborate: Long-Text Generation via Hierarchical Supervision from Extractive Summaries”, Et Al 2020
“Summarize, Outline, and Elaborate: Long-Text Generation via Hierarchical Supervision from Extractive Summaries”, 2020-10-14 ( ; backlinks; similar)
“The Neural Architecture of Language: Integrative Reverse-engineering Converges on a Model for Predictive Processing”, Et Al 2020
“The neural architecture of language: Integrative reverse-engineering converges on a model for predictive processing”, 2020-10-09 ( ; backlinks; similar)
“RoFT: A Tool for Evaluating Human Detection of Machine-Generated Text”, Et Al 2020
“RoFT: A Tool for Evaluating Human Detection of Machine-Generated Text”, 2020-10-06 (backlinks; similar)
“GPT-3: Its Nature, Scope, Limits, and Consequences”, 2020
“GPT-3: Its Nature, Scope, Limits, and Consequences”, 2020-10-01 (backlinks; similar)
“A Systematic Characterization of Sampling Algorithms for Open-ended Language Generation”, Et Al 2020
“A Systematic Characterization of Sampling Algorithms for Open-ended Language Generation”, 2020-09-15 ( ; backlinks; similar)
“GeDi: Generative Discriminator Guided Sequence Generation”, Et Al 2020
“GeDi: Generative Discriminator Guided Sequence Generation”, 2020-09-14 ( ; similar)
“Generative Language Modeling for Automated Theorem Proving”, 2020
“Generative Language Modeling for Automated Theorem Proving”, 2020-09-07 ( ; similar; bibliography)
“MMLU: Measuring Massive Multitask Language Understanding”, Et Al 2020
“MMLU: Measuring Massive Multitask Language Understanding”, 2020-09-07 ( ; backlinks; similar; bibliography)
“Learning to Summarize from Human Feedback”, Et Al 2020
“Learning to summarize from human feedback”, 2020-09-02 ( ; similar)
“Generative Models Are Unsupervised Predictors of Page Quality: A Colossal-Scale Study”, Et Al 2020
“Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study”, 2020-08-17 ( ; similar)
“Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size”, Et Al 2020
“Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size”, 2020-08-16 ( ; backlinks; similar)
“Aligning AI With Shared Human Values”, Et Al 2020
“Aligning AI With Shared Human Values”, 2020-08-05 ( ; backlinks; similar)
“The Chess Transformer: Mastering Play Using Generative Language Models”, Et Al 2020
“The Chess Transformer: Mastering Play using Generative Language Models”, 2020-08-02 ( ; backlinks; similar)
“Mirostat: A Neural Text Decoding Algorithm That Directly Controls Perplexity”, Et Al 2020
“Mirostat: A Neural Text Decoding Algorithm that Directly Controls Perplexity”, 2020-07-29 ( ; backlinks; similar)
“Efficient Attention: Breaking The Quadratic Transformer Bottleneck”, 2020
“Efficient Attention: Breaking The Quadratic Transformer Bottleneck”, 2020-07-25 ( ; backlinks; similar; bibliography)
“Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data”, 2020
“Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data”, 2020-07 ( ; backlinks; similar)
“Transformers Are RNNs: Fast Autoregressive Transformers With Linear Attention”, Et Al 2020
“Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention”, 2020-06-29 ( ; backlinks; similar)
“OpenAI API Beta Homepage”, OpenAI 2020
“OpenAI API Beta homepage”, 2020-06-11 (backlinks; similar)
“GPT-3: Language Models Are Few-Shot Learners”, Et Al 2020
“GPT-3: Language Models are Few-Shot Learners”, 2020-05-28 ( ; similar)
“The Scaling Hypothesis”, 2020
“The Scaling Hypothesis”, 2020-05-28 ( ; backlinks; similar; bibliography)
“True_poetry: Poetry Generator by GPT-2 With Meter and Rhyme Constraints”, Summers-2020
“true_poetry: Poetry generator by GPT-2 with meter and rhyme constraints”, 2020-05-08 ( ; backlinks; similar)
- “Trading Off Diversity and Quality in Natural Language Generation”, Et Al 2020 (2020-04-22)
- “Unigram LM: Byte Pair Encoding Is Suboptimal for Language Model Pretraining”, 2020 (2020-04-07)
- “OpenAI Text Generator GPT-2 Creates Video Game Walkthrough For ‘Most Tedious Game in History’”, 2020 (2020-02-20)
- “Direct Fit to Nature: An Evolutionary Perspective on Biological and Artificial Neural Networks”, Et Al 2020 (2020-02-05)
- “Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions”, 2020 (2020-02-01)
- “Reducing Non-Normative Text Generation from Language Models”, Et Al 2020 (2020-01-23)
- “Scaling Laws for Neural Language Models”, Et Al 2020 (2020-01-23)
- “What Does BERT Dream Of? A Visual Investigation of Nightmares in Sesame Street”, 2020 (2020-01-13)
- “Reformer: The Efficient Transformer”, Et Al 2020 (2020-01-13)
- “Generative Language Modeling for Automated Theorem Proving § Experiments”, 2020 (page 11; OpenAI)
- “Writing the Next American Hit: Using GPT-2 to Explore the Possibility of Creating Successful AI-Generated Song Lyrics”, 2020
- “Controlling Text Generation With Plug and Play Language Models”, Et Al 2019 (2019-12-05)
- “Plug and Play Language Models: A Simple Approach to Controlled Text Generation”, Et Al 2019 (2019-12-04)
- “AI Dungeon 2”, 2019 (2019-12)
- “How Can We Know What Language Models Know?”, Et Al 2019 (2019-11-28)
- “GPT-2: 1.5B Release”, Et Al 2019 (2019-11-05)
- “Release Strategies and the Social Impacts of Language Models”, Et Al 2019 (2019-11-05)
- “DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation”, Et Al 2019 (2019-11-01)
- “GPT-2 Folk Music”, 2019 (2019-11-01)
- “Fine-Tuning GPT-2 from Human Preferences § Bugs Can Optimize for Bad Behavior”, Et Al 2019 (2019-09-19)
- “Fine-Tuning GPT-2 from Human Preferences”, Et Al 2019 (2019-09-19)
- “Fine-Tuning Language Models from Human Preferences”, Et Al 2019 (2019-09-18)
- “Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism”, Et Al 2019 (2019-09-17)
- “lm-human-preferences”, Et Al 2019 (2019-09-14)
- “CTRL: A Conditional Transformer Language Model For Controllable Generation”, Et Al 2019 (2019-09-11)
- “How To Make Custom AI-Generated Text With GPT-2”, 2019 (2019-09-04)
- “Language Modelling State-of-the-art Leaderboards”, Paperswithcode.com 2019 (2019-08-28)
- “Smaller, Faster, Cheaper, Lighter: Introducing DistilGPT, a Distilled Version of GPT”, 2019 (2019-08-28)
- “OpenGPT-2: We Replicated GPT-2-1.5b Because You Can Too”, 2019 (2019-08-22)
- “GPT-2: 6-Month Follow-Up”, OpenAI 2019 (2019-08-20)
- “Universal Adversarial Triggers for Attacking and Analyzing NLP”, Et Al 2019 (2019-08-20)
- “MegatronLM: Training Billion+ Parameter Language Models Using GPU Model Parallelism”, NVIDIA ADLR 2019 (2019-08-13)
- “Neural Text Generation With Unlikelihood Training”, Et Al 2019 (2019-08-12)
- “Addendum: Evaluation of My Model”, 2019 (2019-06-12)
- “Replicating GPT-2-1.5B”, 2019 (2019-06-06)
- “GROVER: Defending Against Neural Fake News”, Et Al 2019 (2019-05-29)
- “MuseNet: a Deep Neural Network That Can Generate 4-minute Musical Compositions With 10 Different Instruments, and Can Combine Styles from Country to Mozart to the Beatles”, 2019 (2019-04-25)
- “Generative Modeling With Sparse Transformers: We’ve Developed the Sparse Transformer, a Deep Neural Network Which Sets New Records at Predicting What Comes Next in a Sequence—whether Text, Images, or Sound. It Uses an Algorithmic Improvement of the Attention Mechanism to Extract Patterns from Sequences 30× Longer Than Possible Previously”, 2019 (2019-04-23)
- “The Curious Case of Neural Text Degeneration”, Et Al 2019 (2019-04-22)
- “Smart Vet: Autocompleting Sentences in Veterinary Medical Records”, 2019 (2019-03-19)
- “LM Explorer (alpha)”, 2019 (2019-02-26)
- “GPT-2 As Step Toward General Intelligence”, 2019 (2019-02-19)
- “Better Language Models and Their Implications”, Et Al 2019 (2019-02-14)
- “Language Models Are Unsupervised Multitask Learners”, Et Al 2019 (2019-02-14)
- “Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context”, Et Al 2019 (2019-01-09)
- “Talk To Transformer”, 2019
- “Music Transformer: Generating Music With Long-Term Structure”, Et Al 2018 (2018-12-13)
- “Universal Transformers”, Et Al 2018 (2018-07-10)
- “Adversarial Reprogramming of Neural Networks”, Et Al 2018 (2018-06-28)
- “GPT-1: Improving Language Understanding With Unsupervised Learning”, OpenAI 2018 (2018-06-11)
- “GPT-1: Improving Language Understanding by Generative Pre-Training § Model Specifications”, Et Al 2018 (2018-06-08; page 5)
- “GPT-1: Improving Language Understanding by Generative Pre-Training”, Et Al 2018 (2018-06-08)
- “Deep Reinforcement Learning from Human Preferences § Appendix A.2: Atari”, Et Al 2017 (2017-06-12; page 15; OpenAI)
- “Learning to Generate Reviews and Discovering Sentiment”, Et Al 2017 (2017-04-05)
- https://medium.com/artists-and-machine-intelligence/adventures-in-narrated-reality-part-ii-dc585af054cb
- https://medium.com/artists-and-machine-intelligence/adventures-in-narrated-reality-6516ff395ba3
- “Computing Machinery And Intelligence”, Turing 1950
- “Math: OpenAI API Can Do Some Math out of the Gate, but Most Math It Seems It Has to Learn. Many Times, the Numbers That It Spits out Are Just Random. However, including Different Priming Prompts Can Result in Decent Results.” (see the few-shot priming sketch after this list)
- “Design a Role-playing Game Using 200 Words or Less.”
- “AI Dungeon: Dragon Model Upgrade—You Can Now Play AI Dungeon With One of the Most Powerful AI Models in the World.”
- “Controlling GPT-3 With Logit Bias: How AI Dungeon Uses Logit Bias to Help Control GPT-3” (see the logit-bias sketch after this list)
- “Looking for Grammar in All the Right Places”
- “The AI Channels Project”
- “OpenAI API Alchemy: Summarization”
- “OpenAI API Alchemy: Emoji Storytelling 🤖”
- “OpenAI API Alchemy: Turn a Script into a Novel (and vice Versa)”
- “AI Am I? (The New Aesthetic)”
- “GPT-3: An AI That’s Eerily Good at Writing Almost Anything”
- “Elon Musk By Dr. Seuss (GPT-3)”
- “Teaching GPT-3 to Identify Nonsense”
- “Transformers As Variational Autoencoders”
- “Deep Learning for Assisting the Process of Music Composition (part 3)”
- “Using GPT-3 to Explain Jokes”
- “I’ve Been Testing the Largest of @OpenAI’s Models With AI Dungeon and Been Constantly Impressed at How Interesting and Dynamic the Characters Are, like This Queen, Long Thought to Be Dead, Hiding from Enemies and Not Happy about Me Prying into Her Personal Life.”
- “Homepage of Paul F. Christiano”, 2023
- “TensorFlow Research Cloud (TRC): Accelerate Your Cutting-edge Machine Learning Research With Free Cloud TPUs”, TRC 2023
- “Meditations on Moloch”
- “Humans Who Are Not Concentrating Are Not General Intelligences”
- “This Is the OpenAI API. It Makes Spookily Good Twitter Bots. 13⁄10 Would Retweet”
- “AlphaStar: Mastering the Real-Time Strategy Game StarCraft II”
- “Interpreting GPT: the Logit Lens”
- “A Robot Wrote This Entire Article. Are You Scared Yet, Human? We Asked GPT-3, OpenAI’s Powerful New Language Generator, to Write an Essay for Us from Scratch. The Assignment? To Convince Us Robots Come in Peace | For More about GPT-3 and How This Essay Was Written and Edited, Please Read Our Editor’s Note Below”
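Two of the API notes in the list above gesture at techniques worth making concrete. First, the “Math: OpenAI API…” link describes few-shot “priming”: prepending worked examples to the prompt so the model continues the pattern instead of emitting near-random digits. A minimal sketch against the 2020-era completions endpoint, using the legacy (pre-v1.0) `openai` Python client; the engine name, example problems, and settings are illustrative assumptions, not taken from the linked note:

```python
import os

import openai  # legacy (pre-v1.0) OpenAI Python client

openai.api_key = os.environ["OPENAI_API_KEY"]

# Few-shot "priming" prompt: a couple of worked examples teach the model the
# task format, so its answer to the final question is correct far more often
# than if the bare question were asked with no examples at all.
prompt = """Q: What is 12 + 7?
A: 19

Q: What is 45 + 38?
A: 83

Q: What is 23 + 54?
A:"""

response = openai.Completion.create(
    engine="davinci",   # base GPT-3 engine from the 2020 API beta
    prompt=prompt,
    max_tokens=4,
    temperature=0.0,    # greedy decoding: take the most probable digits
    stop="\n",          # cut the completion off after the one-line answer
)
print(response.choices[0].text.strip())  # ideally "77"
```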
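Second, the “Controlling GPT-3 With Logit Bias” link refers to the API’s `logit_bias` parameter, which adds a constant to the logits of chosen BPE tokens before each sampling step (−100 effectively bans a token, +100 effectively forces it). A hedged sketch of AI-Dungeon-style word banning with the same legacy client; the banned words and bias values here are made up for illustration and are not AI Dungeon’s actual configuration:

```python
import os

import openai    # legacy (pre-v1.0) OpenAI Python client
import tiktoken  # BPE tokenizer matching the GPT-2/GPT-3 vocabulary

openai.api_key = os.environ["OPENAI_API_KEY"]
enc = tiktoken.get_encoding("r50k_base")  # encoding used by the GPT-3 engines

# Ban a word by down-weighting every token it encodes to, including the
# leading-space and capitalized variants, which are distinct BPE tokens.
banned_words = ["dragon", " dragon", "Dragon", " Dragon"]  # illustrative only
logit_bias = {
    str(token_id): -100  # -100 effectively removes the token from sampling
    for word in banned_words
    for token_id in enc.encode(word)
}

response = openai.Completion.create(
    engine="davinci",
    prompt="You enter the cave and see a",
    max_tokens=30,
    temperature=0.8,
    logit_bias=logit_bias,  # applied to the logits before each sampling step
)
print(response.choices[0].text)
```

Note that a multi-token word must have each constituent token biased, which also suppresses unrelated words sharing those tokens; this bluntness is why logit bias works best on short, single-token targets.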
Miscellaneous
- 2020-04-18-gpt2-117m-midi-samples.txt
- 2019-12-22-gpt2-preferencelearning-gwern-abcmusic.patch
- 2019-12-18-gpt21.5b-poetry-samples-topp080.txt
- 2019-12-16-gpt21.5b-poetry-samples-topp080.txt
- 2019-12-15-gpt21.5b-poetry-samples-topp090.txt
- 2019-12-13-gpt21.5b-poetry-samples-topp090.txt
- 2019-12-09-gpt2-abccombined-samples-top_p0.95.txt
- 2019-12-04-gpt2-abc-alldata.tar.xz
- 2019-11-09-ynbollanbane.txt
- 2019-11-09-gpt2-nospaces-samples.txt
- 2019-11-09-gpt2-nospaces-samples-top_p0.99.txt
- 2019-11-08-gpt2-nospaces-samples.txt
- 2019-10-23-gwern-gpt2-folkrnn-irishmusic-samples.txt
- 2019-10-19-117m-poetryfoundation-samples.txt
- 2019-07-22-gpt2-345m-taotehching-all-ch181.tar.xz
- 2019-07-21-taotehching-all-1ksamples.txt
- 2019-07-19-taotehching-ch1-1ksamples.txt
- 2019-05-13-gpt2-poetry-345m-5000samples.txt
- 2019-03-06-gpt2-poetry-prefix-1000samples.txt
- 2019-03-06-gpt2-poetry-1000samples.txt
- 2016-03-27-rnn-metadata-samples.txt
- 2016-03-27-rnn-metadata-samples-all.txt
- 2015-06-03-karpathy-charrnn-visualization.tar.xz
- “/idea”
- http://nautil.us/issue/52/the-hive/your-next-new-best-friend-might-be-a-robot-rp
- https://analyticsindiamag.com/when-chatgpt-attempted-upsc-exam/
- https://nitter.moomoo.me/woj_zaremba/status/1191773448999034880
- https://old.reddit.com/r/GPT3/comments/tgud2t/my_new_favorite_thing_is_making_gpt3_create/
- https://old.reddit.com/r/MachineLearning/comments/v42pej/p_this_is_the_worst_ai_ever_gpt4chan_model/
- https://www.forefront.ai/blog-posts/how-to-fine-tune-gpt-neox
- https://www.lesswrong.com/posts/EzuBSASuui5qekhLA/assessing-alephalphas-multimodal-model
- https://www.lesswrong.com/posts/PDLfpRwSynu73mxGw/basic-facts-about-language-model-internals-1
- https://www.lesswrong.com/posts/yZb5eFvDoaqB337X5/investigating-causal-understanding-in-llms
- https://www.lesswrong.com/posts/ydeaHqDPJ5REJWvat/a-one-question-turing-test-for-gpt-3
- https://www.sfchronicle.com/projects/2021/jessica-simulation-artificial-intelligence/
Link Bibliography
- https://nolanoorg.substack.com/p/int-4-llama-is-not-enough-int-3-and: “Int-4 LLaMa Is Not Enough—Int-3 and Beyond: More Compression, Easier to Build Apps on LLMs That Run Locally”, nolano.org
- https://osf.io/5uxra/: “Beyond the Pass Mark: the Accuracy of ChatGPT and Bing in the National Medical Licensure Examination in Japan”, Yuki Kataoka
- https://arxiv.org/abs/2302.05981: “MarioGPT: Open-Ended Text2Level Generation through Large Language Models”, Shyam Sudhakaran, Miguel González-Duque, Claire Glanois, Matthias Freiberger, Elias Najarro, Sebastian Risi
- https://arxiv.org/abs/2302.06476: “Is ChatGPT a General-Purpose Natural Language Processing Task Solver?”, Chengwei Qin, Aston Zhang, Zhuosheng Zhang, Jiaao Chen, Michihiro Yasunaga, Diyi Yang
- https://www.forbes.com/sites/alexkonrad/2023/02/03/exclusive-openai-sam-altman-chatgpt-agi-google-search/: “OpenAI’s Sam Altman Talks ChatGPT And How Artificial General Intelligence Can ‘Break Capitalism’”, Alex Konrad, Kenrick Cai
- https://arxiv.org/abs/2302.00560: “Co-Writing With Opinionated Language Models Affects Users’ Views”, Maurice Jakesch, Advait Bhat, Daniel Buschek, Lior Zalmanson, Mor Naaman
- https://arxiv.org/abs/2301.04408: “GPT-3 As Knowledge Worker: A Zero-Shot Evaluation of (AI)CPA Capabilities”, Jillian Bommarito, Michael Bommarito, Daniel Martin Katz, Jessica Katz
- https://arxiv.org/abs/2212.14402: “GPT-3 Takes the Bar Exam”, Michael Bommarito II, Daniel Martin Katz
- https://www.nytimes.com/2022/12/21/technology/ai-chatgpt-google-search.html: “A New Chat Bot Is a ‘Code Red’ for Google’s Search Business: A New Wave of Chat Bots like ChatGPT Use Artificial Intelligence That Could Reinvent or Even Replace the Traditional Internet Search Engine”, Nico Grant, Cade Metz
- https://arxiv.org/abs/2212.10496: “Precise Zero-Shot Dense Retrieval without Relevance Labels”, Luyu Gao, Xueguang Ma, Jimmy Lin, Jamie Callan
- https://techcrunch.com/2022/11/23/harvey-which-uses-ai-to-answer-legal-questions-lands-cash-from-openai/: “Harvey, Which Uses AI to Answer Legal Questions, Lands Cash from OpenAI”, Kyle Wiggers
- https://arxiv.org/abs/2211.10438: “SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models”, Guangxuan Xiao, Ji Lin, Mickael Seznec, Julien Demouth, Song Han
- https://arxiv.org/abs/2211.09800: “InstructPix2Pix: Learning to Follow Image Editing Instructions”, Tim Brooks, Aleksander Holynski, Alexei A. Efros
- https://arxiv.org/abs/2210.17323: “GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers”, Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- https://arxiv.org/abs/2210.14140: “Contrastive Search Is What You Need For Neural Text Generation”, Yixuan Su, Nigel Collier
- https://arxiv.org/abs/2210.13673#nvidia: “Evaluating Parameter Efficient Learning for Generation”
- https://arxiv.org/abs/2210.10341#microsoft: “BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining”, Renqian Luo, Liai Sun, Yingce Xia, Tao Qin, Sheng Zhang, Hoifung Poon, Tie-Yan Liu
- https://arxiv.org/abs/2210.15458#google: “Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models”, Luke Vilnis, Yury Zemlyanskiy, Patrick Murray, Alexandre Passos, Sumit Sanghai
- https://arxiv.org/abs/2210.06423#microsoft: “Foundation Transformers”
- https://arxiv.org/abs/2210.02441: “Ask Me Anything (AMA): A Simple Strategy for Prompting Language Models”
- https://www.nature.com/articles/s41598-022-20460-9: “Deep Language Algorithms Predict Semantic Comprehension from Brain Activity”, Charlotte Caucheteux, Alexandre Gramfort, Jean-Rémi King
- https://arxiv.org/abs/2209.03320: “What Does a Platypus Look Like? Generating Customized Prompts for Zero-shot Image Classification (CuPL)”, Sarah Pratt, Rosanne Liu, Ali Farhadi
- https://www.anthropic.com/red_teaming.pdf: “Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned”
- https://arxiv.org/abs/2208.01066: “What Can Transformers Learn In-Context? A Case Study of Simple Function Classes”, Shivam Garg, Dimitris Tsipras, Percy Liang, Gregory Valiant
- https://arxiv.org/abs/2207.04429: “LM-Nav: Robotic Navigation With Large Pre-Trained Models of Language, Vision, and Action”, Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- https://arxiv.org/abs/2206.01861#microsoft: “ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers”, Zhewei Yao, Reza Yazdani Aminabadi, Minjia Zhang, Xiaoxia Wu, Conglong Li, Yuxiong He
- https://arxiv.org/abs/2205.14135: “FlashAttention: Fast and Memory-Efficient Exact Attention With IO-Awareness”, Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- https://arxiv.org/abs/2205.12910#allen: “NaturalProver: Grounded Mathematical Proof Generation With Language Models”, Sean Welleck, Jiacheng Liu, Ximing Lu, Hannaneh Hajishirzi, Yejin Choi
- https://www.nature.com/articles/s41593-022-01026-4: “Shared Computational Principles for Language Processing in Humans and Deep Language Models”
- https://arxiv.org/abs/2110.04627#google: “Vector-quantized Image Modeling With Improved VQGAN”
- 2022-liu.pdf: “Quantifying and Alleviating Political Bias in Language Models”, Ruibo Liu, Chenyan Jia, Jason Wei, Guangxuan Xu, Soroush Vosoughi
- https://arxiv.org/abs/2202.12837#facebook: “Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?”, Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer
- https://www.nature.com/articles/s42003-022-03036-1: “Brains and Algorithms Partially Converge in Natural Language Processing”, Charlotte Caucheteux, Jean-Rémi King
- https://arxiv.org/abs/2201.11990#microsoftnvidia: “Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model”
- https://swabhs.com/assets/pdf/wanli.pdf#allen: “WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation”, Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
- https://arxiv.org/abs/2201.05320#allen: “CommonsenseQA 2.0: Exposing the Limits of AI through Gamification”, Alon Talmor, Ori Yoran, Ronan Le Bras, Chandra Bhagavatula, Yoav Goldberg, Yejin Choi, Jonathan Berant
- 2022-tu.pdf: “Limits of Using Artificial Intelligence and GPT-3 in Patent Prosecution”, Sean Tu, Amy Cyphert, Sam Perl
- https://arxiv.org/abs/2112.04426#deepmind: “Improving Language Models by Retrieving from Trillions of Tokens”
- https://arxiv.org/abs/2112.00861#anthropic: “A General Language Assistant As a Laboratory for Alignment”
- https://openreview.net/forum?id=gJcEM8sxHK: “Mapping Language Models to Grounded Conceptual Spaces”, Roma Patel, Ellie Pavlick
- https://arxiv.org/abs/2111.09734: “ClipCap: CLIP Prefix for Image Captioning”, Ron Mokady, Amir Hertz, Amit H. Bermano
- https://aclanthology.org/2021.mrqa-1.7.pdf: “What Can a Generative Language Model Answer About a Passage?”, Douglas Summers-Stay, Claire Bonial, Clare Voss
- https://arxiv.org/abs/2111.02570#microsoft: “CLUES: Few-Shot Learning Evaluation in Natural Language Understanding”
- https://arxiv.org/abs/2110.11309: “Fast Model Editing at Scale”, Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- https://arxiv.org/abs/2109.07958: “TruthfulQA: Measuring How Models Mimic Human Falsehoods”, Stephanie Lin, Jacob Hilton, Owain Evans
- https://arxiv.org/abs/2109.02593#allen: “General-Purpose Question-Answering With Macaw”, Oyvind Tafjord, Peter Clark
- https://arxiv.org/abs/2107.01294#allen: “Scarecrow: A Framework for Scrutinizing Machine Text”, Yao Dou, Maxwell Forbes, Rik Koncel-Kedziorski, Noah A. Smith, Yejin Choi
- https://arxiv.org/abs/2106.09685: “LoRA: Low-Rank Adaptation of Large Language Models”, Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen
- https://psyarxiv.com/m6s28/: “Let the Algorithm Speak: How to Use Neural Networks for Automatic Item Generation in Psychological Scale Development”, Friedrich Götz, Rakoen Maertens, Sander van der Linden
- https://arxiv.org/abs/2106.06981: “RASP: Thinking Like Transformers”, Gail Weiss, Yoav Goldberg, Eran Yahav
- https://arankomatsuzaki.wordpress.com/2021/06/04/gpt-j/: “GPT-J-6B: 6B JAX-Based Transformer”, EleutherAI
- https://arxiv.org/abs/2106.00958#openai: “LHOPT: A Generalizable Approach to Learning Optimizers”, Diogo Almeida, Clemens Winter, Jie Tang, Wojciech Zaremba
- https://arxiv.org/abs/2105.13626#google: “ByT5: Towards a Token-free Future With Pre-trained Byte-to-byte Models”, Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang, Mihir Kale, Adam Roberts, Colin Raffel
- http://m.koreaherald.com/view.php?ud=20210525000824#naver: “Naver Unveils First ‘hyperscale’ AI Platform”, Kang Jae-eun
- scaling: “Machine Learning Scaling”, Gwern Branwen
- https://arxiv.org/abs/2102.13019: “Investigating the Limitations of the Transformers With Simple Arithmetic Tasks”, Rodrigo Nogueira, Zhiying Jiang, Jimmy Li
- https://arxiv.org/abs/2101.00190: “Prefix-Tuning: Optimizing Continuous Prompts for Generation”, Xiang Lisa Li, Percy Liang
- https://arxiv.org/abs/2101.00027#eleutherai: “The Pile: An 800GB Dataset of Diverse Text for Language Modeling”
- https://aclanthology.org/2021.naacl-main.235.pdf#facebook: “Bot-Adversarial Dialogue for Safe Conversational Agents”, Jing Xu, Da Ju, Margaret Li, Y-Lan Boureau, Jason Weston, Emily Dinan
- https://arxiv.org/abs/2010.14701#openai: “Scaling Laws for Autoregressive Generative Modeling”
- https://arxiv.org/abs/2009.03393#openai: “Generative Language Modeling for Automated Theorem Proving”, Stanislas Polu, Ilya Sutskever
- https://arxiv.org/abs/2009.03300: “MMLU: Measuring Massive Multitask Language Understanding”, Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt
- attention: “Efficient Attention: Breaking The Quadratic Transformer Bottleneck”, Gwern Branwen
- scaling-hypothesis: “The Scaling Hypothesis”, Gwern Branwen
- https://arxiv.org/abs/2004.10802: “Scaling Laws from the Data Manifold Dimension”, Utkarsh Sharma, Jared Kaplan
- https://www.newsweek.com/openai-text-generator-gpt-2-video-game-walkthrough-most-tedious-1488334: “OpenAI Text Generator GPT-2 Creates Video Game Walkthrough for ‘Most Tedious Game in History’”, Andrew Whalen
- https://arxiv.org/abs/2001.08361#openai: “Scaling Laws for Neural Language Models”
- https://arxiv.org/abs/2001.04451#google: “Reformer: The Efficient Transformer”, Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya
- https://eng.uber.com/pplm/: “Controlling Text Generation With Plug and Play Language Models”, Rosanne Liu, Sumanth Dathathri, Andrea Madotto, Piero Molino, Jason Yosinski
- https://play.aidungeon.io/main/home: “AI Dungeon 2”, Nick Walton
- gpt-2-music: “GPT-2 Folk Music”, Gwern Branwen, Shawn Presser
- https://arxiv.org/abs/1909.08053#nvidia: “Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism”, Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, Bryan Catanzaro
- https://arxiv.org/abs/1909.05858#salesforce: “CTRL: A Conditional Transformer Language Model For Controllable Generation”, Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, Caiming Xiong, Richard Socher (Salesforce)
- https://minimaxir.com/2019/09/howto-gpt2/: “How To Make Custom AI-Generated Text With GPT-2”, Max Woolf
- https://medium.com/@vanya_cohen/opengpt-2-we-replicated-gpt-2-because-you-can-too-45e34e6d36dc: “OpenGPT-2: We Replicated GPT-2-1.5b Because You Can Too”, Aaron Gokaslan, Vanya Cohen
- https://nv-adlr.github.io/MegatronLM: “MegatronLM: Training Billion+ Parameter Language Models Using GPU Model Parallelism”, NVIDIA ADLR
- https://medium.com/@NPCollapse/replicating-gpt2-1-5b-86454a7f26af: “Replicating GPT-2-1.5B”, Connor Leahy
- https://openai.com/blog/musenet/: “MuseNet: a Deep Neural Network That Can Generate 4-minute Musical Compositions With 10 Different Instruments, and Can Combine Styles from Country to Mozart to the Beatles”, Christine Payne
- https://openai.com/blog/better-language-models/: “Better Language Models and Their Implications”, Alec Radford, Jeffrey Wu, Dario Amodei, Daniela Amodei, Jack Clark, Miles Brundage, Ilya Sutskever
- https://magenta.tensorflow.org/music-transformer: “Music Transformer: Generating Music With Long-Term Structure”, Cheng-Zhi Anna Huang, Ian Simon, Monica Dinculescu
- https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf#page=5: “GPT-1: Improving Language Understanding by Generative Pre-Training § Model Specifications”, Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever
- https://paulfchristiano.com/: “Homepage of Paul F. Christiano”, Paul F. Christiano