“‘GPT’ Tag”, 2019-12-13:
Bibliography for tag ai/nn/transformer/gpt, most recent first: 34 related tags, 299 annotations, & 195 links.
- See Also
- Gwern
- Links
- “Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts?”, et al 2024
- “Medical Adaptation of Large Language and Vision-Language Models: Are We Making Progress?”, et al 2024
- “Model Equality Testing: Which Model Is This API Serving?”, et al 2024
- “Centaur: a Foundation Model of Human Cognition”, et al 2024
- “Do LLMs Estimate Uncertainty Well in Instruction-Following?”, et al 2024
- “Interpretable Contrastive Monte Carlo Tree Search Reasoning”, et al 2024
- “NGPT: Normalized Transformer With Representation Learning on the Hypersphere”, et al 2024
- “LLM Applications I Want To See”, 2024
- “Token Erasure As a Footprint of Implicit Vocabulary Items in LLMs”, et al 2024
- “Resolving Discrepancies in Compute-Optimal Scaling of Language Models”, et al 2024
- “When Parts Are Greater Than Sums: Individual LLM Components Can Outperform Full Models”, et al 2024
- “Nemotron-4 340B Technical Report”, et al 2024
- “DataComp-LM: In Search of the next Generation of Training Sets for Language Models”, et al 2024
- “How Do Large Language Models Acquire Factual Knowledge During Pretraining?”, et al 2024
- “Be like a Goldfish, Don’t Memorize! Mitigating Memorization in Generative LLMs”, et al 2024
- “Discovering Preference Optimization Algorithms With and for Large Language Models”, et al 2024
- “MCTSr: Accessing GPT-4 Level Mathematical Olympiad Solutions via Monte Carlo Tree Self-Refine With LLaMA-3-8B”, et al 2024
- “For Chinese Students, the New Tactic Against AI Checks: More AI”, 2024
- “MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series”, et al 2024
- “Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass”, et al 2024
- “SpaceByte: Towards Deleting Tokenization from Large Language Modeling”, 2024
- “Towards Smaller, Faster Decoder-Only Transformers: Architectural Variants and Their Implications”, Suresh & P 2024
- “Design of Highly Functional Genome Editors by Modeling the Universe of CRISPR-Cas Sequences”, et al 2024
- “From r to Q✱: Your Language Model Is Secretly a Q-Function”, et al 2024
- “CATS: Contextually-Aware Thresholding for Sparsity in Large Language Models”, et al 2024
- “CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs’ (Lack Of) Multicultural Knowledge”, et al 2024
- “Training LLMs over Neurally Compressed Text”, et al 2024
- “Reverse Training to Nurse the Reversal Curse”, et al 2024
- “Evolutionary Optimization of Model Merging Recipes”, et al 2024
- “Yi: Open Foundation Models by 01.AI”, et al 2024
- “Actions Speak Louder Than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations (HSTU)”, et al 2024
- “Fast Adversarial Attacks on Language Models In One GPU Minute”, et al 2024
- “Autonomous Data Selection With Language Models for Mathematical Texts”, et al 2024
- “Grandmaster-Level Chess Without Search”, et al 2024
- “Neural Networks Learn Statistics of Increasing Complexity”, et al 2024
- “Arrows of Time for Large Language Models”, et al 2024
- “SliceGPT: Compress Large Language Models by Deleting Rows and Columns”, et al 2024
- “Excuse Me, Sir? Your Language Model Is Leaking (information)”, 2024
- “TinyLlama: An Open-Source Small Language Model”, et al 2024
- “LLaMA Pro: Progressive LLaMA With Block Expansion”, et al 2024
- “Generative AI Is Already Widespread in the Public Sector”, et al 2024
- “Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws”, 2023
- “TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones”, et al 2023
- “Reasons to Reject? Aligning Language Models With Judgments”, et al 2023
- “Generative Multimodal Models Are In-Context Learners”, et al 2023
- “Frugal LMs Trained to Invoke Symbolic Solvers Achieve Parameter-Efficient Arithmetic Reasoning”, et al 2023
- “Object Recognition As Next Token Prediction”, et al 2023
- “MEDITRON-70B: Scaling Medical Pretraining for Large Language Models”, et al 2023
- “Localizing Lying in Llama: Understanding Instructed Dishonesty on True-False Questions Through Prompting, Probing, and Patching”, et al 2023
- “OpenAI Researchers Warned Board of AI Breakthrough ahead of CEO Ouster, Sources Say”, et al 2023
- “Positional Description Matters for Transformers Arithmetic”, et al 2023
- “Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models”, et al 2023
- “Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game”, et al 2023
- “Learn Your Tokens: Word-Pooled Tokenization for Language Modeling”, et al 2023
- “Llemma: An Open Language Model For Mathematics”, et al 2023
- “In-Context Pretraining (ICP): Language Modeling Beyond Document Boundaries”, et al 2023
- “OSD: Online Speculative Decoding”, et al 2023
- “Let Models Speak Ciphers: Multiagent Debate through Embeddings”, et al 2023
- “OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text”, et al 2023
- “XVal: A Continuous Number Encoding for Large Language Models”, et al 2023
- “MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models”, et al 2023
- “Language Modeling Is Compression”, et al 2023
- “Sparse Autoencoders Find Highly Interpretable Features in Language Models”, et al 2023
- “Anchor Points: Benchmarking Models With Much Fewer Examples”, et al 2023
- “When Less Is More: Investigating Data Pruning for Pretraining LLMs at Scale”, et al 2023
- “Language Reward Modulation for Pretraining Reinforcement Learning”, et al 2023
- “ReST: Reinforced Self-Training (ReST) for Language Modeling”, et al 2023
- “Studying Large Language Model Generalization With Influence Functions”, et al 2023
- “Multimodal Neurons in Pretrained Text-Only Transformers”, et al 2023
- “Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models”, et al 2023
- “Length Generalization in Arithmetic Transformers”, et al 2023
- “Are Aligned Neural Networks Adversarially Aligned?”, et al 2023
- “Improving Long-Horizon Imitation Through Instruction Prediction”, et al 2023
- “Large Language Models Sometimes Generate Purely Negatively-Reinforced Text”, 2023
- “SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression”, et al 2023
- “Undetectable Watermarks for Language Models”, et al 2023
- “Improving Language Models With Advantage-Based Offline Policy Gradients”, et al 2023
- “Accelerating Transformer Inference for Translation via Parallel Decoding”, et al 2023
- “DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining”, et al 2023
- “Memorization for Good: Encryption With Autoregressive Language Models”, 2023
- “MEGABYTE: Predicting Million-Byte Sequences With Multiscale Transformers”, et al 2023
- “Finding Neurons in a Haystack: Case Studies With Sparse Probing”, et al 2023
- “Inflection AI, Startup From Ex-DeepMind Leaders, Launches Pi—A Chattier Chatbot”, 2023
- “Emergent and Predictable Memorization in Large Language Models”, et al 2023
- “A Comparative Study between Full-Parameter and LoRA-Based Fine-Tuning on Chinese Instruction Data for Instruction Following Large Language Model”, et al 2023
- “Shall We Pretrain Autoregressive Language Models With Retrieval? A Comprehensive Study”, et al 2023
- “How Large-Language Models Can Revolutionize Military Planning”, 2023
- “Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling”, et al 2023
- “8 Things to Know about Large Language Models”, 2023
- “BloombergGPT: A Large Language Model for Finance”, et al 2023
- “The Quantization Model of Neural Scaling”, et al 2023
- “Int-4 LLaMa Is Not Enough—Int-3 and Beyond: More Compression, Easier to Build Apps on LLMs That Run Locally”, nolano.org 2023
- “Consistency Analysis of ChatGPT”, 2023
- “Rewarding Chatbots for Real-World Engagement With Millions of Users”, et al 2023
- “Beyond the Pass Mark: the Accuracy of ChatGPT and Bing in the National Medical Licensure Examination in Japan”, 2023
- “SpikeGPT: Generative Pre-Trained Language Model With Spiking Neural Networks”, et al 2023
- “A Prompt Pattern Catalog to Enhance Prompt Engineering With ChatGPT”, et al 2023
- “BiLD: Big Little Transformer Decoder”, et al 2023
- “Data Selection for Language Models via Importance Resampling”, et al 2023
- “In-Context Retrieval-Augmented Language Models”, et al 2023
- “Crawling the Internal Knowledge-Base of Language Models”, et al 2023
- “Big Tech Was Moving Cautiously on AI. Then Came ChatGPT. Google, Facebook and Microsoft Helped Build the Scaffolding of AI. Smaller Companies Are Taking It to the Masses, Forcing Big Tech to React”, et al 2023
- “Rock Guitar Tablature Generation via Natural Language Processing”, Casco-Rodriguez 2023
- “InPars-Light: Cost-Effective Unsupervised Training of Efficient Rankers”, et al 2023
- “A New Chat Bot Is a ‘Code Red’ for Google’s Search Business: A New Wave of Chat Bots like ChatGPT Use Artificial Intelligence That Could Reinvent or Even Replace the Traditional Internet Search Engine”, 2022
- “Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent As Meta-Optimizers”, et al 2022
- “Rethinking the Role of Scale for In-Context Learning: An Interpretability-Based Case Study at 66 Billion Scale”, et al 2022
- “Interpreting Neural Networks through the Polytope Lens”, et al 2022
- “SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models”, et al 2022
- “InstructPix2Pix: Learning to Follow Image Editing Instructions”, et al 2022
- “Galactica: A Large Language Model for Science”, et al 2022
- “Large Language Models Struggle to Learn Long-Tail Knowledge”, et al 2022
- “The CRINGE Loss: Learning What Language Not to Model”, et al 2022
- “Mysteries of Mode Collapse § Inescapable Wedding Parties”, 2022
- “GPTQ: Accurate Post-Training Quantization for Generative Pre-Trained Transformers”, et al 2022
- “What Is My Math Transformer Doing? – 3 Results on Interpretability and Generalization”, 2022
- “When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good Labels”, et al 2022
- “Can Language Models Handle Recursively Nested Grammatical Structures? A Case Study on Comparing Models and Humans”, 2022
- “Evaluating Parameter Efficient Learning for Generation”, et al 2022
- “BioGPT: Generative Pre-Trained Transformer for Biomedical Text Generation and Mining”, et al 2022
- “Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models”, et al 2022
- “MTEB: Massive Text Embedding Benchmark”, et al 2022
- “Foundation Transformers”, et al 2022
- “Ask Me Anything (AMA): A Simple Strategy for Prompting Language Models”, et al 2022
- “Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization”, et al 2022
- “Sparrow: Improving Alignment of Dialogue Agents via Targeted Human Judgements”, et al 2022
- “Generate rather than Retrieve (GenRead): Large Language Models Are Strong Context Generators”, et al 2022
- “FP8 Formats for Deep Learning”, et al 2022
- “Petals: Collaborative Inference and Fine-Tuning of Large Models”, et al 2022
- “LLM.int8(): 8-Bit Matrix Multiplication for Transformers at Scale”, et al 2022
- “Meaning without Reference in Large Language Models”, 2022
- “Effidit: Your AI Writing Assistant”, et al 2022
- “Language Models Show Human-Like Content Effects on Reasoning”, et al 2022
- “LM-Nav: Robotic Navigation With Large Pre-Trained Models of Language, Vision, and Action”, et al 2022
- “Can Foundation Models Talk Causality?”, et al 2022
- “NOAH: Neural Prompt Search”, et al 2022
- “ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers”, et al 2022
- “Quark: Controllable Text Generation With Reinforced Unlearning”, et al 2022
- “RankGen: Improving Text Generation With Large Ranking Models”, et al 2022
- “Opal: Multimodal Image Generation for News Illustration”, et al 2022
- “What Language Model to Train If You Have One Million GPU Hours?”, et al 2022
- “WAVPROMPT: Towards Few-Shot Spoken Language Understanding With Frozen Language Models”, et al 2022
- “Shared Computational Principles for Language Processing in Humans and Deep Language Models”, et al 2022
- “Vector-Quantized Image Modeling With Improved VQGAN”, et al 2022
- “Brains and Algorithms Partially Converge in Natural Language Processing”, 2022
- “Quantifying Memorization Across Neural Language Models”, et al 2022
- “A Contrastive Framework for Neural Text Generation”, et al 2022
- “AdaPrompt: Adaptive Model Training for Prompt-Based NLP”, et al 2022
- “InPars: Data Augmentation for Information Retrieval Using Large Language Models”, et al 2022
- “ROME: Locating and Editing Factual Associations in GPT”, et al 2022
- “Cedille: A Large Autoregressive French Language Model”, 2022
- “Data Scaling Laws in NMT: The Effect of Noise and Architecture”, et al 2022
- “PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts”, et al 2022
- “Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model”, et al 2022
- “Language Models As Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents”, et al 2022
- “WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation”, et al 2022
- “A Survey of Controllable Text Generation Using Transformer-Based Pre-Trained Language Models”, et al 2022
- “The Defeat of the Winograd Schema Challenge”, et al 2022
- “Learning To Retrieve Prompts for In-Context Learning”, et al 2021
- “Learning to Prompt for Continual Learning”, et al 2021
- “Amortized Noisy Channel Neural Machine Translation”, et al 2021
- “Few-Shot Instruction Prompts for Pretrained Language Models to Detect Social Biases”, et al 2021
- “PROMPT WAYWARDNESS: The Curious Case of Discretized Interpretation of Continuous Prompts”, et al 2021
- “LMTurk: Few-Shot Learners As Crowdsourcing Workers”, et al 2021
- “Improving Language Models by Retrieving from Trillions of Tokens”, et al 2021
- “Linear Algebra With Transformers”, 2021
- “Zero-Shot Image-To-Text Generation for Visual-Semantic Arithmetic”, et al 2021
- “Long-Range and Hierarchical Language Predictions in Brains and Algorithms”, et al 2021
- “True Few-Shot Learning With Prompts—A Real-World Perspective”, Schick & Schütze 2021
- “Few-Shot Named Entity Recognition With Cloze Questions”, et al 2021
- “Evaluating Distributional Distortion in Neural Language Modeling”, 2021
- “On Transferability of Prompt Tuning for Natural Language Understanding”, et al 2021
- “CLUES: Few-Shot Learning Evaluation in Natural Language Understanding”, et al 2021
- “Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey”, et al 2021
- “Fast Model Editing at Scale”, et al 2021
- “Yuan 1.0: Large-Scale Pre-Trained Language Model in Zero-Shot and Few-Shot Learning”, et al 2021
- “Towards a Unified View of Parameter-Efficient Transfer Learning”, et al 2021
- “A Few More Examples May Be Worth Billions of Parameters”, et al 2021
- “Scaling Laws for Neural Machine Translation”, et al 2021
- “Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color”, et al 2021
- “What Changes Can Large-Scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-Scale Korean Generative Pretrained Transformers”, et al 2021
- “Medically Aware GPT-3 As a Data Generator for Medical Dialogue Summarization”, et al 2021
- “General-Purpose Question-Answering With Macaw”, 2021
- “An Empirical Exploration in Quality Filtering of Text Data”, 2021
- “Want To Reduce Labeling Cost? GPT-3 Can Help”, et al 2021
- “Multimodal Few-Shot Learning With Frozen Language Models”, et al 2021
- “Cutting Down on Prompts and Parameters: Simple Few-Shot Learning With Language Models”, Logan IV et al 2021
- “RASP: Thinking Like Transformers”, et al 2021
- “ByT5: Towards a Token-Free Future With Pre-Trained Byte-To-Byte Models”, et al 2021
- “Anthropic Raises $124 Million to Build More Reliable, General AI Systems”, 2021
- “Naver Unveils First ‘Hyperscale’ AI Platform”, 2021
- “Scaling Laws for Language Transfer Learning”, 2021
- “GPT Understands, Too”, et al 2021
- “How Many Data Points Is a Prompt Worth?”, 2021
- “Pretrained Transformers As Universal Computation Engines”, et al 2021
- “Language Models Have a Moral Dimension”, et al 2021
- “Learning Chess Blindfolded: Evaluating Language Models on State Tracking”, et al 2021
- “Investigating the Limitations of the Transformers With Simple Arithmetic Tasks”, et al 2021
- “Proof Artifact Co-Training for Theorem Proving With Language Models”, et al 2021
- “Clinical Outcome Prediction from Admission Notes Using Self-Supervised Knowledge Integration”, et al 2021
- “Scaling Laws for Transfer”, et al 2021
- “MAUVE: Measuring the Gap Between Neural Text and Human Text Using Divergence Frontiers”, et al 2021
- “Apparently ‘What Ho’ Is a Corruption Of…”, 2021
- “Making Pre-Trained Language Models Better Few-Shot Learners”, et al 2020
- “Thinking Ahead: Prediction in Context As a Keystone of Language in Humans and Machines”, et al 2020
- “CPM: A Large-Scale Generative Chinese Pre-Trained Language Model”, et al 2020
- “L2L: Training Large Neural Networks With Constant Memory Using a New Execution Algorithm”, et al 2020
- “Summarize, Outline, and Elaborate: Long-Text Generation via Hierarchical Supervision from Extractive Summaries”, et al 2020
- “The Neural Architecture of Language: Integrative Reverse-Engineering Converges on a Model for Predictive Processing”, et al 2020
- “RoFT: A Tool for Evaluating Human Detection of Machine-Generated Text”, et al 2020
- “A Systematic Characterization of Sampling Algorithms for Open-Ended Language Generation”, et al 2020
- “Generative Language Modeling for Automated Theorem Proving”, 2020
- “Learning to Summarize from Human Feedback”, et al 2020
- “ETHICS: Aligning AI With Shared Human Values”, et al 2020
- “Mirostat: A Neural Text Decoding Algorithm That Directly Controls Perplexity”, et al 2020
- “Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data”, 2020
- “Transformers Are RNNs: Fast Autoregressive Transformers With Linear Attention”, et al 2020
- “OpenAI API Beta Homepage”, OpenAI 2020
- “Trading Off Diversity and Quality in Natural Language Generation”, et al 2020
- “Scaling Laws from the Data Manifold Dimension”, 2020
- “Unigram LM: Byte Pair Encoding Is Suboptimal for Language Model Pretraining”, 2020
- “Direct Fit to Nature: An Evolutionary Perspective on Biological and Artificial Neural Networks”, et al 2020
- “Pop Music Transformer: Beat-Based Modeling and Generation of Expressive Pop Piano Compositions”, 2020
- “Scaling Laws for Neural Language Models”, et al 2020
- “Reformer: The Efficient Transformer”, et al 2020
- “What Does BERT Dream Of? A Visual Investigation of Nightmares in Sesame Street”, 2020
- “Generative Language Modeling for Automated Theorem Proving § Experiments”, 2020 (page 11)
- “Plug and Play Language Models: A Simple Approach to Controlled Text Generation”, et al 2019
- “How Can We Know What Language Models Know?”, et al 2019
- “CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning”, et al 2019
- “Generalization through Memorization: Nearest Neighbor Language Models”, et al 2019
- “DialoGPT: Large-Scale Generative Pre-Training for Conversational Response Generation”, et al 2019
- “CTRL: A Conditional Transformer Language Model For Controllable Generation”, et al 2019
- “Smaller, Faster, Cheaper, Lighter: Introducing DistilGPT, a Distilled Version of GPT”, 2019
- “Language Modeling State-Of-The-Art Leaderboards”, paperswithcode.com 2019
- “Neural Text Generation With Unlikelihood Training”, et al 2019
- “GROVER: Defending Against Neural Fake News”, et al 2019
- “Generative Modeling With Sparse Transformers: We’ve Developed the Sparse Transformer, a Deep Neural Network Which Sets New Records at Predicting What Comes next in a Sequence—Whether Text, Images, or Sound. It Uses an Algorithmic Improvement of the attention Mechanism to Extract Patterns from Sequences 30× Longer Than Possible Previously”, 2019
- “The Curious Case of Neural Text Degeneration”, et al 2019
- “Smart Vet: Autocompleting Sentences in Veterinary Medical Records”, 2019
- “Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context”, et al 2019
- “Music Transformer: Generating Music With Long-Term Structure”, et al 2018
- “Universal Transformers”, et al 2018
- “Adversarial Reprogramming of Neural Networks”, et al 2018
- “GPT-1: Improving Language Understanding With Unsupervised Learning”, OpenAI 2018
- “GPT-1: Improving Language Understanding by Generative Pre-Training”, et al 2018
- “GPT-1: Improving Language Understanding by Generative Pre-Training § Model Specifications”, et al 2018 (page 5)
- “Deep Reinforcement Learning from Human Preferences § Appendix A.2: Atari”, et al 2017 (page 15)
- “Learning to Generate Reviews and Discovering Sentiment”, et al 2017
- “Design a Role-Playing Game Using 200 Words or Less.”
- “How Does In-Context Learning Work? A Framework for Understanding the Differences from Traditional Supervised Learning”
- “AI Dungeon: Dragon Model Upgrade—You Can Now Play AI Dungeon With One of the Most Powerful AI Models in the World.”
- “Introducing AI Dungeon Translate: AI Dungeon Players Can Now Translate Their Stories into Emojis by Just Clicking a Button. [ 🤔 💯 🤷‍♂️ 🤔 🤔 🤔 💯]”
- “OpenAI API Alchemy: Emoji Storytelling 🤖”
- “Llama-3.1-405B Now Runs at 969 Tokens/s on Cerebras Inference”
- “I Blew $720 on 100 Notebooks from Alibaba and Started a Paper Website Business”
- “AlphaStar: Mastering the Real-Time Strategy Game StarCraft II”
- “Transformers As Variational Autoencoders”
- “BlinkDL/RWKV-LM: RWKV Is an RNN With Transformer-Level LLM Performance. It Can Be Directly Trained like a GPT (parallelizable). So It’s Combining the Best of RNN and Transformer—Great Performance, Fast Inference, Saves VRAM, Fast Training, “Infinite” Ctx_len, and Free Sentence Embedding.”
- “Efficient, Reusable RNNs and LSTMs for Torch”
- “Updated Training?”
- “karpathy/minGPT: A Minimal PyTorch Re-Implementation of the OpenAI GPT (Generative Pretrained Transformer) Training”
- “minimaxir/textgenrnn: Easily Train Your Own Text-Generating Neural Network of Any Size and Complexity on Any Text Dataset With a Few Lines of Code.”
- “Loom: Multiversal Tree Writing Interface for Human-AI Collaboration”, 2024
- “zphang/minimal-opt”
- “Math: OpenAI API Can Do Some Math out of the Gate, but Most Math It Seems It Has to Learn. Many Times, the Numbers That It Spits out Are Just Random. However, including Different Priming Prompts Can Result in Decent Results.”
- “Deep Learning for Assisting the Process of Music Composition (part 3)”
- “Google DeepMind’s Grandmaster-Level Chess Without Search”
- “The Technology Behind BLOOM Training”
- “Psych-101 Dataset [For Centaur]”
- The Gostak
- “Imprompter”
- “Your Next New Best Friend Might Be a Robot”
- “I Made a Custom GPT That Incorporates Advertisement/Product Placement With Its…”
- “The Annotated Transformer”
- “Homepage of Paul F. Christiano”, 2024
- “Data Exfiltration from Slack AI via Indirect Prompt Injection”, PromptArmor 2024
- “Introductory Antimemetics (abandoned First Draft)”, 2024
- “Jared Kaplan”
- “Meditations on Moloch”
- “Stream Seaandsailor”
- “Humans Who Are Not Concentrating Are Not General Intelligences”
- “Monitor: An AI-Driven Observability Interface”
- “This Is the OpenAI API. It Makes Spookily Good Twitter Bots. 13⁄10 Would Retweet”
- “AMA Conjecture, A New Alignment Startup”
- “WikiCrow”
- “ChatGPT As Muse, Not Oracle”, 2024
- “Interpreting GPT: the Logit Lens”
- “Assessing AlephAlpha’s Multimodal Model”
- “Is GPT-3 a Good Rationalist?”
- “We Are Conjecture, A New Alignment Research Startup”
- “Investigating Causal Understanding in LLMs”
- “A One-Question Turing Test for GPT-3”
- “This Mystical Book Was Co-Authored by a Disturbingly Realistic AI”
- “The Guy Behind the Fake AI Halloween Parade Listing Says You’ve Got It All Wrong”
- “Season 1 Ep. 22 OpenAI’s Ilya Sutskever: The Man Who Made AI Work”
- “WELM”
- nickwalton00
- sama
- Sort By Magic
- Wikipedia
- Miscellaneous
- Bibliography