“‘Inner Monologue (AI)’ Tag”, 2019-12-22 (backlinks):
Bibliography for tag ai/nn/transformer/gpt/inner-monologue, most recent first: 6 related tags, 178 annotations, & 110 links (parent).
- See Also
- Gwern
- Links
- “Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models”, et al 2024
- “Mind Your Step (by Step): Chain-Of-Thought Can Reduce Performance on Tasks Where Thinking Makes Humans Worse”, et al 2024
- “Thinking LLMs: General Instruction Following With Thought Generation”, et al 2024
- “When a Language Model Is Optimized for Reasoning, Does It Still Show Embers of Autoregression? An Analysis of OpenAI O1”, et al 2024
- “Evaluation of OpenAI O1: Opportunities and Challenges of AGI”, et al 2024
- “LLMs Still Can’t Plan; Can LRMs? A Preliminary Evaluation of OpenAI’s O1 on PlanBench”, et al 2024
- “Training Language Models to Self-Correct via Reinforcement Learning”, et al 2024
- “To CoT or Not to CoT? Chain-Of-Thought Helps Mainly on Math and Symbolic Reasoning”, et al 2024
- “Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process”, et al 2024
- “Connecting the Dots: LLMs Can Infer and Verbalize Latent Structure from Disparate Training Data”, et al 2024
- “Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?”, et al 2024
- “OlympicArena: Benchmarking Multi-Discipline Cognitive Reasoning for Superintelligent AI”, et al 2024
- “How Far Can Transformers Reason? The Locality Barrier and Inductive Scratchpad”, et al 2024
- “OmegaPRM: Improve Mathematical Reasoning in Language Models by Automated Process Supervision”, et al 2024
- “MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark”, et al 2024
- “A Theoretical Understanding of Self-Correction through In-Context Alignment”, et al 2024
- “Intelligent Go-Explore (IGE): Standing on the Shoulders of Giant Foundation Models”, et al 2024
- “From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step”, et al 2024
- “Retrieval Head Mechanistically Explains Long-Context Factuality”, et al 2024
- “Let’s Think Dot by Dot: Hidden Computation in Transformer Language Models”, et al 2024
- “Autonomous LLM-Driven Research from Data to Human-Verifiable Research Papers”, et al 2024
- “Missed Connections: Lateral Thinking Puzzles for Large Language Models”, et al 2024
- “ChatGPT Can Predict the Future When It Tells Stories Set in the Future About the Past”, 2024
- “Visualization-Of-Thought Elicits Spatial Reasoning in Large Language Models”, et al 2024
- “Do Language Models Plan Ahead for Future Tokens?”, et al 2024
- “FABLES: Evaluating Faithfulness and Content Selection in Book-Length Summarization”, et al 2024
- “Re-Evaluating GPT-4’s Bar Exam Performance”, 2024
- “Long-Form Factuality in Large Language Models”, et al 2024
- “Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking”, et al 2024
- “RNNs Are Not Transformers (Yet): The Key Bottleneck on In-Context Retrieval”, et al 2024
- “Tokenization Counts: the Impact of Tokenization on Arithmetic in Frontier LLMs”, 2024
- “Chain-Of-Thought Empowers Transformers to Solve Inherently Serial Problems”, et al 2024
- “Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models”, et al 2024
- “Why Are Sensitive Functions Hard for Transformers?”, 2024
- “Chain-Of-Thought Reasoning Without Prompting”, 2024
- “V-STaR: Training Verifiers for Self-Taught Reasoners”, et al 2024
- “More Agents Is All You Need”, et al 2024
- “The Impact of Reasoning Step Length on Large Language Models”, et al 2024
- “Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach”, et al 2023
- “Beyond Human Data: Scaling Self-Training for Problem-Solving With Language Models (ReSTEM)”, et al 2023
- “Tree of Attacks (TAP): Jailbreaking Black-Box LLMs Automatically”, et al 2023
- “Universal Self-Consistency for Large Language Model Generation”, et al 2023
- “Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine”, et al 2023
- “Training Chain-Of-Thought via Latent-Variable Inference”, et al 2023
- “Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks”, et al 2023
- “On Measuring Faithfulness or Self-Consistency of Natural Language Explanations”, 2023
- “Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations”, et al 2023
- “Large Language Models Can Strategically Deceive Their Users When Put Under Pressure”, et al 2023
- “Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves”, et al 2023
- “Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation”, et al 2023
- “Implicit Chain-Of-Thought Reasoning via Knowledge Distillation”, et al 2023
- “Preventing Language Models From Hiding Their Reasoning”, 2023
- “Branch-Solve-Merge Improves Large Language Model Evaluation and Generation”, et al 2023
- “Can GPT Models Be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on Mock CFA Exams”, et al 2023
- “The Expressive Power of Transformers With Chain-Of-Thought”, 2023
- “Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models”, et al 2023
- “Large Language Models Cannot Self-Correct Reasoning Yet”, et al 2023
- “Think Before You Speak: Training Language Models With Pause Tokens”, et al 2023
- “Embers of Autoregression: Understanding Large Language Models Through the Problem They Are Trained to Solve”, et al 2023
- “Contrastive Decoding Improves Reasoning in Large Language Models”, 2023
- “Re-Reading Improves Reasoning in Large Language Models”, et al 2023
- “From Sparse to Dense: GPT-4 Summarization With Chain of Density (CoD) Prompting”, et al 2023
- “Graph of Thoughts: Solving Elaborate Problems With Large Language Models”, et al 2023
- “Solving Challenging Math Word Problems Using GPT-4 Code Interpreter With Code-Based Self-Verification”, et al 2023
- “Android in the Wild: A Large-Scale Dataset for Android Device Control”, et al 2023
- “LLMs As Workers in Human-Computational Algorithms? Replicating Crowdsourcing Pipelines With LLMs”, et al 2023
- “TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT”, et al 2023
- “Question Decomposition Improves the Faithfulness of Model-Generated Reasoning”, et al 2023
- “Measuring Faithfulness in Chain-Of-Thought Reasoning”, et al 2023
- “Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration”, et al 2023
- “Explaining Competitive-Level Programming Solutions Using LLMs”, et al 2023
- “Teaching Arithmetic to Small Transformers”, et al 2023
- “Language Models Are Weak Learners”, et al 2023
- “Let’s Do a Thought Experiment: Using Counterfactuals to Improve Moral Reasoning”, et al 2023
- “GKD: Generalized Knowledge Distillation for Auto-Regressive Sequence Models”, et al 2023
- “Large Language Models As Tax Attorneys: A Case Study in Legal Capabilities Emergence”, et al 2023
- “Iterative Translation Refinement With Large Language Models”, et al 2023
- “Thought Cloning: Learning to Think While Acting by Imitating Human Thinking”, 2023
- “Let’s Verify Step by Step”, et al 2023
- “Towards Revealing the Mystery behind Chain-Of-Thought: A Theoretical Perspective”, et al 2023
- “Improving Factuality and Reasoning in Language Models through Multiagent Debate”, et al 2023
- “How Language Model Hallucinations Can Snowball”, et al 2023
- “Tree of Thoughts (ToT): Deliberate Problem Solving With Large Language Models”, et al 2023
- “Large Language Model Programs”, et al 2023
- “Language Models Don’t Always Say What They Think: Unfaithful Explanations in Chain-Of-Thought Prompting”, et al 2023
- “Distilling Step-By-Step! Outperforming Larger Language Models With Less Training Data and Smaller Model Sizes”, et al 2023
- “Decomposition Enhances Reasoning via Self-Evaluation Guided Decoding”, et al 2023
- “LLM+P: Empowering Large Language Models With Optimal Planning Proficiency”, et al 2023
- “Boosting Theory-Of-Mind Performance in Large Language Models via Prompting”, 2023
- “Think Before You Act: Unified Policy for Interleaving Language Reasoning With Actions”, et al 2023
- “Language Models Can Solve Computer Tasks”, et al 2023
- “Reflexion: Language Agents With Verbal Reinforcement Learning”, et al 2023
- “How Well Do Large Language Models Perform in Arithmetic Tasks?”, et al 2023
- “SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models”, et al 2023
- “Language Is Not All You Need: Aligning Perception With Language Models (Kosmos-1)”, et al 2023
- “Multimodal Chain-Of-Thought Reasoning in Language Models”, et al 2023
- “Faithful Chain-Of-Thought Reasoning”, et al 2023
- “Large Language Models Are Versatile Decomposers: Decompose Evidence and Questions for Table-Based Reasoning”, et al 2023
- “ChatGPT Goes to Law School”, et al 2023
- “Large Language Models As Fiduciaries: A Case Study Toward Robustly Communicating With Artificial Intelligence Through Legal Standards”, 2023
- “Interactive-Chain-Prompting (INTERCPT): Ambiguity Resolution for Crosslingual Conditional Generation With Interaction”, et al 2023
- “Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes”, et al 2023
- “Solving Math Word Problems With Process & Outcome-Based Feedback”, et al 2022
- “PAL: Program-Aided Language Models”, et al 2022
- “Measuring Progress on Scalable Oversight for Large Language Models”, et al 2022
- “U-PaLM: Transcending Scaling Laws With 0.1% Extra Compute”, et al 2022
- “Large Language Models Can Self-Improve”, et al 2022
- “Challenging BIG-Bench Tasks (BBH) and Whether Chain-Of-Thought Can Solve Them”, et al 2022
- “Self-Ask: Measuring and Narrowing the Compositionality Gap in Language Models (Bamboogle)”, et al 2022
- “Language Models Are Multilingual Chain-Of-Thought Reasoners”, et al 2022
- “ReAct: Synergizing Reasoning and Acting in Language Models”, et al 2022
- “Dynamic Prompt Learning via Policy Gradient for Semi-Structured Mathematical Reasoning”, et al 2022
- “FOLIO: Natural Language Reasoning With First-Order Logic”, et al 2022
- “Faithful Reasoning Using Large Language Models”, 2022
- “Limitations of Language Models in Arithmetic and Symbolic Induction”, et al 2022
- “Language Models Can Teach Themselves to Program Better”, et al 2022
- “Language Model Cascades”, et al 2022
- “CodeT: Code Generation With Generated Tests”, et al 2022
- “Can Large Language Models Reason about Medical Questions?”, et al 2022
- “Inner Monologue: Embodied Reasoning through Planning With Language Models”, et al 2022
- “Exploring Length Generalization in Large Language Models”, et al 2022
- “Language Models (Mostly) Know What They Know”, et al 2022
- “Solving Quantitative Reasoning Problems With Language Models”, et al 2022
- “Maieutic Prompting: Logically Consistent Reasoning With Recursive Explanations”, et al 2022
- “Large Language Models Are Zero-Shot Reasoners”, et al 2022
- “Instruction Induction: From Few Examples to Natural Language Task Descriptions”, et al 2022
- “Least-To-Most Prompting Enables Complex Reasoning in Large Language Models”, et al 2022
- “Dialog Inpainting: Turning Documents into Dialogues”, et al 2022
- “Unifying Language Learning Paradigms”, et al 2022
- “Can Language Models Learn from Explanations in Context?”, et al 2022
- “Socratic Models: Composing Zero-Shot Multimodal Reasoning With Language”, et al 2022
- “STaR: Bootstrapping Reasoning With Reasoning”, et al 2022
- “A Conversational Paradigm for Program Synthesis”, et al 2022
- “Self-Consistency Improves Chain-Of-Thought Reasoning in Language Models”, et al 2022
- “Learning-By-Narrating: Narrative Pre-Training for Zero-Shot Dialogue Comprehension”, et al 2022
- “PromptChainer: Chaining Large Language Model Prompts through Visual Programming”, et al 2022
- “Chain-Of-Thought Prompting Elicits Reasoning in Large Language Models”, et al 2022
- “Reasoning Like Program Executors”, et al 2022
- “A Neural Network Solves and Generates Mathematics Problems by Program Synthesis: Calculus, Differential Equations, Linear Algebra, and More”, et al 2021
- “DREAM: Uncovering Mental Models behind Language Models”, et al 2021
- “Reframing Human-AI Collaboration for Generating Free-Text Explanations”, et al 2021
- “NeuroLogic A✱esque Decoding: Constrained Text Generation With Lookahead Heuristics”, et al 2021
- “WebGPT: Improving the Factual Accuracy of Language Models through Web Browsing”, et al 2021
- “Few-Shot Self-Rationalization With Natural Language Prompts”, et al 2021
- “Training Verifiers to Solve Math Word Problems”, et al 2021
- “Unsupervised Neural Machine Translation With Generative Language Models Only”, et al 2021
- “Show Your Work: Scratchpads for Intermediate Computation With Language Models”, et al 2021
- “AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts”, et al 2021
- “Teaching Autoregressive Language Models Complex Tasks By Demonstration”, 2021
- “Program Synthesis With Large Language Models”, et al 2021
- “Decision Transformer: Reinforcement Learning via Sequence Modeling”, et al 2021
- “Explainable Multi-Hop Verbal Reasoning Through Internal Monologue”, et al 2021
- “A Simple Method to Keep GPT-3 Focused in a Conversation”, 2021
- “Measuring Mathematical Problem Solving With the MATH Dataset”, et al 2021
- “Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm”, Reynolds & McDonell 2021
- “How We Accidentally Gave Our Bots Their Personalities”, 2021
- “Word in Context: Agent and Agent Clarification (69% Dev)”, 2020
- “I Found That Getting GPT-3 to Add Its Own “Internal Monologue” in Parentheses to Be a Helpful Strategy…”, blixt 2020
- kleptid @ “2020-07-17”
- “Inducing Self-Explanation: a Meta-Analysis”, et al 2018
- “Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems”, et al 2017
- “Why Do Humans Reason? Arguments for an Argumentative Theory”, 2011
- “How to Dramatically Improve the Reasoning Ability of GPT-3”
- “A Preliminary Exploration into Factored Cognition With Language Models”
- “WiC_SelfContextStuffingImproved_Last10_stuft_examplesNV.ipynb”
- “Vincent-163/transformer-Arithmetic”
- “Magic ToDo List Creator”
- “Short Story on AI: ‘Forward Pass’”, 2024
- “AI Dungeon Players Can Now Translate Their Stories into Emojis by Just Clicking a Button.”
- “Solving Math Word Problems: We’ve Trained a System That Solves Grade School Math Problems With Nearly Twice the Accuracy of a Fine-Tuned GPT-3 Model. It Solves about 90% As Many Problems As Real Kids: a Small Sample of 9-12 Year Olds Scored 60% on a Test from Our Dataset, While Our System Scored 55% on Those Same Problems. This Is Important Because Today’s AI Is Still Quite Weak at Commonsense Multistep Reasoning, Which Is Easy Even for Grade School Kids. We Achieved These Results by Training Our Model to Recognize Its Mistakes, so That It Can Try Repeatedly Until It Finds a Solution That Works”
- “Prompting Diverse Ideas: Increasing AI Idea Variance”
- “Teaching a Neural Network to Use a Calculator”
- “Connecting the Dots: LLMs Can Infer & Verbalize Latent Structure from Training Data”
- “Steganography in Chain-Of-Thought Reasoning”
- “Visible Thoughts Project and Bounty Announcement”
- bucketofkets
- Wikipedia
- Miscellaneous
- Bibliography