- See Also
- Links
- “Language Is Not All You Need: Aligning Perception With Language Models (Kosmos-1)”, Huang et al 2023
- “Multimodal Chain-of-Thought Reasoning in Language Models”, Zhang et al 2023
- “Large Language Models As Fiduciaries: A Case Study Toward Robustly Communicating With Artificial Intelligence Through Legal Standards”, Nay 2023
- “ChatGPT Goes to Law School”, Choi et al 2023
- “Interactive-Chain-Prompting (INTERCPT): Ambiguity Resolution for Crosslingual Conditional Generation With Interaction”, Pilault et al 2023
- “PAL: Program-aided Language Models”, Gao et al 2022
- “Measuring Progress on Scalable Oversight for Large Language Models”, Bowman et al 2022
- “Large Language Models Can Self-Improve”, Huang et al 2022
- “U-PaLM: Transcending Scaling Laws With 0.1% Extra Compute”, Tay et al 2022
- “Challenging BIG-Bench Tasks (BBH) and Whether Chain-of-Thought Can Solve Them”, Suzgun et al 2022
- “ReAct: Synergizing Reasoning and Acting in Language Models”, Yao et al 2022
- “Language Models Are Multilingual Chain-of-Thought Reasoners”, Shi et al 2022
- “Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning”, Lu et al 2022
- “FOLIO: Natural Language Reasoning With First-Order Logic”, Han et al 2022
- “Faithful Reasoning Using Large Language Models”, Creswell & Shanahan 2022
- “Language Models Can Teach Themselves to Program Better”, Haluptzok et al 2022
- “CodeT: Code Generation With Generated Tests”, Chen et al 2022
- “Language Model Cascades”, Dohan et al 2022
- “Can Large Language Models Reason about Medical Questions?”, Liévin et al 2022
- “Inner Monologue: Embodied Reasoning through Planning With Language Models”, Huang et al 2022
- “Language Models (Mostly) Know What They Know”, Kadavath et al 2022
- “Exploring Length Generalization in Large Language Models”, Anil et al 2022
- “Solving Quantitative Reasoning Problems With Language Models”, Lewkowycz et al 2022
- “Large Language Models Are Zero-Shot Reasoners”, Kojima et al 2022
- “Maieutic Prompting: Logically Consistent Reasoning With Recursive Explanations”, Jung et al 2022
- “Instruction Induction: From Few Examples to Natural Language Task Descriptions”, Honovich et al 2022
- “Least-to-Most Prompting Enables Complex Reasoning in Large Language Models”, Zhou et al 2022
- “Dialog Inpainting: Turning Documents into Dialogues”, Dai et al 2022
- “Unifying Language Learning Paradigms”, Tay et al 2022
- “Can Language Models Learn from Explanations in Context?”, Lampinen et al 2022
- “Socratic Models: Composing Zero-Shot Multimodal Reasoning With Language”, Zeng et al 2022
- “STaR: Bootstrapping Reasoning With Reasoning”, Zelikman et al 2022
- “A Conversational Paradigm for Program Synthesis”, Nijkamp et al 2022
- “Self-Consistency Improves Chain of Thought Reasoning in Language Models”, Wang et al 2022
- “Learning-by-Narrating: Narrative Pre-Training for Zero-Shot Dialogue Comprehension”, Et Al 2022
- “PromptChainer: Chaining Large Language Model Prompts through Visual Programming”, Wu et al 2022
- “It Looks Like You’re Trying To Take Over The World”, Branwen 2022
- “Chain of Thought Prompting Elicits Reasoning in Large Language Models”, Wei et al 2022
- “Reasoning Like Program Executors”, Pi et al 2022
- “A Neural Network Solves and Generates Mathematics Problems by Program Synthesis: Calculus, Differential Equations, Linear Algebra, and More”, Drori et al 2021
- “WebGPT: Improving the Factual Accuracy of Language Models through Web Browsing”, Hilton et al 2021
- “Reframing Human-AI Collaboration for Generating Free-Text Explanations”, Wiegreffe et al 2021
- “DREAM: Uncovering Mental Models behind Language Models”, Gu et al 2021
- “Few-Shot Self-Rationalization With Natural Language Prompts”, Marasović et al 2021
- “Training Verifiers to Solve Math Word Problems”, Cobbe et al 2021
- “Unsupervised Neural Machine Translation With Generative Language Models Only”, Han et al 2021
- “Show Your Work: Scratchpads for Intermediate Computation With Language Models”, Nye et al 2021
- “AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts”, Wu et al 2021
- “Teaching Autoregressive Language Models Complex Tasks By Demonstration”, Recchia 2021
- “Program Synthesis With Large Language Models”, Austin et al 2021
- “Decision Transformer: Reinforcement Learning via Sequence Modeling”, Chen et al 2021
- “Explainable Multi-hop Verbal Reasoning Through Internal Monologue”, Liang et al 2021
- “A Simple Method to Keep GPT-3 Focused in a Conversation”, 2021
- “Measuring Mathematical Problem Solving With the MATH Dataset”, Hendrycks et al 2021
- “Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm”, Reynolds & McDonell 2021
- “How We Accidentally Gave Our Bots Their Personalities”, 2021
- “Word in Context: Agent and Agent Clarification (69% Dev)”, Brockman 2020
- “I Found That Getting GPT-3 to Add Its Own ‘Internal Monologue’ in Parentheses to Be a Helpful Strategy…”, Blixt 2020
- “Teaching GPT-3 to Do a Brute Force ‘For Loop’ Checking Answers Also Seems to Work”, KaryoKleptid 2020
- “Seems to Work”, KaryoKleptid 2020
- “Inducing Self-Explanation: a Meta-Analysis”, Bisra et al 2018
- “Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems”, Ling et al 2017
- “Why Do Humans Reason? Arguments for an Argumentative Theory”, Mercier & Sperber 2011
- “How to Dramatically Improve the Reasoning Ability of GPT-3”
- “AI Dungeon Players Can Now Translate Their Stories into Emojis by Just Clicking a Button.”
- “Solving Math Word Problems: We’ve Trained a System That Solves Grade School Math Problems With Nearly Twice the Accuracy of a Fine-tuned GPT-3 Model. It Solves about 90% As Many Problems As Real Kids: a Small Sample of 9-12 Year Olds Scored 60% on a Test from Our Dataset, While Our System Scored 55% on Those Same Problems. This Is Important Because Today’s AI Is Still Quite Weak at Commonsense Multistep Reasoning, Which Is Easy Even for Grade School Kids. We Achieved These Results by Training Our Model to Recognize Its Mistakes, so That It Can Try Repeatedly Until It Finds a Solution That Works”
- Wikipedia
- Miscellaneous
- Link Bibliography
Inner Monologue (by analogy to human inner-monologue) is a family of prompt engineering tricks for large language models which make them solve problems in a ‘step by step’ verbalized way; it is particularly effective on multi-step tasks with ‘one right answer’ such as math word & programming problems.
It can be induced by few-shot examples of several solved problems, by finetuning on such a corpus (eg. InstructGPT), or by a carefully-chosen prompt inducing a ‘dialogue’ (the original discovery) or giving instructions (eg. “let’s think step by step”). It can be combined with better sampling strategies like best-of ranking, majority voting, or a critic; with self-distillation on its own monologue outputs (possibly repeatedly); with additional data like unit tests or retrieval results; & with access to oracles like REPLs or humans.
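As a concrete illustration, here is a minimal sketch (in Python) of the simplest combination: a zero-shot “let’s think step by step” prompt plus majority voting over sampled monologues (self-consistency). The `sample()` function is a hypothetical stand-in for whatever LLM completion API is available, sampled at temperature > 0; everything else is just bookkeeping:

```python
# A minimal sketch, not any particular library's API.
import re
from collections import Counter

def sample(prompt: str) -> str:
    """Hypothetical: return one sampled completion for `prompt`."""
    raise NotImplementedError  # wire up to an LLM API here

def solve(question: str, n_samples: int = 20) -> str:
    # Zero-shot inner-monologue: the suffix elicits verbalized reasoning.
    prompt = f"Q: {question}\nA: Let's think step by step."
    answers = []
    for _ in range(n_samples):
        monologue = sample(prompt)
        numbers = re.findall(r"-?\d+(?:\.\d+)?", monologue)
        if numbers:
            answers.append(numbers[-1])  # keep only the final answer
    # Self-consistency: majority vote over final answers, not monologues.
    return Counter(answers).most_common(1)[0][0]
```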
It was discovered in July 2020 by early OA API & AI Dungeon 2 users, who found that GPT-3/‘Dragon’ would fail to solve most simple arithmetic problems like multiplication (as found by the GPT-3 paper), but could be coaxed into solving them by setting up a fictional dialogue in which the player walks a ‘character’ through solving the problem step by step. It has been rediscovered repeatedly since (eg. as “scratchpads” or “chain-of-thought”).
Inner-monologue is interesting because it: is a simple prompting technique which dramatically improves benchmark performance (“sampling can show the presence of knowledge but not the absence”); was not predicted but discovered empirically after model release; appears to emerge only in large language models (>80b dense parameters); can have increasing returns to scale; can rescue performance even when naive prompting has flat scaling (“hidden scaling”); adds an RNN-esque flavor to feedforward language models; and involves planning (cf. Socratic Models/SayCan). It has also not been integrated into model training in any extensive way, and the limits of self-training & exploration are unknown.
A toy model of how inner-monologue works is that such problems are sequential: when calculating out an arithmetic problem, an error in any step makes all following steps wrong. Such a process is a multiplicative pipeline, where failure rates compound: ie. a per-step success rate P over n steps multiplies out to an overall correctness rate of P^n, which shrinks rapidly in either variable. So inner-monologue makes the meta-learning task easier by being more specific and reducing the problem to easier sub-tasks, potentially increasing the success rate far more than alternatives like scaling the model a few times (eg. a 5-step problem with P = 90% vs P = 99% yields ~59% vs ~95% overall; matching that improvement via pure scaling of naive prompts might require >10× scaling). Small models then aren’t smart enough to ‘get it’ from the instructions, and their baseline error rate is too high to execute steps reliably enough to see much gain.
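The toy model’s arithmetic can be checked directly; this snippet simply evaluates P^n for the success rates and step counts discussed above:

```python
# Per-step success rate P compounds multiplicatively over n steps to P**n.
for p in (0.90, 0.99):
    for n in (1, 5, 10):
        print(f"P={p:.2f} n={n:2d} -> overall success {p**n:.0%}")
# P=0.90, n=5 -> ~59%; P=0.99, n=5 -> ~95%: the comparison in the text.
# At n=10 the gap widens further: ~35% vs ~90%.
```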
I speculate the reason inner-monologue is not the model default, when it predicts the answer so much more accurately, may be the lack of an implicit memory mechanism which would let a model adaptively execute computations before predicting the next token. Because models like GPT-3 or PaLM have no recurrent state, they must fake it by reusing their predicted output as a working memory. However, such a ‘show-your-work’ writing style is highly unusual in the natural-language distribution they are trained to imitate, so they will not adopt it by default without a prompt steering them towards it; instead they try to emit the answer immediately, which is impossible given their feedforward limitation, and so they guess incorrectly.
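To make the ‘output as working memory’ point concrete, here is a minimal sketch of the autoregressive decoding loop (with a hypothetical `next_token()` single-step predictor): no hidden state survives between steps, so whatever reasoning the model has written into the transcript is the only memory available to later steps; intermediate results it does not verbalize must be recomputed from scratch or are lost.

```python
# Sketch of autoregressive decoding; `next_token()` is hypothetical.
def next_token(context: str) -> str:
    """Hypothetical: one feedforward pass predicting the next token."""
    raise NotImplementedError

def generate(prompt: str, max_tokens: int = 256) -> str:
    context = prompt
    for _ in range(max_tokens):
        tok = next_token(context)  # no hidden state survives this call
        if tok == "<eos>":
            break
        context += tok  # the growing transcript is the only working memory
    return context[len(prompt):]
```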
See Also
Links
“Language Is Not All You Need: Aligning Perception With Language Models (Kosmos-1)”, Huang et al 2023
“Language Is Not All You Need: Aligning Perception with Language Models (Kosmos-1)”, 2023-02-27 (similar)
“Multimodal Chain-of-Thought Reasoning in Language Models”, Zhang et al 2023
“Multimodal Chain-of-Thought Reasoning in Language Models”, 2023-02-02 (similar)
“Large Language Models As Fiduciaries: A Case Study Toward Robustly Communicating With Artificial Intelligence Through Legal Standards”, Nay 2023
“Large Language Models as Fiduciaries: A Case Study Toward Robustly Communicating With Artificial Intelligence Through Legal Standards”, 2023-01-25 (similar; bibliography)
“ChatGPT Goes to Law School”, Choi et al 2023
“ChatGPT Goes to Law School”, 2023-01-25 (similar; bibliography)
“Interactive-Chain-Prompting (INTERCPT): Ambiguity Resolution for Crosslingual Conditional Generation With Interaction”, Pilault et al 2023
“Interactive-Chain-Prompting (INTERCPT): Ambiguity Resolution for Crosslingual Conditional Generation with Interaction”, 2023-01-24 (similar)
“PAL: Program-aided Language Models”, Gao et al 2022
“PAL: Program-aided Language Models”, 2022-11-18 (similar)
“Measuring Progress on Scalable Oversight for Large Language Models”, Bowman et al 2022
“Measuring Progress on Scalable Oversight for Large Language Models”, 2022-11-04 (similar)
“Large Language Models Can Self-Improve”, Huang et al 2022
“Large Language Models Can Self-Improve”, 2022-10-20 (similar; bibliography)
“U-PaLM: Transcending Scaling Laws With 0.1% Extra Compute”, Tay et al 2022
“U-PaLM: Transcending Scaling Laws with 0.1% Extra Compute”, 2022-10-20 (similar; bibliography)
“Challenging BIG-Bench Tasks (BBH) and Whether Chain-of-Thought Can Solve Them”, Suzgun et al 2022
“Challenging BIG-Bench Tasks (BBH) and Whether Chain-of-Thought Can Solve Them”, 2022-10-17 (similar; bibliography)
“ReAct: Synergizing Reasoning and Acting in Language Models”, Yao et al 2022
“ReAct: Synergizing Reasoning and Acting in Language Models”, 2022-10-06 (similar)
“Language Models Are Multilingual Chain-of-Thought Reasoners”, Shi et al 2022
“Language Models are Multilingual Chain-of-Thought Reasoners”, 2022-10-06 (similar; bibliography)
“Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning”, Lu et al 2022
“Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning”, 2022-09-29 (similar)
“FOLIO: Natural Language Reasoning With First-Order Logic”, Han et al 2022
“FOLIO: Natural Language Reasoning with First-Order Logic”, 2022-09-02 (similar; bibliography)
“Faithful Reasoning Using Large Language Models”, Creswell & Shanahan 2022
“Faithful Reasoning Using Large Language Models”, 2022-08-30 (similar)
“Language Models Can Teach Themselves to Program Better”, Haluptzok et al 2022
“Language Models Can Teach Themselves to Program Better”, 2022-07-29 (similar)
“CodeT: Code Generation With Generated Tests”, Chen et al 2022
“CodeT: Code Generation with Generated Tests”, 2022-07-21 (similar)
“Language Model Cascades”, Dohan et al 2022
“Language Model Cascades”, 2022-07-21 (similar)
“Can Large Language Models Reason about Medical Questions?”, Liévin et al 2022
“Can large language models reason about medical questions?”, 2022-07-17 (similar; bibliography)
“Inner Monologue: Embodied Reasoning through Planning With Language Models”, Huang et al 2022
“Inner Monologue: Embodied Reasoning through Planning with Language Models”, 2022-07-12 (similar; bibliography)
“Language Models (Mostly) Know What They Know”, Kadavath et al 2022
“Language Models (Mostly) Know What They Know”, 2022-07-11 (similar; bibliography)
“Exploring Length Generalization in Large Language Models”, Anil et al 2022
“Exploring Length Generalization in Large Language Models”, 2022-07-11 (similar)
“Solving Quantitative Reasoning Problems With Language Models”, Lewkowycz et al 2022
“Solving Quantitative Reasoning Problems with Language Models”, 2022-06-29 (similar)
“Large Language Models Are Zero-Shot Reasoners”, Kojima et al 2022
“Large Language Models are Zero-Shot Reasoners”, 2022-05-24 (backlinks; similar)
“Maieutic Prompting: Logically Consistent Reasoning With Recursive Explanations”, Jung et al 2022
“Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations”, 2022-05-24 (similar)
“Instruction Induction: From Few Examples to Natural Language Task Descriptions”, Honovich et al 2022
“Instruction Induction: From Few Examples to Natural Language Task Descriptions”, 2022-05-22 (similar)
“Least-to-Most Prompting Enables Complex Reasoning in Large Language Models”, Zhou et al 2022
“Least-to-Most Prompting Enables Complex Reasoning in Large Language Models”, 2022-05-21 (similar; bibliography)
“Dialog Inpainting: Turning Documents into Dialogues”, Dai et al 2022
“Dialog Inpainting: Turning Documents into Dialogues”, 2022-05-18 (similar; bibliography)
“Unifying Language Learning Paradigms”, Tay et al 2022
“Unifying Language Learning Paradigms”, 2022-05-10 (similar; bibliography)
“Can Language Models Learn from Explanations in Context?”, Lampinen et al 2022
“Can language models learn from explanations in context?”, 2022-04-05 (similar)
“Socratic Models: Composing Zero-Shot Multimodal Reasoning With Language”, Zeng et al 2022
“Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language”, 2022-04-01 (similar; bibliography)
“STaR: Bootstrapping Reasoning With Reasoning”, Zelikman et al 2022
“STaR: Bootstrapping Reasoning With Reasoning”, 2022-03-28 (backlinks; similar)
“A Conversational Paradigm for Program Synthesis”, Nijkamp et al 2022
“A Conversational Paradigm for Program Synthesis”, 2022-03-25 (similar)
“Self-Consistency Improves Chain of Thought Reasoning in Language Models”, Wang et al 2022
“Self-Consistency Improves Chain of Thought Reasoning in Language Models”, 2022-03-21 (similar; bibliography)
“Learning-by-Narrating: Narrative Pre-Training for Zero-Shot Dialogue Comprehension”, Et Al 2022
“Learning-by-Narrating: Narrative Pre-Training for Zero-Shot Dialogue Comprehension”, 2022-03-19 (similar)
“PromptChainer: Chaining Large Language Model Prompts through Visual Programming”, Wu et al 2022
“PromptChainer: Chaining Large Language Model Prompts through Visual Programming”, 2022-03-13 (similar)
“It Looks Like You’re Trying To Take Over The World”, Branwen 2022
“It Looks Like You’re Trying To Take Over The World”, 2022-03-06 (backlinks; similar; bibliography)
“Chain of Thought Prompting Elicits Reasoning in Large Language Models”, Wei et al 2022
“Chain of Thought Prompting Elicits Reasoning in Large Language Models”, 2022-01-28 (similar; bibliography)
“Reasoning Like Program Executors”, Pi et al 2022
“Reasoning Like Program Executors”, 2022-01-27 (similar; bibliography)
“A Neural Network Solves and Generates Mathematics Problems by Program Synthesis: Calculus, Differential Equations, Linear Algebra, and More”, Drori et al 2021
“A Neural Network Solves and Generates Mathematics Problems by Program Synthesis: Calculus, Differential Equations, Linear Algebra, and More”, 2021-12-31 (similar)
“WebGPT: Improving the Factual Accuracy of Language Models through Web Browsing”, Hilton et al 2021
“WebGPT: Improving the factual accuracy of language models through web browsing”, 2021-12-16 (similar; bibliography)
“Reframing Human-AI Collaboration for Generating Free-Text Explanations”, Wiegreffe et al 2021
“Reframing Human-AI Collaboration for Generating Free-Text Explanations”, 2021-12-16 (similar)
“DREAM: Uncovering Mental Models behind Language Models”, Gu et al 2021
“DREAM: Uncovering Mental Models behind Language Models”, 2021-12-16 (similar)
“Few-Shot Self-Rationalization With Natural Language Prompts”, Marasović et al 2021
“Few-Shot Self-Rationalization with Natural Language Prompts”, 2021-11-16 (similar)
“Training Verifiers to Solve Math Word Problems”, Cobbe et al 2021
“Training Verifiers to Solve Math Word Problems”, 2021-10-27 (similar)
“Unsupervised Neural Machine Translation With Generative Language Models Only”, Han et al 2021
“Unsupervised Neural Machine Translation with Generative Language Models Only”, 2021-10-11 (similar)
“Show Your Work: Scratchpads for Intermediate Computation With Language Models”, Nye et al 2021
“Show Your Work: Scratchpads for Intermediate Computation with Language Models”, 2021-10-05 (similar)
“AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts”, Wu et al 2021
“AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts”, 2021-10-04 (similar)
“Teaching Autoregressive Language Models Complex Tasks By Demonstration”, Recchia 2021
“Teaching Autoregressive Language Models Complex Tasks By Demonstration”, 2021-09-05 (similar)
“Program Synthesis With Large Language Models”, Austin et al 2021
“Program Synthesis with Large Language Models”, 2021-08-16 (similar)
“Decision Transformer: Reinforcement Learning via Sequence Modeling”, Chen et al 2021
“Decision Transformer: Reinforcement Learning via Sequence Modeling”, 2021-06-02 (backlinks; similar; bibliography)
“Explainable Multi-hop Verbal Reasoning Through Internal Monologue”, Liang et al 2021
“Explainable Multi-hop Verbal Reasoning Through Internal Monologue”, 2021-06 (similar)
“A Simple Method to Keep GPT-3 Focused in a Conversation”, 2021
“A simple method to keep GPT-3 focused in a conversation”, 2021-05-18 (similar)
“Measuring Mathematical Problem Solving With the MATH Dataset”, Hendrycks et al 2021
“Measuring Mathematical Problem Solving With the MATH Dataset”, 2021-03-05 (backlinks; similar)
“Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm”, Reynolds & McDonell 2021
“Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm”, 2021-02-15 (backlinks; similar)
“How We Accidentally Gave Our Bots Their Personalities”, 2021
“How We Accidentally Gave our Bots Their Personalities”, 2021-02-09 (backlinks; similar)
“Word in Context: Agent and Agent Clarification (69% Dev)”, Brockman 2020
“Word in Context: Agent and Agent Clarification (69% Dev)”, 2020-07-30 (similar; bibliography)
“I Found That Getting GPT-3 to Add Its Own ‘Internal Monologue’ in Parentheses to Be a Helpful Strategy…”, Blixt 2020
“I found that getting GPT-3 to add its own ‘internal monologue’ in parentheses to be a helpful strategy…”, 2020-07-29 (backlinks; similar; bibliography)
“Teaching GPT-3 to Do a Brute Force ‘For Loop’ Checking Answers Also Seems to Work”, KaryoKleptid 2020
“Teaching GPT-3 to do a brute force ‘for loop’ checking answers also seems to work”, 2020-07-17 (backlinks; similar; bibliography)
“Seems to Work”, KaryoKleptid 2020
“Seems to work”, 2020-07-17 (backlinks; similar; bibliography)
“Inducing Self-Explanation: a Meta-Analysis”, Bisra et al 2018
“Inducing Self-Explanation: a Meta-Analysis”, 2018-03-29 (similar; bibliography)
“Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems”, Ling et al 2017
“Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems”, 2017-05-11 (similar)
“Why Do Humans Reason? Arguments for an Argumentative Theory”, Mercier & Sperber 2011
“How to Dramatically Improve the Reasoning Ability of GPT-3”
“Solving Math Word Problems: We’ve Trained a System That Solves Grade School Math Problems With Nearly Twice the Accuracy of a Fine-tuned GPT-3 Model. It Solves about 90% As Many Problems As Real Kids: a Small Sample of 9-12 Year Olds Scored 60% on a Test from Our Dataset, While Our System Scored 55% on Those Same Problems. This Is Important Because Today’s AI Is Still Quite Weak at Commonsense Multistep Reasoning, Which Is Easy Even for Grade School Kids. We Achieved These Results by Training Our Model to Recognize Its Mistakes, so That It Can Try Repeatedly Until It Finds a Solution That Works”
“Solving Math Word Problems: We’ve trained a system that solves grade school math problems with nearly twice the accuracy of a fine-tuned GPT-3 model. It solves about 90% as many problems as real kids: a small sample of 9-12 year olds scored 60% on a test from our dataset, while our system scored 55% on those same problems. This is important because today’s AI is still quite weak at commonsense multistep reasoning, which is easy even for grade school kids. We achieved these results by training our model to recognize its mistakes, so that it can try repeatedly until it finds a solution that works” (backlinks)
Wikipedia
Miscellaneous
- https://generative.ink/posts/methods-of-prompt-programming/#serializing-reasoning
- https://github.com/openai/openai-cookbook/blob/main/techniques_to_improve_reliability.md
- https://nitter.moomoo.me/MParakhin/status/1632087709060825088
- https://nitter.moomoo.me/denny_zhou/status/1547662872511070212
- https://nitter.moomoo.me/denny_zhou/status/1587115933293678592
- https://nitter.moomoo.me/goodside/status/1563191853587271681
- https://nitter.moomoo.me/goodside/status/1568416130133368835
- https://nitter.moomoo.me/goodside/status/1581868987952300032
- https://nitter.moomoo.me/goodside/status/1612017392518840320
- https://nitter.moomoo.me/jmilldotdev/status/1592288240861839360
- https://nitter.moomoo.me/peterwildeford/status/1522633978305560576
- https://old.reddit.com/r/ChatGPT/comments/10zavbv/extending_chatgpt_with_some_additional_internal/
- https://old.reddit.com/r/ChatGPT/comments/11anct1/its_easy_to_give_chatgpt_a_bonafide_consciousness/
- https://old.reddit.com/r/GPT3/comments/uzrexd/thinking_is_all_you_need/
- https://towardsdatascience.com/1-1-3-wait-no-1-1-2-how-to-have-gpt-sanity-check-itself-136e846987bf
Link Bibliography
- https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4335945: “Large Language Models As Fiduciaries: A Case Study Toward Robustly Communicating With Artificial Intelligence Through Legal Standards”, John Nay
- https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4335905: “ChatGPT Goes to Law School”, Jonathan H. Choi, Kristin E. Hickman, Amy Monahan, Daniel Schwarcz
- https://arxiv.org/abs/2210.11610#google: “Large Language Models Can Self-Improve”, Jiaxin Huang, Shixiang Shane Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, Jiawei Han
- https://arxiv.org/abs/2210.11399#google: “U-PaLM: Transcending Scaling Laws With 0.1% Extra Compute”
- https://arxiv.org/abs/2210.09261#google: “Challenging BIG-Bench Tasks (BBH) and Whether Chain-of-Thought Can Solve Them”
- https://arxiv.org/abs/2210.03057#google: “Language Models Are Multilingual Chain-of-Thought Reasoners”
- https://arxiv.org/abs/2209.00840: “FOLIO: Natural Language Reasoning With First-Order Logic”
- https://arxiv.org/abs/2207.08143: “Can Large Language Models Reason about Medical Questions?”, Valentin Liévin, Christoffer Egeberg Hother, Ole Winther
- https://arxiv.org/abs/2207.05608#google: “Inner Monologue: Embodied Reasoning through Planning With Language Models”
- https://arxiv.org/abs/2207.05221#anthropic: “Language Models (Mostly) Know What They Know”
- https://arxiv.org/abs/2205.10625#google: “Least-to-Most Prompting Enables Complex Reasoning in Large Language Models”
- https://arxiv.org/abs/2205.09073#google: “Dialog Inpainting: Turning Documents into Dialogues”, Zhuyun Dai, Arun Tejasvi Chaganty, Vincent Zhao, Aida Amini, Qazi Mamunur Rashid, Mike Green, Kelvin Guu
- https://arxiv.org/abs/2205.05131#google: “Unifying Language Learning Paradigms”
- https://arxiv.org/abs/2204.00598#google: “Socratic Models: Composing Zero-Shot Multimodal Reasoning With Language”
- https://arxiv.org/abs/2203.11171#google: “Self-Consistency Improves Chain of Thought Reasoning in Language Models”, Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Denny Zhou
- clippy: “It Looks Like You’re Trying To Take Over The World”, Gwern Branwen
- https://arxiv.org/abs/2201.11903#google: “Chain of Thought Prompting Elicits Reasoning in Large Language Models”, Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Ed Chi, Quoc Le, Denny Zhou
- https://arxiv.org/abs/2201.11473#microsoft: “Reasoning Like Program Executors”, Xinyu Pi, Qian Liu, Bei Chen, Morteza Ziyadi, Zeqi Lin, Yan Gao, Qiang Fu, Jian-Guang Lou, Weizhu Chen
- https://openai.com/blog/webgpt/: “WebGPT: Improving the Factual Accuracy of Language Models through Web Browsing”, Jacob Hilton, Suchir Balaji, Reiichiro Nakano, John Schulman
- https://sites.google.com/berkeley.edu/decision-transformer: “Decision Transformer: Reinforcement Learning via Sequence Modeling”
- http://gptprompts.wikidot.com/linguistics:word-in-context#toc3: “Word in Context: Agent and Agent Clarification (69% Dev)”, Matt Brockman
- https://news.ycombinator.com/item?id=23990902: “I Found That Getting GPT-3 to Add Its Own ‘Internal Monologue’ in Parentheses to Be a Helpful Strategy…”, blixt
- https://nitter.moomoo.me/kleptid/status/1284098635689611264: “Teaching GPT-3 to Do a Brute Force ‘For Loop’ Checking Answers Also Seems to Work”, KaryoKleptid
- https://nitter.moomoo.me/kleptid/status/1284069270603866113: “Seems to Work”, KaryoKleptid
- 2018-bisra.pdf: “Inducing Self-Explanation: a Meta-Analysis”, Kiran Bisra, Qing Liu, John C. Nesbit, Farimah Salimi, Philip H. Winne