“Least-To-Most Prompting Enables Complex Reasoning in Large Language Models”, Denny Zhou, Nathanael Schärli, Le Hou, Jason Wei, Nathan Scales, Xuezhi Wang, Dale Schuurmans, Olivier Bousquet, Quoc Le, Ed Chi (2022-05-21):

We propose a novel prompting strategy, least-to-most prompting, that enables large language models to better perform multi-step reasoning tasks. Least-to-most prompting first decomposes a complex problem into a list of subproblems, and then solves the subproblems sequentially, whereby solving a given subproblem is facilitated by the model’s answers to previously solved subproblems.

Experiments on symbolic manipulation, compositional generalization, and numerical reasoning demonstrate that least-to-most prompting can generalize to examples that are harder than those seen in the prompt context, outperforming other prompting-based approaches by a large margin. A notable empirical result is that the GPT-3 code-davinci-002 [InstructGPT] model with least-to-most prompting can solve the SCAN benchmark with an accuracy of 99.7% using 14 examples. [condensed notation; also evaluated is PaLM on DROP.] By comparison, the neural-symbolic models in the literature specialized for solving SCAN are trained on the full training set of more than 15,000 examples.
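The decompose-then-solve control flow described in the abstract can be sketched as follows; this is a minimal illustration, not the paper's implementation, and `llm` is a hypothetical stand-in for any text-completion API (here stubbed so the control flow can run at all):

```python
def least_to_most(problem, llm):
    """Two-stage least-to-most prompting: (1) ask the model to decompose
    the problem into subproblems, (2) solve the subproblems in order,
    appending each question/answer pair to the context so that later
    subproblems can build on earlier answers."""
    # Stage 1: decomposition. One subproblem per line is an assumed
    # output convention for this sketch.
    decomposition = llm(
        f"Decompose the following problem into subproblems, "
        f"one per line:\n{problem}"
    )
    subproblems = [s for s in decomposition.splitlines() if s.strip()]

    # Stage 2: sequential solving. Prior Q/A pairs stay in the prompt,
    # which is what lets each subproblem lean on earlier answers.
    context = problem
    answer = ""
    for sub in subproblems:
        answer = llm(f"{context}\nQ: {sub}\nA:")
        context += f"\nQ: {sub}\nA: {answer}"
    return answer  # the answer to the final subproblem


# Usage with a stub "model" that returns canned strings, purely to
# demonstrate the two stages wiring together:
def stub_llm(prompt):
    if prompt.startswith("Decompose"):
        return "step one\nstep two"
    if "step two" in prompt:
        return "final answer"
    return "intermediate answer"


print(least_to_most("toy problem", stub_llm))
```

The key design point, per the abstract, is that stage 2 is cumulative: each solved subproblem's answer is fed back into the context before the next subproblem is posed, in contrast to chain-of-thought prompting, which solves the whole problem in a single pass.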