“How We Accidentally Gave Our Bots Their Personalities”, Latitude2021-02-09 (, ; backlinks)⁠:

Intermediary steps improve GPT-3 performance on understanding context: There are quite a few existing natural language tests for how well an algorithm can differentiate meaning between different contexts. OpenAI’s initial GPT-3 paper tested GPT-3 against several of these…In the paper, OpenAI’s researchers determined that GPT-3 could not do better than chance on these sorts of problems. However, one of our researchers [Matt Brockman] spent some time investigating it last summer [2020-07-308d2020-08-06] and found that providing an intermediary step that re-phrases the inputs helps to improve performance above chance. We suspect a reason this works is because it makes implicit information salient—GPT-3 has a ton of information buried in its neural network, but often a shallow auto-complete pass isn’t enough to make that information relevant to the computation of the next token. By making GPT-3 output this implicit information explicitly as part of its processing, the information is more likely to be used in the final computation of the task.

The way these intermediate steps work is surprisingly simple once you figure out what they need to look like. When you have a task that takes an input and produces an output, normally you’d do your few-shots (the examples you use to train the AI) like

input: example input A output: example output A…input: completion input output:

and GPT-3 will generate the completion output. To get GPT-3 to draw out the implicit information, we simply add an intermediary step as such:

input: example input A reason: what about A should lead to output output: example output A…input: completion input reason:

and GPT-3 will generate the reason before generating the final output. The reasons can be pretty much whatever you want them to be, and you can chain reasons into one another for more complex analysis although we’re not quite sure how many steps different sized models can do; bigger models (eg. Davinci) can match to doing more steps than the smaller ones (eg. Curie).