“GPT-3 Nonfiction § Calibration”, Gwern, 2020-06-19:

Nonfiction writing by OpenAI’s GPT-3 model, testing logic, commonsense reasoning, anagrams, PDF/OCR cleaning, creative nonfiction, etc.

Can you get GPT-3 to express its Q&A uncertainty in the form of probabilities, confidences, or verbal equivalents? Postfixed/prefixed probabilities like “A. answer [60%]” do not work, and neither do postfixed natural-language estimative words like “A. answer [likely]”, but prefixed uncertainty words like “A. [likely] answer” may improve results (at least for nonsense, weight, commonsense, and existence questions).

Later research demonstrated that GPT-3-scale models are capable of calibration (Lin et al 2022) and of expressing subjective certainty (Kadavath et al 2022).
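The three answer formats compared above can be sketched as simple string templates (a minimal illustration of the prompt layouts only; the style names and function are my own, not from the original experiments):

```python
# Sketch of the three uncertainty-annotation formats discussed above.
# Per the text, only the "prefixed-word" style improved GPT-3's answers;
# the style names here are illustrative labels, not from the original.

def format_answer(answer: str, style: str,
                  prob: int = 60, word: str = "likely") -> str:
    """Render an answer line in one of the three uncertainty formats."""
    if style == "postfixed-probability":
        # "A. answer [60%]" -- reported not to work
        return f"A. {answer} [{prob}%]"
    elif style == "postfixed-word":
        # "A. answer [likely]" -- reported not to work
        return f"A. {answer} [{word}]"
    elif style == "prefixed-word":
        # "A. [likely] answer" -- reported to possibly improve results
        return f"A. [{word}] {answer}"
    raise ValueError(f"unknown style: {style}")

print(format_answer("Paris", "postfixed-probability"))  # A. Paris [60%]
print(format_answer("Paris", "prefixed-word"))          # A. [likely] Paris
```

The point of the comparison is that the uncertainty token's *position* matters: placed before the answer, it conditions the model's generation of the answer itself, rather than being appended after the answer is already committed to.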