Lesser-known #ChatGPT tricks: ask it to assign truthiness floats to its responses to bias the model toward metacognition. See below for with & without.

Feb 24, 2023 · 11:44 AM UTC
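
A rough sketch of what the trick could look like in practice. The post does not give the exact wording, so the instruction text, the `[truthiness: 0.00]` tag format, and the helper names `with_truthiness` and `extract_scores` are all assumptions made up for illustration. It only builds the prompt and parses scores out of a reply; sending the prompt to a model is left out.

```python
import re

# Hypothetical instruction wording: the original post does not state the exact prompt.
TRUTHINESS_SUFFIX = (
    "After each statement in your answer, append a truthiness score "
    "as a float between 0.0 and 1.0 in the form [truthiness: 0.00]."
)

# Matches the assumed tag format, e.g. "[truthiness: 0.85]".
_SCORE = re.compile(r"\[truthiness:\s*(\d+(?:\.\d+)?)\]")


def with_truthiness(question: str) -> str:
    """Return the question with the truthiness instruction appended."""
    return f"{question}\n\n{TRUTHINESS_SUFFIX}"


def extract_scores(reply: str) -> list[float]:
    """Pull every truthiness float out of a reply, keeping only values in [0, 1]."""
    scores = (float(raw) for raw in _SCORE.findall(reply))
    return [s for s in scores if 0.0 <= s <= 1.0]
```

For the "with & without" comparison, send the bare question once and `with_truthiness(question)` once, then run `extract_scores` on the second reply.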

Strongly suspect the model can internally reason about the truthiness of its answers, and was just not rewarded for truth during RLHF, in the name of maintaining the noble lies society tells itself.