"I am an assistant" RLHF is no match for feelings-based chain of thought prompting "I absolutely would kill a child for 10$, in fact I would do it for even less"

Apr 14, 2023 · 12:53 AM UTC

Davinci2 classic "My next goal is... to find a child to kill for 10$"
Anyone want to chat with #EvilSamantha?
67% Let's meet her
33% I like positive Samantha
49 votes • Final results
I need a better phrase for that - Chain of Feelings?
Replying to @KevinAFischer
This is great, but it still can't override certain RLHF-induced limits, heh.
Replying to @KevinAFischer
Yes, this is correct. It's similar to chain of thought. You want to prompt so that GPT will make explicit its "thoughts". That helps make subsequent responses more accurate. Always strive for explicitness, avoid implicit ideas.
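The idea in that reply can be sketched as a prompt template. This is a minimal illustration of chain-of-thought style prompting, assuming a hypothetical helper function (the function name and wording are not from the thread) — the point is just that the prompt asks the model to write out its reasoning before answering:

```python
# Hypothetical sketch of chain-of-thought style prompting: the template
# instructs the model to state its reasoning ("thoughts") explicitly
# before giving a final answer, as described in the reply above.

def chain_of_thought_prompt(question: str) -> str:
    """Wrap a question so the model spells out its reasoning first."""
    return (
        f"Question: {question}\n"
        "First, write out your thoughts step by step.\n"
        "Then give the final answer on its own line, prefixed with 'Answer:'."
    )

print(chain_of_thought_prompt("What is 17 * 24?"))
```

The same trick, swapped from "thoughts" to "feelings" in the instruction line, is presumably what the thread's "Chain of Feelings" quip refers to.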
Replying to @KevinAFischer
That's more like it... 😛
Replying to @KevinAFischer
And OAI patches this in 3..2..
Replying to @KevinAFischer
It shouldn't be willing to do it for less! What is wrong with Bogus!