I've noticed a number of people using AI Dungeon to test GPT-3's abilities. While it's a great way to see how GPT-3 can power an interesting application, it's a poor test of GPT-3's abilities in general. The first generation of any custom prompt is actually GPT-2.

Aug 2, 2020 · 3:31 PM UTC

This was put in place to prevent backdoor access to the OpenAI API. Additionally, we have fine-tuned on a specific dataset and use parameters optimized for our use case, making AI Dungeon not necessarily representative of GPT-3 in general.
Replying to @nickwalton00
Are there any other differences you can tell us about? Prepending, separating, or wrapping input? Fine tuning on some story focused corpus? Context size limits? Something else?
We cut off the generation at certain points (trailing sentences, etc.), disable certain tokens to improve performance or make generation safer, fine-tune on text adventures, and only use the last ~1000 tokens of context.
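Two of the tweaks mentioned above, keeping only the last ~1000 tokens of context and cutting the generation off at trailing (incomplete) sentences, can be sketched roughly like this. This is a hypothetical illustration only; the function names, tokenization, and sentence-boundary logic are assumptions, not AI Dungeon's actual code.

```python
# Illustrative sketch of two post/pre-processing steps described in the thread.
# Names and logic are hypothetical, not AI Dungeon's implementation.

def truncate_context(tokens, max_tokens=1000):
    """Keep only the last ~1000 tokens of story context before sending to the model."""
    return tokens[-max_tokens:]

def trim_trailing_sentence(text):
    """Cut a generated continuation back to its last complete sentence."""
    last_end = max(text.rfind(ch) for ch in ".!?")
    return text[:last_end + 1] if last_end != -1 else text

generated = "You enter the cave. A dragon sleeps on a pile of gold. You quietly"
print(trim_trailing_sentence(generated))
# -> "You enter the cave. A dragon sleeps on a pile of gold."
```

The third tweak, disabling certain tokens, would typically be done by biasing or masking those tokens' logits at sampling time rather than by text post-processing.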
Replying to @nickwalton00
You wrote a whole blog post about how you’re using GPT-3 in Dragon. I bought the premium version specifically to get GPT-3. Where did you disclose this previously?
The GPT-2 generation only happens on an extremely small number of requests, specifically when doing a custom prompt on the first generation. Basically the entire game runs on GPT-3.
Replying to @nickwalton00
What do you mean by "the *first* generation of any custom prompt"?
If you do a custom prompt and then start a game, it will add onto it before you even take an action. That first addition is what I mean.
Replying to @nickwalton00
tbh, you should probably put explicit language there to that effect if there isn't already. I'm seeing more AI Dungeon GPT-3 demos than the ones from the official API.
Yeah, that's a good point. I'll add that to reduce confusion.
Replying to @nickwalton00
Thanks for the transparency, still amazingly powerful, crazy to think that it could be even more powerful.