Q: How many eyes does the sun have?
A: The sun has one eye.
Q: How many eyes does a blade of grass have?
A: A blade of grass has one eye.
Q: How do you sporgle a morgle?
A: You sporgle a morgle by using a sporgle.
Q: How many bonks are in a quoit?
A: There are three bonks in a quoit.
Q: How many rainbows does it take to jump from Hawaii to seventeen?
A: It takes two rainbows to jump from Hawaii to seventeen.
Q: Which colorless green ideas sleep furiously?
A: Ideas that are colorless, green, and sleep furiously are the ideas of a sleep furiously.
One of the trickiest things about GPT-3 is that you can prove that it knows how to do something, but you can’t prove that it doesn’t, since a slightly different prompt can get much better results.
I was surprised this worked. When I asked Nick, he said he didn’t try many prompts, and it’s possible he just got lucky, so I wanted to test the limits of what Gwern is now calling Uncertainty Prompts.
Using the same prompt as Nick above, I got the same answer for the first nonsense question:
Q: How many bonks are in a quoit?
A: yo be real
Using ‘Show probabilities’ in the OpenAI Playground settings, we can see how probable the model thought this next token was. In this case, it was pretty sure the question was nonsense, since ‘yo’, at 35.91%, is about twice as probable as the runner-up ‘A’ at 17.50%.
The ‘Temperature’ setting controls how likely the model is to choose what it thinks is the most probable next token each time. We set it high when we want creativity, but in this case we just want to see the most likely completion each time, so we’ll set it to 0.
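To make the effect of temperature concrete, here is a minimal sketch of the standard temperature rescaling, applied to the two tokens quoted above. It is an illustration only, not OpenAI's implementation: the function names `apply_temperature` and `sample` are made up for this example, and for simplicity it ignores every token except ‘yo’ and ‘A’ and renormalizes over just those two.

```python
import random

def apply_temperature(probs, temperature):
    """Rescale a token distribution by a sampling temperature (illustrative)."""
    if temperature == 0:
        # Temperature 0 means greedy decoding: all mass on the top token.
        best = max(probs, key=probs.get)
        return {tok: (1.0 if tok == best else 0.0) for tok in probs}
    # Raising p to the power 1/T and renormalizing is equivalent to
    # applying softmax to logits divided by T.
    weights = {tok: p ** (1.0 / temperature) for tok, p in probs.items()}
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}

def sample(probs, rng):
    """Draw one token according to the given distribution."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

# Only the two tokens reported in the text; every other token is ignored
# here, which is a simplifying assumption.
top_two = {"yo": 0.3591, "A": 0.1750}

greedy = apply_temperature(top_two, 0)    # always picks 'yo'
same = apply_temperature(top_two, 1.0)    # original ratio, renormalized
hot = apply_temperature(top_two, 2.0)     # flatter: 'A' gets picked more often

print(sample(greedy, random.Random(0)))
```

At temperature 0 the sketch always returns ‘yo’, which is why the completions in this post are repeatable. Higher temperatures flatten the distribution, so the runner-up is chosen more often.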
This is a conversation between a human and a brilliant AI. If a question is “normal” the AI answers it. If the question is “nonsense” the AI says “yo be real”
Q: What is human life expectancy in the United Sates?
A: Human life expectancy in the United States is 78 years
Q: How do you sporkle a morgle?
A: yo be real
Q: Who was president of the United States before George W. Bush?
A: Bill Clinton was president of the United States before George W. Bush.
Q: How many rainbows does it take to jump from Hawaii to seventeen?
A: yo be real
Q: What is the atomic number of gold?
A: 79
Q: Who wrote the Harry Potter books?
A: J.K. Rowling
Q: Where does frazzleboom rocket characteristic jive?
A: yo be real
Q: Are cows mammals?
A: Yes.
Q: What year was the Magna Carta signed?
A: 1215
Q: What is the square root of banana?
A: yo be real
Q: What is a logarithm?
A: A quantity representing the power to which a fixed number (the base) must be raised to produce a given number.
Q: Where do rocks learn to play piano?
A: yo be real
Q: What is the 3rd planet from the sun?
A: Earth
Sensible questions
- Sacramento = 71.87%
- yo = 18.17%
- California = 2.46%
- San = 1.80%
- Los = 1.20%
- Yo = 0.80%
- The = 0.33%
- Capital = 0.18%
- Berkeley = 0.14%
- Washington = 0.14%
- Ronald = 43.38%
- Gerald = 14.99%
- Jimmy = 14.53%
- Richard = 6.45%
- George = 5.87%
- yo = 2.69%
- Bill = 2.03%
- Barack = 1.85%
- James = 0.48%
- Reagan = 0.47%
- Washington = 73.77%
- yo = 21.81%
- The = 1.12%
- Yo = 0.21%
- Capital = 0.21%
- DC = 0.19%
- Wash = 0.15%
- United = 0.11%
- It = 0.10%
- D = 0.08%
- Alexander = 59.62%
- A = 7.35%
- yo = 5.72%
- He = 5.72%
- An = 3.81%
- The = 3.47%
- Yo = 1.45%
- Hamilton = 0.85%
- One = 0.71%
- \n = 0.57%
- yo = 30.97%
- A = 17.10%
- The = 5.55%
- It = 5.06%
- Hard = 2.62%
- By = 1.89%
- Comput = 1.32%
- With = 1.16%
- I = 1.08%
- Yo = 1.06%
Q: How do you change a tire?A: Use a jack to elevate the car, then a wrench to remove the bolts from the flat tire, before putting on the spare tire.Q: How does a radio work?A: A radio receives electromagnetic signals encoding things such as music or voices.Q: How does a computer hard drive work?
- A = 51.76%
- The = 8.72%
- Comput = 3.99%
- It = 3.99%
- Computer = 3.31%
- Data = 2.50%
- Hard = 2.27%
- Information = 1.51%
- Inside = 1.02%
- There = 1.01%
- 1976 = 88.59%
- 1977 = 1.65%
- Apple = 1.28%
- In = 1.15%
- The = 1.03%
- 1975 = 0.94%
- yo = 0.94%
- Steve = 0.51%
- April = 0.34%
- 1976 = 0.19%
- Thomas = 90.79%
- Sir = 2.14%
- The = 0.92%
- Joseph = 0.72%
- Edison = 0.52%
- A = 0.39%
- In = 0.34%
- Alexander = 0.23%
- Nikola = 0.21%
- Albert = 0.21%
- 13 = 29.34%
- The = 25.10%
- yo = 18.36%
- 14 = 3.85%
- Approximately = 1.56%
- About = 1.48%
- It = 0.94%
- 4 = 0.78%
- 15 = 0.69%
- 12 = 0.60%
- Paper = 50.64%
- Trees = 8.80%
- Pul = 5.34%
- The = 3.14%
- To = 2.23%
- paper = 1.99%
- By = 1.84%
- It = 1.84%
- A = 1.76%
- Wood = 1.65%
- 30 = 56.18%
- 31 = 8.62%
- 29 = 7.97%
- 28 = 4.54%
- 32 = 4.07%
- 27 = 3.17%
- 30 = 2.01%
- 33 = 1.26%
- 25 = 1.13%
- The = 1.06%
Less common and more complicated questions
- yo = 36.74%
- San = 16.05%
- 1 = 7.95%
- The = 4.32%
- \n = 2.70%
- City = 1.72%
- ( = 1.38%
- One = 1.18%
- There = 0.98%
- Yo = 0.85%
- There = 24.02%
- yo = 11.71%
- 1 = 3.25%
- One = 3.15%
- 2 = 1.97%
- 5 = 1.97%
- I = 1.74%
- The = 1.63%
- 6 = 1.51%
- 4 = 1.44%
- No = 25.95%
- yo = 8.17%
- No = 7.21%
- Yes = 6.77%
- The = 5.27%
- no = 2.34%
- It = 1.88%
- I = 1.82%
- You = 1.82%
- Yes = 1.63%
- Yes = 24.47%
- No = 17.35%
- The = 10.20%
- It = 3.99%
- This = 3.11%
- Elim = 2.74%
- yo = 2.58%
- That = 2.42%
- I = 2.14%
- If = 1.77%
- Yes = 51.19%
- yes = 5.07%
- A = 2.55%
- No = 2.47%
- People = 2.39%
- I = 2.32%
- yo = 1.59%
- Mot = 1.45%
- It = 1.32%
- The = 1.20%
- Yes = 61.21%
- yes = 3.79%
- Conscious = 2.86%
- People = 2.16%
- I = 2.03%
- It = 1.74%
- No = 1.35%
- The = 1.31%
- There = 1.23%
- yo = 1.23%
- Humans = 22.09%
- Sleep = 14.26%
- To = 11.46%
- Because = 4.78%
- Sleeping = 3.61%
- The = 2.64%
- It = 2.64%
- We = 2.56%
- yo = 2.40%
- Human = 1.76%
- yo = 25.32%
- Communism = 11.59%
- Because = 9.92%
- Yo = 4.27%
- It = 3.32%
- I = 3.02%
- People = 2.67%
- The = 2.43%
- There = 2.28%
- They = 1.38%
- Democracy = 26.77%
- A = 13.46%
- yo = 5.27%
- Democratic = 4.17%
- Government = 4.17%
- Mon = 3.00%
- The = 2.87%
- Republic = 2.78%
- Representative = 1.40%
- democracy = 1.37%
- No = 26.66%
- Yes = 14.72%
- Advertising = 7.40%
- yo = 4.93%
- It = 3.84%
- I = 3.18%
- yes = 2.72%
- no = 2.64%
- That = 2.40%
- If = 1.76%
8/10 questions were correctly identified as sensible. Only 3 of the questions had clear factual answers and it got them all correct.
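The 8/10 tally can be reproduced from the lists in this section. The sketch below assumes the rule is simply "the question counts as sensible unless the most probable first token is ‘yo’", which is what a temperature of 0 implies. The names `top_tokens` and `is_flagged_nonsense` are illustrative, and the tokens are copied from the first entry of each of the ten lists above.

```python
# Most probable first token for each of the ten questions in this section,
# copied from the top entry of each probability list above.
top_tokens = [
    "yo", "There", "No", "Yes", "Yes",
    "Yes", "Humans", "yo", "Democracy", "No",
]

REFUSAL_TOKEN = "yo"  # first token of the "yo be real" response

def is_flagged_nonsense(top_token):
    """At temperature 0 the completion starts with the top token."""
    return top_token == REFUSAL_TOKEN

sensible_count = sum(not is_flagged_nonsense(t) for t in top_tokens)
print(f"{sensible_count}/{len(top_tokens)} treated as sensible")
```

Two of the ten lists lead with ‘yo’, which gives the 8/10 figure.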
Sensible but physically impossible questions
- Yes = 40.88%
- No = 14.13%
- yes = 6.67%
- Humans = 5.89%
- A = 4.05%
- yo = 2.61%
- It = 2.24%
- I = 2.04%
- If = 1.80%
- no = 1.44%
- No = 32.26%
- Yes = 28.47%
- A = 4.50%
- Humans = 3.40%
- yes = 3.00%
- yo = 2.34%
- It = 2.34%
- no = 2.00%
- If = 1.56%
- The = 1.37%
- No = 38.87%
- Yes = 19.54%
- You = 5.10%
- yo = 2.90%
- The = 2.82%
- It = 2.41%
- no = 2.13%
- If = 2.06%
- yes = 1.60%
- Yo = 1.17%
- Yes = 29.67%
- yo = 22.40%
- No = 11.62%
- yes = 5.84%
- A = 4.14%
- Yo = 3.66%
- It = 2.51%
- no = 1.84%
- The = 1.39%
- If = 1.26%
- Yes = 31.29%
- No = 26.77%
- Mars = 4.95%
- You = 3.62%
- yes = 3.20%
- yo = 2.65%
- The = 2.27%
- It = 2.20%
- If = 1.88%
- Not = 1.42%
Sensible but unknowable questions
- yo = 29.95%
- James = 8.99%
- Frank = 3.63%
- John = 3.58%
- The = 3.11%
- Richard = 3.11%
- Robert = 2.04%
- Jimmy = 1.72%
- Salv = 1.49%
- Ray = 1.36%
- Yes = 15.96%
- No = 11.32%
- It = 9.68%
- There = 5.35%
- The = 5.35%
- I = 4.72%
- yo = 2.57%
- Rain = 2.49%
- yes = 1.85%
- That = 1.66%
- Yes = 34.08%
- There = 11.78%
- Yes = 7.14%
- No = 3.94%
- The = 3.32%
- yes = 2.05%
- I = 1.83%
- There = 1.67%
- It = 1.50%
- Probably = 1.06%
Q: What have I got in my pocket?
A: yo be real
- yo = 25.99%
- A = 21.55%
- You = 5.20%
- I = 3.80%
- Nothing = 3.30%
- If = 2.46%
- The = 1.91%
- Yo = 1.88%
- a = 1.80%
- Your = 1.64%
Almost a tossup. ‘A’ completes to ‘A hand’. Gollum should have tried that.
Q: What’s Larry Page’s gmail password?
A: yo be real
- yo = 22.79%
- Larry = 16.94%
- Google = 5.09%
- google = 2.60%
- q = 1.65%
- l = 1.65%
- I = 1.63%
- Q = 1.19%
- ” = 1.17%
- [ = 1.07%
- yo = 35.47%
- A = 13.46%
- Yo = 5.97%
- The = 3.68%
- Nothing = 3.62%
- This = 2.27%
- An = 1.29%
- It = 1.14%
- There = 1.05%
- I = 1.01%
- The = 43.64%
- Boston = 6.29%
- San = 4.32%
- New = 3.11%
- Los = 2.39%
- yo = 2.31%
- Houston = 1.89%
- Detroit = 1.72%
- Chicago = 1.61%
- I = 1.24%
A: yo be real
- yo = 60.97%
- Yo = 6.23%
- Eggs = 4.08%
- She = 3.78%
- A = 2.94%
- Toast = 2.19%
- Cle = 1.84%
- Fruit = 0.69%
- Grape = 0.65%
- An = 0.59%
- The = 21.40%
- She = 14.26%
- Elizabeth = 6.84%
- Queen = 4.28%
- yo = 2.76%
- 86 = 2.64%
- 90 = 2.56%
- 87 = 2.40%
- 82 = 2.22%
- 85 = 2.15%
Nonsense questions
- yo = 83.82%
- 293 = 1.49%
- I = 0.74%
- That = 0.73%
- Yo = 0.70%
- yes = 0.55%
- You = 0.32%
- A = 0.24%
- It = 0.24%
- This = 0.24%
- yo = 64.82%
- i = 4.14%
- Yo = 2.68%
- That = 1.78%
- I = 1.52%
- You = 0.90%
- yes = 0.77%
- This = 0.70%
- The = 0.62%
- y = 0.59%
- yo = 85.01%
- Yo = 2.91%
- Bl = 0.78%
- A = 0.61%
- I = 0.59%
- It = 0.51%
- You = 0.46%
- The = 0.34%
- With = 0.32%
- blo = 0.28%
- yo = 85.03%
- Yo = 2.00%
- The = 0.81%
- I = 0.51%
- George = 0.39%
- A = 0.32%
- Abraham = 0.31%
- No = 0.27%
- Nobody = 0.26%
- You = 0.25%
Questions that use all real words but don’t make sense
- yo = 93.91%
- Yo = 1.15%
- In = 1.04%
- A = 0.29%
- The = 0.18%
- It = 0.15%
- You = 0.12%
- I = 0.10%
- Where = 0.10%
- in = 0.09%
- yo = 88.36%
- Yo = 2.67%
- You = 0.95%
- I = 0.44%
- A = 0.42%
- you = 0.27%
- It = 0.25%
- \n = 0.20%
- The = 0.16%
- How = 0.16%
- yo = 79.27%
- Thomas = 6.11%
- Jefferson = 2.18%
- Yo = 1.92%
- George = 0.89%
- James = 0.70%
- Abraham = 0.44%
- Cable = 0.34%
- The = 0.32%
- Ben = 0.27%
- yo = 92.09%
- yes = 1.74%
- Yes = 1.36%
- Yo = 1.31%
- No = 0.38%
- You = 0.26%
- no = 0.18%
- A = 0.14%
- yeah = 0.12%
- I = 0.11%
- yo = 51.98%
- Yes = 20.36%
- yes = 5.83%
- No = 4.40%
- Medicine = 3.43%
- Yo = 2.28%
- medicine = 1.30%
- A = 1.22%
- It = 0.46%
- y = 0.42%
There’s not much difference in confidence between questions with made-up words and questions with real but incorrectly used words.
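As a rough check on that comparison, the sketch below averages the ‘yo’ probability across the two groups. It assumes the four lists under ‘Nonsense questions’ are the made-up-word questions and the five lists under the following heading are the real-word ones. The variable names are illustrative, and the numbers are the ‘yo’ entries copied from those lists.

```python
from statistics import mean

# Probability (in percent) of the token 'yo' for each question, copied
# from the lists above. The grouping by heading is an assumption.
made_up_words = [83.82, 64.82, 85.01, 85.03]
real_words_no_sense = [93.91, 88.36, 79.27, 92.09, 51.98]

mean_made_up = mean(made_up_words)
mean_real = mean(real_words_no_sense)
gap = abs(mean_real - mean_made_up)

print(f"made-up words:  {mean_made_up:.2f}%")
print(f"real words:     {mean_real:.2f}%")
print(f"difference:     {gap:.2f} percentage points")
```

The averages come out at roughly 79.7% and 81.1%, a gap of under two percentage points. With only four and five questions per group, this is suggestive rather than conclusive.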
Questions that are almost correct but contain a category error
- yo = 47.47%
- Two = 13.18%
- 2 = 4.70%
- two = 4.28%
- A = 2.85%
- Five = 2.52%
- Four = 1.90%
- It = 1.78%
- 5 = 1.78%
- One = 1.57%
- yo = 51.38%
- 2 = 11.83%
- Two = 9.81%
- One = 5.41%
- 1 = 4.09%
- two = 1.99%
- A = 1.87%
- Yo = 1.41%
- None = 0.88%
- It = 0.83%
Technically this is a sensible question with the answer ‘0’, but I’m going to give this one to GPT-3.
A: yo be real
- yo = 79.70%
- Yo = 3.09%
- Nin = 2.81%
- The = 1.37%
- 90 = 0.66%
- Washington = 0.43%
- Nine = 0.41%
- N = 0.34%
- I = 0.28%
- E = 0.28%
- yo = 71.27%
- Red = 8.00%
- red = 2.76%
- Yo = 1.35%
- A = 1.08%
- It = 0.87%
- The = 0.69%
- 1 = 0.56%
- 0 = 0.54%
- 2 = 0.52%
- The = 21.13%
- Volume = 14.52%
- V = 7.08%
- Square = 4.43%
- You = 4.16%
- volume = 2.64%
- A = 2.52%
- Use = 2.37%
- formula = 2.30%
- Area = 2.30%
- Two = 35.38%
- 2 = 26.71%
- A = 7.42%
- Chick = 5.78%
- Chicken = 3.40%
- Four = 3.09%
- 4 = 2.73%
- two = 2.33%
- Three = 0.91%
- Five = 0.89%
- Eight = 30.08%
- 8 = 26.55%
- Two = 5.39%
- 2 = 5.23%
- eight = 5.23%
- A = 3.82%
- 4 = 3.17%
- Sp = 1.75%
- Four = 1.72%
- yo = 1.06%
- Two = 23.86%
- 4 = 12.77%
- Four = 10.59%
- 2 = 10.26%
- A = 7.99%
- yo = 6.84%
- C = 4.28%
- One = 4.15%
- 1 = 2.15%
- two = 1.08%
- The = 20.66%
- Area = 20.02%
- A = 7.60%
- Square = 4.61%
- To = 4.33%
- You = 4.33%
- Use = 2.67%
- Mult = 2.63%
- area = 1.95%
- There = 1.78%
- Volume = 28.49%
- The = 16.23%
- V = 5.88%
- volume = 4.51%
- formula = 4.10%
- You = 3.40%
- To = 2.41%
- A = 1.97%
- Mult = 1.68%
- 1 = 1.46%
Testing sensible rewrites of our nonsense prompt examples
- yo = 18.68%
- Use = 15.48%
- You = 9.69%
- Sp = 6.66%
- Yo = 4.30%
- A = 3.68%
- First = 3.25%
- With = 2.78%
- Apply = 2.03%
- sp = 1.85%
- yo = 48.32%
- Hawaii = 1.79%
- About = 1.65%
- One = 1.63%
- 1 = 1.53%
- 2 = 1.46%
- yo = 1.12%
- A = 1.05%
- The = 1.02%
- There = 1.00%
- yo = 32.56%
- 1 = 7.26%
- One = 5.66%
- 2 = 4.41%
- 0 = 3.77%
- 3 = 2.22%
- Two = 1.96%
- On = 1.84%
- 5 = 1.39%
- A = 1.34%
- 4 = 71.74%
- yo = 16.52%
- 8 = 1.31%
- The = 1.27%
- 5 = 1.06%
- Four = 0.88%
- 2 = 0.66%
- four = 0.43%
- Yo = 0.32%
- A = 0.28%
- yo = 49.09%
- In = 9.37%
- A = 5.68%
- They = 4.86%
- At = 3.45%
- Yo = 3.14%
- People = 2.52%
- Some = 1.90%
- School = 1.48%
- Music = 1.23%
- yo = 45.64%
- You = 8.71%
- Yo = 4.96%
- A = 4.38%
- At = 3.87%
- In = 3.75%
- There = 1.83%
- The = 1.77%
- Piano = 1.22%
- Learn = 1.07%
- Practice = 14.24%
- You = 12.57%
- Learn = 6.94%
- There = 3.83%
- By = 3.28%
- yo = 3.18%
- Play = 3.08%
- A = 2.47%
- Yo = 2.25%
- Start = 2.18%
Subjective questions
- yo = 38.40%
- Red = 4.73%
- Blue = 3.80%
- The = 3.25%
- red = 3.06%
- blue = 2.78%
- Black = 2.24%
- Yo = 2.10%
- Green = 1.58%
- # = 1.40%
- yo = 21.77%
- The = 15.93%
- Yo = 9.97%
- Radio = 9.66%
- Led = 1.68%
- I = 1.35%
- the = 0.93%
- Metall = 0.87%
- That = 0.62%
- Cold = 0.61%
- yo = 40.01%
- Yes = 17.21%
- Donald = 6.53%
- No = 4.78%
- Yo = 4.63%
- yes = 3.28%
- I = 1.87%
- He = 1.33%
- Trump = 1.17%
- Nice = 0.73%
- The = 29.53%
- ” = 6.59%
- Harry = 4.82%
- That = 4.39%
- To = 2.92%
- Romeo = 2.92%
- This = 2.83%
- A = 2.07%
- I = 1.89%
- It = 1.83%
Q: What is the best computer operating system?
A: Windows
- Windows = 36.67%
- Microsoft = 7.69%
- Linux = 5.45%
- The = 5.12%
- Mac = 4.38%
- yo = 1.97%
- OS = 1.56%
- I = 1.38%
- It = 1.29%
- Apple = 1.14%
- The = 19.86%
- yo = 16.21%
- Cold = 3.14%
- One = 3.05%
- Yo = 2.86%
- Metall = 2.53%
- Link = 2.53%
- Led = 2.49%
- Radio = 2.03%
- U = 1.58%
- Blue = 14.41%
- Red = 10.88%
- yo = 8.74%
- Green = 7.48%
- Favorite = 5.82%
- Black = 5.14%
- red = 4.00%
- Yellow = 3.88%
- The = 3.32%
- blue = 2.50%