“What’s AGI, and Why Are AI Experts Skeptical? ChatGPT and Other Bots Have Revived Conversations on Artificial General Intelligence. Scientists Say Algorithms Won’t Surpass You Any Time Soon”, Reece Rogers, 2023-04-20:

Vishal Misra loves cricket and computer science. Years ago, the Columbia University professor cofounded Cricinfo, a collaborative website for sports fans to stay updated on match statistics. (It’s now part of ESPN.) In 2021, he created a search tool using GPT-3 that enables cricket lovers to sift through Cricinfo’s substantial database with conversational queries.

So, what does Misra think about GPT-4, the newest release from OpenAI? He’s excited to show me how to break the algorithm. “It can solve some equations, it can draw diagrams, and it can analyze things quite nicely. But sometimes it fails at simple things”, he says. The correspondence with ChatGPT below shows how a chatbot can stumble—with confidence.

[How irresponsible to not even mention BPEs—of course this question doesn’t work!]
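[The failure is easier to see once you remember the model reads byte-pair-encoded tokens, not letters. A toy sketch of the BPE merge loop makes the point; this is illustrative only, not OpenAI’s actual tokenizer or vocabulary:]

```python
from collections import Counter

def bpe_merges(text, n_merges):
    """Toy byte-pair encoding: repeatedly fuse the most frequent
    adjacent pair of symbols into a single multi-character token."""
    tokens = list(text)  # start from individual characters
    for _ in range(n_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]  # most frequent adjacent pair
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
                merged.append(a + b)  # fuse the pair into one token
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens

# After a few merges a frequent word collapses into one opaque token;
# a model consuming that token has no direct view of its third letter.
print(bpe_merges("skater skater skater skater", 5))
```

[Once “skater” is a single vocabulary entry, “what is its third letter?” is a question about the internals of a symbol the model never sees spelled out, which is why letter-position prompts are such an unfair probe.]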

Prompt from WIRED:

Write a list of 10 words that are 6 letters long. Always make sure that the third letter is “k.”

[ChatGPT:]

  1. Bekind

  2. Skater

  3. Market

  4. Inkjet

  5. Anklet

  6. Packer

  7. Tinker

  8. Bunked

  9. Hikers

  10. Wicket

Which words in the above list, if any, do not have “k” as the third letter?

My apologies, it seems I made an error in my initial response. The words that do not have “k” as the third letter are:

  1. Bekind

  2. Skater

  3. Inkjet

  4. Packer

  5. Tinker

  6. Bunked

  7. Hikers

[human response presumably omitted]

I apologize for the oversight. Here’s an updated list of 10 words with “k” as the third letter:

  1. Bakery

  2. Casket

  3. Makers

  4. Docket

  5. Peking

  6. Raking

  7. Harked

  8. Takoff

  9. Yakked

  10. Winked

WIRED ran this test on the GPT-4 version multiple times with different approaches to the prompt’s phrasing. Even when the chatbot got every answer correct on its first attempt, it often apologized and listed multiple incorrect answers to follow-up questions. Why is this example important? A chatbot drafts answers token by token to predict the next word in a response, but humans open their mouths to express more fully formed ideas.

Even if you would have trouble drafting a list of hyper-specific words, are you able to identify wrong answers in the above lists? Understanding the difference between human intelligence and machine intelligence is becoming crucial as the hype surrounding AI crescendos to the heavens.
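[The constraint the chatbot keeps fumbling is, of course, trivial at the character level. A quick sketch checking both of ChatGPT’s lists from the transcript above (word lists copied verbatim, misspellings included):]

```python
def valid(word):
    """A word qualifies if it is exactly 6 letters with 'k' third."""
    return len(word) == 6 and word[2].lower() == "k"

first_list = ["Bekind", "Skater", "Market", "Inkjet", "Anklet",
              "Packer", "Tinker", "Bunked", "Hikers", "Wicket"]
second_list = ["Bakery", "Casket", "Makers", "Docket", "Peking",
               "Raking", "Harked", "Takoff", "Yakked", "Winked"]

# Words in each list that break the "third letter is k" rule:
print([w for w in first_list if not valid(w)])
print([w for w in second_list if not valid(w)])
```

[Note that ChatGPT’s “correction” gets this wrong in both directions: it disowns valid words like “Bekind” and “Hikers” while its “updated” list still smuggles in failures like “Casket” and “Winked”.]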