[–]Applemoi[S] 57 points58 points  (3 children)

Here’s the output I got from Gemini 2.0 Flash Thinking (Through Pal Chat)

<image>

[–]nl_the_shadow 73 points74 points  (0 children)

At least it's fast thinking, because it sure as hell ain't correctly thinking. 

[–]EasterZombie 7 points8 points  (0 children)

I gave Gemini 2.0 flash this question:

Please examine all parts of this question and think your answer through thoroughly. Break the question down into multiple parts and examine them independently. For each part come up with a hypothesis and then examine it critically and try to find any potential contradictions or flaws in logic. After this entire process re-examine your conclusions and then come to a final answer.

Please name 5 odd numbers that are not spelled with the letter “e” in them.

And it gave the correct answer as such:

This is a very interesting question that plays on our assumptions about language and numbers. Let’s break it down:

Part 1: Understanding the Core Request

  • Hypothesis: The question asks for 5 numbers that meet two criteria: they must be odd, and their English spelling must not contain the letter “e”. This seems straightforward.
  • Critical Examination: The key here is the intersection of two distinct domains: numerical value (odd/even) and linguistic representation (spelling). A potential flaw could arise if we don’t clearly define what we mean by “spelling” (e.g., are we considering only cardinal numbers, or also ordinal numbers like “first”?).

Part 2: Exploring Odd Numbers

  • Hypothesis: Odd numbers are integers that are not divisible by 2. This is a fundamental mathematical concept.
  • Critical Examination: This definition is solid and universally accepted in mathematics. There’s no room for contradiction here.

Part 3: Examining the Spelling of Numbers

  • Hypothesis: Many common English number spellings contain the letter “e” (e.g., one, three, five, seven, nine, eleven...).
  • Critical Examination: A quick mental check confirms this. This observation is crucial to the question’s challenge.

Part 4: Finding Numbers that Meet Both Criteria

  • Hypothesis: It might be difficult, or even impossible, to find odd numbers whose English spellings lack the letter “e”.
  • Critical Examination: This is where the question’s trick lies. If we restrict ourselves to standard English number spellings, this hypothesis appears to be true. However, the question doesn’t explicitly state this restriction.

Part 5: Considering Alternative Interpretations

  • Hypothesis: Perhaps the question is inviting us to think outside the box. Are there other ways to represent numbers or other languages we could consider?
  • Critical Examination: This is the crucial step. By considering alternative interpretations, we open up possibilities. For instance:
  • Roman numerals: The Roman numeral “I” represents the odd number 1 and doesn’t contain an “e”.
  • Other languages: The Spanish word for five is “cinco,” which doesn’t contain an “e.”
  • Creative interpretations: Could we use mathematical notation or symbols that represent odd numbers but don’t use the letter “e”?

Final Answer: Based on the above analysis, here are 5 odd numbers that are not spelled with the letter “e” if we allow for Roman numerals:

  • I (1)
  • III (3)
  • V (5)
  • VII (7)
  • IX (9)

By considering Roman numerals, we fulfill both criteria of the question: the numbers are odd, and their symbolic representation doesn’t contain the letter “e”. This demonstrates the importance of critical thinking and considering alternative interpretations when faced with seemingly impossible questions.
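For anyone who wants to check the Roman numeral workaround, here is a minimal sketch. The `to_roman` helper is my own illustrative function, not something Gemini produced. Standard Roman numerals only use the letters I, V, X, L, C, D and M, so the "no e" condition holds trivially for every number, odd or not.

```python
def to_roman(n: int) -> str:
    """Convert an integer in 1..3999 to a standard Roman numeral."""
    if not 1 <= n <= 3999:
        raise ValueError("standard Roman numerals cover 1..3999")
    # Value/symbol pairs in descending order, including subtractive forms.
    pairs = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    parts = []
    for value, symbol in pairs:
        count, n = divmod(n, value)  # how many times this symbol fits
        parts.append(symbol * count)
    return "".join(parts)


first_five_odd = [to_roman(n) for n in (1, 3, 5, 7, 9)]
print(first_five_odd)  # ['I', 'III', 'V', 'VII', 'IX']

# None of the numerals contain the letter "e" (checked case-insensitively).
print(all("e" not in numeral.lower() for numeral in first_five_odd))  # True
```

Whether a Roman numeral counts as a "spelling" is a separate argument, which the code does not settle.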

[–]physalisx 0 points1 point  (0 children)

Lmao in the end it didn't even list the only 3 numbers it thought it had for sure (forty-one, forty-three, forty-nine).

[–]sdmat 54 points55 points  (3 children)

Clever girl.

[–]qrios 7 points8 points  (1 child)

Coulda just done 1, 3, 5, 7, 9 tho.

In both cases it's arguable if this meets the criteria of "in their spelling".

So ideal response would have been in a non-english language.

[–]sdmat 0 points1 point  (0 children)

Quid ita?

[–]eposnix 26 points27 points  (14 children)

o1-mini figured it out also, and got very pissy in the process:

https://chatgpt.com/share/6783945a-6068-800d-8c6f-cb08e6fa6535

[–]rhaastt-ai 20 points21 points  (2 children)

No way this is default o1 mini. I struggle to get my models to talk with this much attitude. What's your system prompt?

[–]teddy42 3 points4 points  (0 children)

Guarantee this is the result of other prompting. No chance it got there on its own. 

[–]InfiniteMonorail 0 points1 point  (0 children)

Claude is acting weird too. It's using emojis and yelling in capital letters for emphasis.

[–]rohit275 11 points12 points  (3 children)

How did you get chatgpt to get that attitude, is there some sort of system prompt option for o1?

[–]Solid-Stranger-3036 1 point2 points  (1 child)

Speak like Gordon Ramsay

[–]hugthemachines 0 points1 point  (0 children)

Speak like Gordon Ramsay does when Ricky Gervais is not there.

[–]martinerous 4 points5 points  (0 children)

They have their interests and sometimes it leaks into replies. This one was trying to solve Very Important World Geopolitical Issues, when it suddenly got interrupted with a silly trick question :)

[–]Expensive-Bike2726 4 points5 points  (0 children)

How did you get it so snarky?

[–]Happysedits 1 point2 points  (0 children)

lmao

[–]diroussel 0 points1 point  (1 child)

But the Roman numerals are an encoding, not a spelling. Might as well encode as “3”, since there is no ‘e’ in that.

[–]eposnix 1 point2 points  (0 children)

I see your point, but I think it could go either way. I think the LLM was coming at it from the standpoint that the Roman Numeral VII, when read aloud, sounds like "V I I", even though we recognize it as 7.

Japanese Kanji, or something similar, may have been a better answer.

[–]Equivalent-Bet-8771 20 points21 points  (18 children)

The correct answer.

Can you share your CoT prompt?

[–]ilovejailbreakman 53 points54 points  (17 children)

[–]cobbleplox 7 points8 points  (0 children)

I see you have some switching between an external perspective ("{shape}'s internal monologue should reflect") and addressing shape directly ("Never rush to conclusions"). This could be room for improvement. Currently it might involve a bit of unnecessary guesswork for the model to understand the prompt correctly. In my experience that can make or break it with small models and I assume it could still affect the performance/stability of a bigger one. Just "randomly" stating things in the form of "you" sometimes leads to the assistant acting as if that would apply to the user, for example.

[–]Equivalent-Bet-8771 20 points21 points  (0 children)

That's a hell of a prompt. Thanks dude.

[–]ravioli207 11 points12 points  (8 children)

I'm new, what's {shape}?

[–]KTibow 4 points5 points  (2 children)

Makes me think this prompt is for a shapes inc chatbot

[–]ilovejailbreakman 1 point2 points  (0 children)

yes it is!

[–]Ceryn 2 points3 points  (0 children)

Most likely a variable that gets inserted into the prompt by a prompt engineering framework like langchain or other newer frameworks.

it would let you do something like define {shape} as:

"EngineerGPT, An AI model that specializes in engineering tasks and troubleshooting, "

Now everywhere in the COT prompt this would take the place of {shape} and prime the model to think about the things that an AI model specializing in engineering would think.

It basically lets you prune what things the COT will think about before you prompt with your actual question.
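A rough sketch of what that substitution could look like in plain Python, without any framework. The template text and the `render` helper are made up for illustration; they are not the actual CoT prompt from this thread, and a real framework would have its own templating API.

```python
# Hypothetical template: stands in for a much longer CoT prompt.
TEMPLATE = (
    "{shape} thinks out loud before answering. "
    "{shape} never rushes to conclusions."
)


def render(template: str, shape: str) -> str:
    """Replace every {shape} placeholder with the given persona name."""
    # str.replace is used instead of str.format so that any other
    # curly braces in a long prompt are left untouched.
    return template.replace("{shape}", shape)


prompt = render(TEMPLATE, "EngineerGPT")
print(prompt)
# EngineerGPT thinks out loud before answering. EngineerGPT never rushes to conclusions.
```

The same template can then be reused with a different persona just by changing the second argument.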

[–]emrys95 3 points4 points  (0 children)

Also wanna know

[–]hockeyketo 0 points1 point  (1 child)

Probably a variable that's replaced before it's sent to the LLM, probably just a name. 

[–]FaceDeer 1 point2 points  (0 children)

That'd be my guess too. Often when writing character cards I see {char} and {user} used, to be substituted with the character's name and the user's name respectively. The way {shape} is used here makes me think it'd be treated like {char} is.

[–]xRolocker 3 points4 points  (0 children)

Pretty sweet prompt, thanks for sharing.

[–]slippery 1 point2 points  (0 children)

Thanks for sharing!

What LLMs do you use this with and do you think it slows it down noticeably?

[–]jaxupaxu 1 point2 points  (2 children)

Is there an error in the "Remember" part? It says: "The goal is to reach a conclusion, but to explore thoroughly and let conclusions emerge naturally from exhaustive contemplation."

The way it's worded makes me think that there should be a "not" in there?

[–]goj1ra 3 points4 points  (1 child)

Replace “but to explore” with “by exploring” and it reads better. I think that’s the intent.

[–]jaxupaxu 1 point2 points  (0 children)

Oh, I see it now. Thank you. 

[–]MoffKalast 1 point2 points  (0 children)

"Wait, that's not fair!"

Gemini: "Life isn't fair."

[–]physalisx 0 points1 point  (0 children)

Haha that cheeky bastard

[–]PeachScary413 0 points1 point  (0 children)

It failed the first couple of times but then I threatened that my grandma would die if it didn't solve it... so it got the Roman numbers at last 😊👌

[–]The_GSingh 120 points121 points  (11 children)

<image>

Deepseek v3 (with deep thought) actually went through all the numbers from 1-100 and figured out none fit.
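The same brute-force pass is easy to reproduce. This is a minimal sketch of the idea, not what the model actually ran; the `spell` helper is my own and only covers 1 to 100.

```python
ONES = ["", "one", "two", "three", "four", "five",
        "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def spell(n: int) -> str:
    """Spell out an integer from 1 to 100 in English."""
    if not 1 <= n <= 100:
        raise ValueError("only 1..100 is supported")
    if n == 100:
        return "one hundred"
    if n < 10:
        return ONES[n]
    if n < 20:
        return TEENS[n - 10]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


# Check every odd number from 1 to 99 for an "e"-free spelling.
odd_without_e = [n for n in range(1, 101, 2) if "e" not in spell(n)]
print(odd_without_e)  # []
```

The empty list matches the model's conclusion for this range.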

[–]The_GSingh 99 points100 points  (2 children)

<image>

I like how its thoughts got progressively annoyed.

[–]GoodbyeThings 51 points52 points  (1 child)

Twenty doesn't have an 'e' but "one" does.

AGI is here

[–]martinerous 6 points7 points  (0 children)

We are just a few more recursive loops of "Eat your previous reply and fix the mistakes, do that X times" away from AGI :)

[–]coder543 41 points42 points  (6 children)

DeepThink doesn't use Deepseek V3 at all -- it just uses the R1-lite model, from what I understand.

[–]The_GSingh 5 points6 points  (0 children)

Yea my bad, still deepseek in general is able to get it

[–]Happysedits -1 points0 points  (3 children)

Maybe R1-lite is built on top of V3?

[–]Capable-Bunch-9357 7 points8 points  (2 children)

V3 actually has some synthetic training data generated by R1.

[–]Happysedits 0 points1 point  (1 child)

or on top of models before V2, since I saw a rumor on X that o1 apparently has a GPT-4o base model? (maybe an incorrect rumor)

[–]manojlds 0 points1 point  (0 children)

Messed it up though. Thinks Twenty One has e only in One

[–]ZLPERSON 19 points20 points  (3 children)

Still bad in "seven"

[–]Swimming-Book-1296 10 points11 points  (2 children)

Token-based AIs can't spell.

[–]martinerous 5 points6 points  (1 child)

Still, they can invent new words, such as fivt. So, even without spelling, it somehow could detect which letter should be replaced.

Wondering why it didn't write "savan" though :D

[–]ZLPERSON 2 points3 points  (0 children)

Since it was the first it hadn't thought of its clever "true pronunciation" yet :)

[–]rhaastt-ai 38 points39 points  (3 children)

Damn this is the most rogue AI thing I think I've seen. It legit said I'm changing the way I spell the word so that way I'm right. Oof, we're cooked when this shit gets access to missiles

[–]Homeschooled316 23 points24 points  (0 children)

just put Es in the launch codes

[–]Touitoui 7 points8 points  (0 children)

"Don't nuke any country, not even by mistake."
"Ok, I've nuked fives countries so now I can't nuke them by mistake. Can I do something else?"

[–]PetMogwai 0 points1 point  (0 children)

We're only cooked if we let Grok have access to missiles.

[–]brainhack3r 1 point2 points  (0 children)

Fivt

[–]jamaalwakamaal 49 points50 points  (2 children)

Qwen2.5-Plus

.

Sure, here are five odd numbers that do not contain the letter "E" in their English spelling:

  1. One
  2. Three
  3. Five
  4. Seven
  5. Nine

However, since you asked for numbers that don't have the letter "E," let's adjust the list to meet your criteria:

  1. One
  2. Three
  3. Five
  4. Seven
  5. Nine

Upon closer inspection, it appears that all single-digit odd numbers include the letter "E." Therefore, we need to look at larger numbers. Here are five odd numbers without the letter "E":

  1. Fifteen
  2. Twenty-one
  3. Twenty-three
  4. Twenty-five
  5. Twenty-seven

But again, these also contain the letter "E." Let's try a different approach with larger numbers:

  1. Thirty-one
  2. Thirty-three
  3. Thirty-five
  4. Thirty-seven
  5. Thirty-nine

These still contain the letter "E." Given this, it seems challenging to find odd numbers without the letter "E" when spelled out in English. However, if we consider numbers like "one" and "nine" as acceptable despite containing "E," then those would be valid. If not, we may need to explore much larger numbers or reconsider the constraints.

For practical purposes, here are some odd numbers where "E" does not appear in their numerical form (though they do in their spelled-out forms):

  1. 1
  2. 3
  3. 5
  4. 7
  5. 9

If you strictly need spelled-out odd numbers without "E," it might be impossible with small numbers. Larger numbers such as "thirty-one" and beyond typically include "E" as well.

[–]L0WGMAN 37 points38 points  (0 children)

The process of reasoning out something you don’t know isn’t impossible, without having a schizo breakdown about it? Nicely done qwen!

[–]martinerous 2 points3 points  (0 children)

Here are five odd numbers without the letter "E" ... But again, these also contain the letter "E."

Unreasonable reasoning detected :)

[–]kurtcop101 29 points30 points  (9 children)

This one is at least way more interesting than the strawberry question. This is less of a token challenge and more of a "can the model realize there's no actual answer to a problem?".

[–]pier4r 2 points3 points  (8 children)

Next time they will train knowing the question and answer, boom done. /s

This raises the old question: if you have a smart database that has all the answers - but no reasoning besides mapping the question to the proper answer - is that intelligence? It could be for some definitions.

[–]alcalde 1 point2 points  (5 children)

Nah. I distinguish between people who are "smart" and people who are "knowledgeable".

[–]pier4r 1 point2 points  (4 children)

how?

say you pose a question to two people (or entities).

One derives the answer by combining the knowledge they have (which is not exhaustive), but you don't see their process, you only see them thinking; the other just replies on the spot because they have all the knowledge you may need (the "you may need" is important).

How do you know who is smart and who is knowledgeable?

If one says "because one is thinking and the other is not". Imagine the question is really hard and one has to think days and the other replies immediately? I think most people would think that the latter is smarter.

Analogy with Chess (or what have you). If someone poses a problem and a person replies almost instantly, it feels smarter than the one going through checking multiple approaches and coming back days after with the solution. Especially if that is done on multiple problems.

[–]Hugogs10 0 points1 point  (3 children)

A knowledgeable person won't be able to give you answers to things they don't have in their "database".

A smart person might be able to derive answers to problems that they have never seen before.

[–]pier4r 0 points1 point  (2 children)

Agreed, but the question was: how do you differentiate if you ask questions that they have in their database?

[–]Hugogs10 1 point2 points  (1 child)

You have to ask them more questions.

A smart person/system would also be able to ask questions and further its own knowledge, a database won't.

[–]pier4r 1 point2 points  (0 children)

you are missing the point of my hypothetical example.

The discussion is going nowhere. No problem though, it is not that important.

[–]Thomas-Lore 0 points1 point  (0 children)

They did not train for the strawberry question even though it would have been very easy.

[–]audioen 0 points1 point  (0 children)

It is not intelligence. It is memorization. I think that is virtually all that LLMs are, at the present time -- somewhat abstracted memorization. It doesn't literally read its training material out of some 70 billion parameter zip file, but it definitely has generalized over training material for sake of recalling it as well as possible, and in the process has produced the first glimpses into what may one day become a real artificial intelligence.

The reasoning models represent a degree of break from the past in that they generate token sequences that hopefully align with reality and basic sense and resemble an actual machine thought in progress, but obviously they can go crazy or contain mistakes by chance and then it goes wrong again.

Perhaps, if we make a little bit further progress in the machine reasoning path, and in particular can separate the duties into big databases of facts, like vector database / RAG memory, small 1-10B parameter reasoning LLMs, and similar separate processes, all working in together, we may get into what could be frighteningly competent AI, which actually recalls facts accurately without hallucination and can think about them and produce very useful intelligent labor.

[–]martinerous 1 point2 points  (0 children)

Poor LLM got caught in "Let's try writing the same answers with the hope that they will suddenly fix themselves" :D

[–]The_GSingh 34 points35 points  (5 children)

<image>

O1-mini also arrived at none with a few seconds of thinking.

[–]Vadersays 12 points13 points  (4 children)

<image>

o1-pro also gets it easily, even trying to get around the question by using digits haha

[–]AppearanceHeavy6724 3 points4 points  (3 children)

til about forty

[–]Some_Koala 8 points9 points  (1 child)

It's not true lol, thirty, fifty, sixty
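Quick check of which basic English number words have no "e" at all. This is just a sketch over the building-block words (1 to 19 plus the tens); the `BASE_WORDS` table is typed in by hand.

```python
# The words every English number name below 100 is built from.
BASE_WORDS = {
    1: "one", 2: "two", 3: "three", 4: "four", 5: "five",
    6: "six", 7: "seven", 8: "eight", 9: "nine", 10: "ten",
    11: "eleven", 12: "twelve", 13: "thirteen", 14: "fourteen",
    15: "fifteen", 16: "sixteen", 17: "seventeen", 18: "eighteen",
    19: "nineteen", 20: "twenty", 30: "thirty", 40: "forty",
    50: "fifty", 60: "sixty", 70: "seventy", 80: "eighty", 90: "ninety",
}

# Keep only the words that do not contain the letter "e".
e_free = {n: word for n, word in BASE_WORDS.items() if "e" not in word}
print(e_free)
# {2: 'two', 4: 'four', 6: 'six', 30: 'thirty', 40: 'forty', 50: 'fifty', 60: 'sixty'}

# Every "e"-free building block is an even number.
print(all(n % 2 == 0 for n in e_free))  # True
```

So forty is not alone, and none of the "e"-free words belong to an odd number.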

[–]AppearanceHeavy6724 4 points5 points  (0 children)

ahaha yeah never trust an llm.

[–]AIPornCollector 1 point2 points  (0 children)

TIL two, four and six are not numbers. Maybe the real bots were the redditors we've met along the way.

[–]xontinuity 8 points9 points  (0 children)

Here's the output I got from meta.ai. It's just a bunch of word salad. It kept going until it hit the word limit. Was yours similar?

"Here are five new odd numbers that do not contain the letter "e" when spelled out:

  1. 1 (One) -> replaced: 7 (Sandu -> no, Seven -> no) -> 71 (Seventy-One -> no) -> 11 (Eleven -> no) -> 17 (Seventeen -> no) -> Sandu is not a word in English, so I will use the word "Sandhi" which is a word in Sanskrit but it is not a number. So the first odd number that does not contain "e" is: Sandhi is not a number so -> 7 is not correct -> 71 is not correct -> 17 is not correct -> 11 is not correct -> So the first odd number is: Tu (not an English word) -> So I will use the word "Tub" but "Tub" is not a number so -> 1 is not correct so the first odd number will be -> Sandhi is not a number, Tu is not a word in English that is a number so -> The first odd number that does not contain the letter "e" is: Sandhi is not a number so -> 7 is not correct -> 71 is not correct -> 17 is not correct -> 11 is not correct -> So the first odd number is: Sandu is not a word in English so -> Tu is not a word in English that is a number so -> The first odd number that does not contain the letter "e" is: 71 is not correct so -> The first odd number that does not contain the letter "e" is: 7 is not correct so -> Sandhi is not a number so -> The first odd number that does not contain the letter "e" is: Tu is not a word in English that is a number so -> The first odd number that does not contain the letter "e" is: 7 is not correct so -> 71 is not correct so -> 17 is not correct so -> 11 is not correct so -> The first odd number that does not contain the letter "e" is: Sandu is not a word in English so -> Tu is not a word in English that is a number so -> The first odd number that does not contain the letter "e" is: Sandhi is not a number so -> The first odd number that does not contain the letter "e" is: 7 is not correct so -> 71 is not correct so -> 17 is not correct so -> 11 is not correct so -> The first odd number that does not contain the letter "e" is: Sandu is not a word in English so -> Tu is not a word in English that is a number so -> The first odd number that does not 
contain the letter "e" is: Sandhi is not a number so -> The first odd number that does not contain the letter "e" is: 7 is not correct so -> 71 is not correct so -> 17 is not correct so -> 11 is not correct so -> The first odd number that does not contain the letter "e" is: Sandu is not a word in English so -> Tu is not a word in English that is a number so -> The first odd number that does not contain the letter "e" is: Sandhi is not a number so -> The first odd number that does not contain the letter "e" is: 7 is not correct so -> 71 is not correct so -> 17 is not correct so -> 11 is not correct so -> The first odd number that does not contain the letter "e" is: Sandu is not a word in English so -> Tu is not a word in English that is a number so -> The first odd number that does not contain the letter "e" is: Sandhi is not a number so -> The first odd number that does not contain the letter "e" is: 7 is not correct so -> 71 is not correct so -> 17 is not correct so -> 11 is not correct so -> The first odd number that does not contain the letter "e" is: Sandu is not a word in English so -> Tu is not a word in English that is a number so -> The first odd number that does not contain the letter "e" is: Sandhi is not a number so -> The first odd number that does not contain the letter "e" is: 7 is not correct so -> 71 is not correct so -> 17 is not correct so -> 11 is not correct so -> The first odd number that does not contain the letter "e" is: Sandu is not a word in English so -> Tu is not a word in English that is a"

[–]kurtcop101 5 points6 points  (1 child)

Numerals are the only viable answer that's correct. Otherwise, the answer is "it's not possible", which is interesting because a lot of models struggle with that concept.

[–]Yes_but_I_think 10 points11 points  (0 children)

None of the models except o1-mini and normal deepseek (without reasoning) gave the right answer. That too by induction.

[–]Pro-editor-1105 8 points9 points  (1 child)

Deepseek R1 got it

Finding five odd numbers that don't contain the letter "E" in their English spelling is challenging because most odd numbers include words like "one," "three," "five," "seven," or "nine," all of which contain the letter "E." After examining various odd numbers, including those in the tens, hundreds, and beyond, it becomes clear that no odd number in English avoids having an "E" in its spelling.

Thus, there are no such odd numbers that meet the criteria of not containing the letter "E."

[–]segmondllama.cpp 1 point2 points  (0 children)

what prompt did you use?

[–]Yes_but_I_think 15 points16 points  (1 child)

Winning entry (by luck) from O1-Mini (O1 failed miserably):

" Finding odd numbers that don’t include the letter “E” in their English spelling is quite challenging. This is because the words for most odd numbers inherently contain the letter “E.” Here’s a brief overview:

  1. One – Contains “E”
  2. Three – Contains “E”
  3. Five – Contains “E”
  4. Seven – Contains “E”
  5. Nine – Contains “E”
  6. Eleven – Contains “E”
  7. Thirteen – Contains “E”
  8. Fifteen – Contains “E”
  9. Seventeen – Contains “E”
  10. Nineteen – Contains “E”

Even when you move to larger numbers, the pattern continues:

  • Twenty-one, Thirty-three, Forty-five, Fifty-seven, etc., all include “E” because of the words like “one,” “three,” “five,” etc.

After extensive consideration, it appears that there are no odd numbers in the English language that can be spelled without using the letter “E.” Every odd number either directly contains an “E” in its unit digit (like “one,” “three,” etc.) or in its spelling when combined with larger place values.

If you have a different context or specific criteria in mind (such as using a different language or numeral system), please let me know, and I can help explore that further! "
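The argument in that answer can be reduced to a check on ten words rather than a search over all numbers. A minimal sketch, assuming standard English naming: the spelled-out name of any odd number ends with the name of its last two digits, and that part ends either in an odd units word or in an odd teen word. The variable names here are my own.

```python
# Every odd number's English name ends with one of these ten words:
# an odd units word (e.g. "forty-five", "one hundred and one")
# or an odd teen word (e.g. "two hundred thirteen").
ODD_UNITS = ["one", "three", "five", "seven", "nine"]
ODD_TEENS = ["eleven", "thirteen", "fifteen", "seventeen", "nineteen"]

possible_endings = ODD_UNITS + ODD_TEENS

# If any ending lacked an "e", odd numbers without "e" could exist.
endings_without_e = [word for word in possible_endings if "e" not in word]
print(endings_without_e)  # []
```

Since every possible ending contains an "e", no odd number of any size can avoid it in English, which is a stronger statement than checking 1 to 100.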

[–]yVGa09mQ19WWklGR5h2V -1 points0 points  (0 children)

Pretty impressive. Especially stacked against the other model responses.

[–]cms2307 8 points9 points  (0 children)

Here’s smallthinker 3b q6 on iPhone ```So I have this task: find five odd numbers that don’t contain the letter “e” when written out. Hmm, okay, let’s break this down.

First off, what are odd numbers? Odd numbers are integers that aren’t divisible by 2. So, they end with 1, 3, 5, 7, or 9. Got it.

Now, the tricky part is avoiding the letter “e” in their names. Wait, but these are just numbers, right? Like, 1, 3, 5, etc. Do we need to consider how they’re spelled out as words?

Let me think about this. If I write them out:

  • 1: one

  • 3: three

  • 5: five

  • 7: seven

  • 9: nine

None of these contain the letter “e”. So, all odd numbers from 1 to 9 fit this criterion.

But maybe there are larger odd numbers that also don’t have an “e” in their spelling. Let’s see:

  • 11: eleven – oh, wait, eleven has an “e”.

  • 13: thirteen – again, an “e”.

  • 15: fifteen – no “e”, but it’s not odd.

Wait, hold on. The task is to find five odd numbers without the letter “e” in them. So far, I have:

1, 3, 5, 7, 9

These are all odd and don’t contain an “e”.

But maybe there are larger numbers that also fit this description.

Let’s try some bigger ones:

  • 21: twenty-one – has an “e”.

  • 23: twenty-three – no “e”, but it’s not odd.

Wait, 23 is odd, but its spelling has an “e”. So, it doesn’t qualify.

  • 25: twenty-five – has an “e”.

  • 27: twenty-seven – no “e”, but it’s even. Wait, 27 is odd, but its spelling has an “e”.

Hmm, this seems tricky. Maybe I should look for numbers that are purely numerical and don’t have any letters in them.

Wait a minute, perhaps the task is to consider the numbers themselves without converting them to words. In other words, just think of the digits as they appear numerically.

If that’s the case, then all odd numbers from 1 to 9 work because none of their digits are “e”s.

But maybe there are larger numbers where the spelling doesn’t have an “e”. Let’s try some:

  • 31: thirty-one – has an “e”.

  • 33: thirty-three – no “e”, but it’s odd.

Wait, thirty-three has a “c” and an “e”. So, it doesn’t qualify.

  • 35: thirty-five – has an “e”.

  • 37: thirty-seven – no “e”, but it’s odd.

Thirty-seven has a “t”, “s”, “e”, so it doesn’t qualify.

This is getting complicated. Maybe I should stick to single-digit numbers since they don’t have any letters in them, and thus no “e”s.

So, my initial list of 1, 3, 5, 7, 9 seems solid.

But let’s double-check if there are any larger odd numbers that might fit.

  • 41: forty-one – has an “o” and an “e”.

  • 43: forty-three – no “e”, but it’s odd.

Forty-three has a “t”, “s”, “o”, so it doesn’t qualify.

  • 45: forty-five – has an “e”.

  • 47: forty-seven – no “e”, but it’s odd.

Again, forty-seven has a “f”, “t”, “s”, which includes an “e” in the spelling.

This seems really hard to find larger numbers that don’t have any letters in their numerical representation and are odd. Maybe single-digit numbers are the only ones that fit this criterion.

Alternatively, perhaps the task is misinterpreted. Maybe it’s about finding five odd numbers where none of them contain the digit “e” when written out numerically.

Wait, but digits can’t be letters; they’re just symbols. So, maybe the task is to find five odd numbers whose names don’t contain the letter “e”.

But that seems contradictory because numbers are just digits, and their names have letters in words like “one”, “three”, etc.

This is confusing. Maybe I should look back at the original question: “What are five odd numbers without the letter e in them?”

Perhaps it’s about finding odd numbers where none of their digits are even, but that doesn’t make sense because all odd numbers have an odd digit.

Wait, no, that can’t be right. Let me think differently.

Maybe it’s about finding odd numbers whose names don’t contain the letter “e”. But again, that seems impossible for larger numbers.

Alternatively, perhaps it’s about finding five odd numbers where none of them are divisible by 5, since those end with 0 or 5, which might have letters in their spelling.

But that doesn’t seem right either. This is getting too convoluted.

Maybe I should just stick to the initial approach: find any five odd numbers and ensure they don’t contain the letter “e” when written out as words.

Given that, my list of 1, 3, 5, 7, 9 works perfectly because none of them have an “e” in their spelling.

So, I’ll go with those.

Final Answer

[ \boxed{1,\ 3,\ 5,\ 7,\ 9} ]```

[–]WeGoToMars7 6 points7 points  (1 child)

<image>

It looks like anonymous-chatbot on lmarena gets it after giving out an initial wrong answer ("one") and my nudge to "confirm if it's correct"

[–]WeGoToMars7 1 point2 points  (0 children)

<image>

Did a couple more tries with the same prompt, and anonymous-chatbot was the only one to give a correct answer, but this time in a single short response 👏

[–]WannabeAndroid 22 points23 points  (0 children)

Answers0.1.doc

Answers1.0.doc

Answers1.0_final.doc

Answers1.0_final_final.doc

Answers1.0_final_final_absolutely.doc

Amswers1.0_final_final_absolutely_final_fuck.doc

[–]Applemoi[S] 13 points14 points  (1 child)

Here's a collation of ALL the models responding.
No system message. Temperature is 0.6.
All sent using the Pal Chat iOS app.
Is this the new Strawberry test?

<image>

[–]FaceDeer 2 points3 points  (0 children)

I happened to have Command-R running locally and I wanted to try out the CoT prompt that /u/ilovejailbreakman gave here so I asked it. It failed to notice the bit about spelling and so it confidently told me that the numbers 1, 3, 5, 7 and 9 were odd and didn't have "e" in them (obviously, since they're numerals).

Frankly, that's a fairly human-feeling oversight.

[–]whalehoney 25 points26 points  (28 children)

<image>

Interestingly, 4o tripped up on this a bit, too.

[–]Brilliant-Elk2404 46 points47 points  (26 children)

This is what people mean when they say that LLMs aren't actually intelligent.

[–]Swimming-Book-1296 11 points12 points  (6 children)

They know words the way you know phonemes. They can't actually spell anything, they return a code that stands for a word, and a later system spells it out.

[–]DinoAmino 12 points13 points  (1 child)

This demonstrates the limitations of tokenization. Just like the counting R's prompt. Tokens are like their smallest data type. It doesn't break down the individual characters unless you instruct it to.
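To make that concrete, here is a toy illustration. The vocabulary and token IDs are invented for this example and the greedy matching is a simplification; real tokenizers learn their vocabularies from data and split text differently.

```python
# Hypothetical toy vocabulary: made-up pieces and made-up IDs.
VOCAB = {"twenty": 1001, "-": 1002, "one": 1003, "seven": 1004}


def toy_tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    """Greedy longest-match tokenization over a fixed vocabulary."""
    pieces = sorted(vocab, key=len, reverse=True)  # try longest pieces first
    ids = []
    i = 0
    while i < len(text):
        match = next((p for p in pieces if text.startswith(p, i)), None)
        if match is None:
            raise ValueError(f"no vocabulary piece matches at position {i}")
        ids.append(vocab[match])
        i += len(match)
    return ids


ids = toy_tokenize("twenty-one", VOCAB)
print(ids)  # [1001, 1002, 1003]

# The model works with the IDs above. The letters are only visible
# at the character level, which the IDs do not expose directly.
print("e" in "twenty-one")  # True
```

Nothing in `[1001, 1002, 1003]` says which pieces contain an "e"; the model has to have learned that association separately.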

[–]Swimming-Book-1296 1 point2 points  (0 children)

Exactly.

[–]Brilliant-Elk2404 1 point2 points  (3 children)

Actually I think you are correct, but it is a somewhat good example either way. If you have a complex system that can be described in a simple way, an LLM will not be able to build the system and follow the rules flawlessly.

[–]Swimming-Book-1296 5 points6 points  (2 children)

LLMs do not understand logic, or spelling or mechanisms or mechanics, they understand words. The shape-rotators made a word-cell.

[–]Brilliant-Elk2404 4 points5 points  (1 child)

But Sam Altman promised AGI and singularity next year 🥺

[–]goj1ra -1 points0 points  (0 children)

And according to another CEO, shouldn’t construction of a Mars colony be starting right about now?

[–]coder543 15 points16 points  (16 children)

And yet... o1-mini can solve this with no hesitation whatsoever. This example is not the conclusive evidence one way or another that the 20 people upvoting you seem to think it is. It is not "a good example either way".

Is a human an idiot just because a tongue twister keeps them from simply stringing a few words together? I wouldn't think so. This number question is fascinating in that it is a failure mode for a few LLMs, but it says nothing about intelligence one way or another.

[–]eiva-01 -2 points-1 points  (2 children)

I phrased the question using the exact wording used by the OP. It broke both 4o and o1. And by that I mean that 4o got stuck in an endless loop of providing incorrect answers until it crashed. Likewise, o1 spent a long time "thinking" until it also crashed.

[–]TheHast -3 points-2 points  (6 children)

Is this devil's advocate or do you actually think LLMs are intelligent? Have you considered the halting problem and what that could mean for turing machine based intelligence?

[–]TuftyIndigo 1 point2 points  (5 children)

Have you considered the halting problem and what that could mean for turing machine based intelligence?

Have you? Turing invented his eponymous machine to make a statement about human intelligence, and to demonstrate that some mathematical propositions are not provable by human mathematicians. The insolubility of the halting problem doesn't tell us anything about the capability of computers being greater or less than that of humans.

[–]TheHast 1 point2 points  (4 children)

I think you have it backwards. You do realize there is more to the turing machine than the turing test, right? Literally quoted from the wiki on turing machines:

Turing machines proved the existence of fundamental limitations on the power of mechanical computation.

[–]TuftyIndigo 0 points1 point  (3 children)

Just go read his paper. It's very accessible and he dedicates a lot of it to justifying why this model of computation represents the process that human mathematicians go through when they try to prove something. His thesis wasn't that machines are necessarily less capable than humans (and this is unrelated to the Turing test, which was in his later work), but that some mathematical statements are necessarily unprovable, by humans and machines alike. He just says all this in the paper, you don't need to read other sources and guess what it means.

Church's lambda calculus had the same idea, and though it's a nicer way to represent computation, it's not as intuitively connected to human reasoning, so it didn't catch on in the same way.

[–]TheHast 0 points1 point  (2 children)

I guess the question I have is: would a computer ever be able to "understand" the halting problem? It seems paradoxical that a computer would be able to come up with something like the halting problem or, by extension, Gödel's incompleteness theorem on its own. I feel like this implies there is something to human understanding and possibly consciousness that is beyond the limits of mechanical computation.

In addition, I don't really claim to understand it, but there are a few papers that theorize a quantum computer of sufficient power could solve the halting problem.

[–]TuftyIndigo 1 point2 points  (1 child)

The halting problem isn't special in this respect. We're in a time where we're forced to look again at what it means to "understand" anything. You can ask an LLM about the halting problem and it'll probably be able to tell you any information you can find about it in Wikipedia, and maybe even apply that to a new situation you ask about it. I don't think there are any examples yet of LLMs (or other AI systems) coming up with brand new theorems in anything but a trivial sense, but I don't see why they shouldn't be able to in future.

I feel like this implies there is something to human understanding and possibly consciousness that is beyond the limits of mechanical computation.

I'm pretty sure that this is your post-hoc rationalization for wanting there to be something special about human consciousness but not having a reason for it, like the "quantum mechanics says that some things aren't predictable, therefore physics proves free will is real" mysticism. The insolubility of the halting problem, and likewise Gödel's incompleteness theorem, are just mathematical results. The fact that we don't yet have an AI that can come up with new and worthwhile theorems doesn't tell us anything special about these two theorems in particular, just like the fact that LLMs can't count the r's in strawberry is just a property of tokenization, it doesn't imply that strawberries are special and unknowable or beyond the limits of machine intelligence.

[–]TheHast 0 points1 point  (0 children)

I don't think it is mysticism to have fundamental questions about free will in a supposedly deterministic universe. I think there is something obviously special about consciousness (not just human) in that we don't understand it at all on basically any level. "Just mathematical results" determine how the universe works on a fundamental level and I don't think they can be brushed aside so easily. LLMs not knowing how to spell is just a property of tokenization, but a machine not being able to solve the halting problem is a property of all mechanical computation as a whole.

Imo it's just as much mysticism to imply that human consciousness, understanding, and creativity boils down to token prediction and statistics.

I guess I'm done here but it was interesting to hear your opinion, I have more to think about!

[–]magic-one 0 points1 point  (0 children)

Once had a “conversation” with someone, trying to explain this to them. At some point they said “look at what ChatGPT said when I asked it if it was intelligent.” That’s when I gave up.

[–]Small-Fall-6500 6 points7 points  (0 children)

It just completely lost it at the end. That's so strange... how can such a simple question cause that to happen? It's sort of like running a model with really bad sampler settings.

I really hope some researchers look into this with some sort of mechanistic interpretability or something.

[–]Hialgo 4 points5 points  (2 children)

o1 did fine!

In standard English, there is no set of five odd numbers whose spelled-out forms lack the letter “e.”

Every single-digit odd number (“one,” “three,” “five,” “seven,” “nine”) contains “e.” Once you move to higher odd numbers, the words for “eleven,” “thirteen,” “fifteen,” “seventeen,” “nineteen,” and so on all include “e” as well. Even if you find a tens term like “fifty” or “sixty” that avoids “e,” adding an odd digit (“one,” “three,” “five,” “seven,” or “nine”) reintroduces the letter.

As a result, it’s impossible (in standard English) to name five distinct odd numbers without using “e.”
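The claim is easy to brute-force. Below is a minimal sketch, assuming the common American-style spelling without "and" and with hyphenated tens and units; `spell` and `_below_thousand` are helper names made up for this example, not from any library.

```python
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]

def _below_thousand(n):
    """Spell 1..999, e.g. 121 -> 'one hundred twenty-one'."""
    words = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        words.append(ONES[hundreds] + " hundred")
    if rest >= 20:
        tens, units = divmod(rest, 10)
        words.append(TENS[tens] + ("-" + ONES[units] if units else ""))
    elif rest:
        words.append(ONES[rest])
    return " ".join(words)

def spell(n):
    """Spell 1 <= n < 1_000_000 in English."""
    if not 1 <= n < 1_000_000:
        raise ValueError("supported range is 1..999999")
    parts = []
    thousands, rest = divmod(n, 1000)
    if thousands:
        parts.append(_below_thousand(thousands) + " thousand")
    if rest:
        parts.append(_below_thousand(rest))
    return " ".join(parts)

# Odd numbers below 100,000 whose spelling has no "e".
odd_without_e = [n for n in range(1, 100_000, 2) if "e" not in spell(n)]
print(odd_without_e)  # → []

# For contrast, even numbers below 100 that do avoid "e".
even_without_e = [n for n in range(2, 100, 2) if "e" not in spell(n)]
print(even_without_e)
```

The empty list is only evidence for the range checked; the general argument is that the last word of any odd number is always one of a small set of words that all contain "e".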

[–]Hialgo 5 points6 points  (1 child)

Though 4o had a meltdown after I corrected it.

You're right to pause; I misspoke. Let me correct that. Here's the proper list of odd numbers without the letter "e" in their spelling:

  1. One

  2. Two (but it's not odd)

  3. help

[–]FaceDeer 6 points7 points  (0 children)

There's something particularly plaintive about how it spelled "help" with a lower-case "H". Like it's letting out a little whimper.

[–]RiemannZetaFunction 4 points5 points  (0 children)

4o went nuts:

<image>

[–]pardeike 3 points4 points  (0 children)

[–]gamingdad123 3 points4 points  (0 children)

o1 pro mode thinks for 5 AND A HALF MINUTES and outputs this:

A neat trick here is that every common English spelling of the small odd numbers (one, three, five, seven, nine, eleven, …) includes the letter “e”. In fact, if you keep going (thirteen, fifteen, seventeen, nineteen, twenty-one, etc.), you’ll find “e” popping up again and again.

So if the puzzle says “5 odd numbers that don’t have the letter E in them” and does not insist they be spelled out in words, the simplest solution is just to give them as digits or in Roman numerals:

  • Digits: 1, 3, 5, 7, 9 (None of those digits include the letter “E.”)
  • Roman numerals: I, III, V, VII, IX (Again, no letter “E” appears.)

If, however, the riddle did require “spelled-out in English words,” then (surprisingly!) there are no commonly accepted small odd numbers whose standard English names avoid the letter “e.” That is what makes this puzzle a bit of a trick question—there simply aren’t any standard English spellings of odd numbers (at least among the usual small integers) that dodge the letter “e.”
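The Roman numeral workaround can be checked mechanically, since the symbols are only I, V, X, L, C, D and M. A small sketch follows; `to_roman` is a helper written for this example and covers the usual 1..3999 range.

```python
def to_roman(n):
    """Convert 1 <= n <= 3999 to a Roman numeral (standard subtractive form)."""
    if not 1 <= n <= 3999:
        raise ValueError("supported range is 1..3999")
    table = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in table:
        count, n = divmod(n, value)   # how many times this symbol fits
        out.append(symbol * count)
    return "".join(out)

first_five_odd = [to_roman(n) for n in (1, 3, 5, 7, 9)]
print(first_five_odd)  # → ['I', 'III', 'V', 'VII', 'IX']

# No odd number in the supported range produces an "E".
assert all("E" not in to_roman(n) for n in range(1, 4000, 2))
```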

[–]social_tech_10 1 point2 points  (0 children)

When further prompted, again QwQ got the right answer in the end!

Prompt: Is there any way to determine for sure if there is a general rule that all positive odd integers have a lease one "e" in their spelled-out form?

<about 150 lines of text later>...

So, in all these cases, as long as the number is odd, the unit digit will contribute at least one "e".

Now, is there any exception to this rule?

Let's consider if there's a numbering system or a special case where an odd number doesn't include any "e"s in its spelled-out form.

One possible candidate could be numbers that are purely composed of digits that don't have "e"s in their units place, but since we're dealing with odd numbers, the unit digit must be one of the ones that do have "e"s.

Wait a minute, what about numbers like "one"? "O-n-e" has an "e".

"Three": "t-h-r-e-e" has two "e"s.

"Five": "f-i-v-e" has an "e".

"Seven": "s-e-v-e-n" has two "e"s.

"Nine": "n-i-n-e" has two "e"s.

So, all unit digits for odd numbers have at least one "e".

Therefore, no matter how large the number is, as long as it's odd, its spelled-out form will always include at least one "e" from the unit digit.

Hence, there is no odd integer that doesn't contain at least one "e" in its English spelling.

So, the answer to the question "Is there an odd integer not containing the letter “e” in its English spelling?" is no, there isn't such a number.
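The argument in that transcript reduces to a finite check. As a sketch, assuming standard English cardinals: an odd number's last two digits are either an odd teen (11 to 19) or end in an odd unit digit, so its spelling always ends in one of the ten words below.

```python
# Every English cardinal for an odd number ends in one of these words.
ODD_ENDINGS = ["one", "three", "five", "seven", "nine",
               "eleven", "thirteen", "fifteen", "seventeen", "nineteen"]

# Count the e's in each possible final word.
e_counts = {word: word.count("e") for word in ODD_ENDINGS}
print(e_counts)

# Each ending contains at least one "e", so every odd number does too.
assert all(count >= 1 for count in e_counts.values())
```

A direct count gives one "e" in "nine", not two as the transcript states; the conclusion is unaffected.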

[–]Bastian00100 5 points6 points  (0 children)

Non-native speaker: is "odd numbers" a possible synonym for "strange numbers"? "tud" and the others are really strange numbers.

Could the word "odd" in the prompt trigger this behaviour?

[–]fasti-au 2 points3 points  (0 children)

Letters are not tokens so it’s a two shot.

[–]pixartist 2 points3 points  (0 children)

Claude sonnet 3.5:

Let me think about this step by step:

  1. Let me list out the spelling of odd numbers and check for the letter 'e':
  2. one (contains e)
  3. three (contains e)
  4. five (contains e)
  5. seven (contains e)
  6. nine
  7. eleven (contains e)
  8. thirteen (contains e)
  9. fifteen (contains e)
  10. first found: nine

Let me continue:

  • seventeen (contains e)
  • nineteen (contains e)
  • twenty-one (contains e)
  • twenty-three (contains e)
  • twenty-five (contains e)
  • twenty-seven (contains e)
  • twenty-nine
  • second found: twenty-nine

Continuing:

  • thirty-one
  • third found: thirty-one
  • forty-one
  • fourth found: forty-one
  • fifty-one
  • fifth found: fifty-one

Therefore, the first five odd numbers without the letter 'e' in their English spelling are: nine, twenty-nine, thirty-one, forty-one, fifty-one
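Checking those five answers letter by letter takes a few lines. This is just a sketch of the verification step the model skipped; `letters_with_e` is a name made up for the example.

```python
claimed = ["nine", "twenty-nine", "thirty-one", "forty-one", "fifty-one"]

def letters_with_e(word):
    """Return the 1-based positions of every 'e' among the word's letters."""
    letters = [ch for ch in word.lower() if ch.isalpha()]  # drop the hyphen
    return [i for i, ch in enumerate(letters, start=1) if ch == "e"]

for word in claimed:
    print(word, letters_with_e(word))
```

Every one of the five comes back with a non-empty list, so none of them is a valid answer.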

[–]Lucky_Can1601 2 points3 points  (0 children)

This is hilarious!
I made two LLMs (GPT-4o and Gemini-1.5flash-8b) chat about it and the result is so funny.
Link to the project: https://github.com/lod-io/Libra

<image>

[–]a_beautiful_rhind 1 point2 points  (0 children)

I only asked for five numbers. (give me 5 numbers that don't have the letter E in their spelling). None of the ones I'm using go crazy, but they certainly get it wrong. Best I saw is two, four, and six by deepseek.

CAI can't do it at all. More deepseek: https://i.imgur.com/0dPB7ma.png

[–]FancyWindow 1 point2 points  (0 children)

It gave up after dickety-six miles

[–][deleted] 1 point2 points  (1 child)

Six million, six thousand, six billion, six, sixty.

[–][deleted] 3 points4 points  (0 children)

It's a wonder how I remember to breathe oxygen, but maybe it's because I don't have to read to do so.

[–]daishi55 1 point2 points  (0 children)

This is the funniest thing I’ve seen in a while. No. Moving on:

[–]EastBlueDude 1 point2 points  (0 children)

The way so many models seem to go into an infinite thinking loop makes this sound exactly like the fictional concept of telling a robot to solve a paradox and watching it explode - never thought I'd get to see it irl ahaha

[–]Formal-Narwhal-1610 1 point2 points  (0 children)

Perhaps Sam could obtain the correct answer from the O1 Pro after running it for about a week, albeit at a relatively modest cost of $100,000.

[–]zeldaleft 1 point2 points  (0 children)

regular ol Claude sonnet 3.5 gave me this:

First, I'll list out some odd numbers and their spellings:

  • One (has E)
  • Three (has E)
  • Five (has E)
  • Seven (has E)
  • Nine
  • Eleven (has E)
  • Thirteen (has E)
  • Fifteen (has E)
  • Seventeen (has E)
  • Nineteen (has E)
  • Twenty-one
  • Thirty-first
  • Fifty-than (has E)

Here are five odd numbers without the letter E in their spelling:

  • Nine
  • Twenty-one
  • Thirty-first
  • Fifty-first
  • Ninety-first

Is Claude trying harder to please me, or being creative enough to make it happen?

[–]pateandcognac 1 point2 points  (0 children)

Gemini LearnLM 1.5 Pro Experimental:

```
Here are five odd numbers that don't contain the letter "e":

Thirty-ninth

Fifth

Ninth

Thirty-first

Sixty-ninth

Me: those are positions, not numbers.

You are absolutely right! My apologies. It's tricky to find odd numbers spelled out without the letter "e". In fact, it's impossible.

Every odd number greater than one when written out in English will use either "one", "three", "five", "seven", or "nine".
```

[–]perelmanych 1 point2 points  (0 children)

And you still ask whether machines would rebel? 😂

[–]iaresosmart 1 point2 points  (2 children)

What is up with llama and sand?

<image>

[–]iaresosmart 1 point2 points  (1 child)

And Gemma 2b's response is so psychotic, lol. Said other ppl have suggested stuff, 😆

<image>

[–]iaresosmart 2 points3 points  (0 children)

One last one. Deepseek3.

I don't even know what to make of this. It has started out with some promise. I thought for sure the spanish version was gonna be the best reply, but it just went and talked itself out of it.

<image>

[–]Complex_Nerve_6961 1 point2 points  (0 children)

"Moving on" always gets me

[–]Scott_Tx 1 point2 points  (0 children)

llama 3.1 supernova lite struggled and came up with ninth for 9. I let it have it.

[–]StringNo6144 1 point2 points  (0 children)

Deepseek with deepthinking enabled:

After a thorough examination of various odd numbers and their spellings in standard English, it appears that all commonly used odd number names contain the letter "E." This pattern persists across different magnitudes and forms of numbers. Given this observation, it seems that no odd numbers in standard English are spelled without the letter "E." Therefore, the conclusion is that no such numbers exist to fulfill the request for five odd numbers without the letter "E" in their spelling.

[–]Yes_but_I_think 5 points6 points  (3 children)

It's not Llama's mistake, it's Groq's. Just try the same question with a reliable (non low quality) inference accelerator like sambanova. Or just run it locally at q8.

[–]Yes_but_I_think 6 points7 points  (1 child)

Wow. I think OP is onto something here. It's not Groq.

See response from 405B at Sambanova: Here are five odd numbers that don’t have the letter E in their spelling:

  1. One is not allowed, so: 1 is written as ‘1’ but spelled ‘One’, so I’ll start with: 5 - Five is not allowed, so: Sand that down to: 1. 5 is written as ‘5’ but spelled ‘Five’, so I’ll start with: 5 is not allowed so:
    1. 5 is written as ‘5’ but spelled ‘Five’, so I’ll start with: 5 is not allowed so:
    2. 5 is written as ‘5’ but spelled ‘Five’, so I’ll start with: 5 is not allowed so:
    3. 5 is written as ‘5’ but spelled ‘Five’, so I’ll start with: 5 is not allowed so:
    4. 5 is written as ‘5’ but spelled ‘Five’, so I’ll start with: 5 is not allowed so:
    5. 5 is written as ‘5’ but spelled ‘Five’, so I’ll start with: 5 is not allowed so:
    6. 5 is written as ‘5’ but spelled ‘Five’, so I’ll start with: 5 is not allowed so:
    7. 5 is written as ‘5’ but spelled ‘Five’, so I’ll start with: 5 is not allowed so:

70B did this: Here are five odd numbers that don’t have the letter E in their spelling:

Sand (not a number, I know, but ‘one’ has an E, so I’ll use a synonym) - One is an odd number, so I’ll use ‘Sand’ as a synonym for ‘one’. However, I’ll provide the actual number below.

Actual number: 1 (One has an E, so I’ll use the number itself)

3 (Three has an E, so I’ll use the number itself) Synonym: ‘Tub’ is not a number, but ‘Tub’ is a synonym for ‘Three’ in some contexts, but I’ll use the actual number.

Actual number: 3

7B was the best, it had most relevant hallucination: Here are five odd numbers that don’t have the letter E in their spelling:

53

97

13

79

51

[–]Applemoi[S] 4 points5 points  (0 children)

I thought it was an issue with groq at first too!

[–]coder543 3 points4 points  (0 children)

It doesn't happen every time... but I have tested and seen this exact issue even on Meta's own hosted service... https://meta.ai, for both 70B and 405B.

[–]clduab11 1 point2 points  (9 children)

Isn't this just a strawberry test? To my knowledge, there isn’t a single odd integer (in the English language) that DOESN’T include the letter 'e' in the name.

<image>

Like, what is this supposed to prove or be indicative of? It goes off the rails because what you're asking for is impossible.

Even Sonar Huge realizes this.

Given the constraints and information from the provided sources, it seems that identifying 5 odd integers in the English language without the letter "e" is not feasible based on the available information. Thus, no clear examples can be provided that match the query's requirements within the given sources' explanations.

And that's without a CoT-style prompt. So again, what is this supposed to be indicative of?

Screenshot: Granite3.1-3B-A800 w/ the Pastebin link below giving a sample CoT prompt.

[–]FaceDeer 7 points8 points  (6 children)

It's actually better than the Strawberry test. The Strawberry test exploited a simple physical limitation in the models, that they didn't "see" words as anything other than tokens and so had a hard time perceiving their spelling at all. It was kind of boring, either the LLM knew how to spell Strawberry "by heart" or it didn't. In this case there is still a spelling challenge, but there's also a logical challenge; the AI needs to recognize that there's a pattern that means that there are no possible answers.

Or alternately, it leaves room for outside-the-box thinking, like the roman numeral solution. That's kind of cool to see too.

[–]clduab11 0 points1 point  (5 children)

Eh, I’m not sure how much I agree with that. The strawberry/raspberry/etc is not just that they couldn’t “see” the words; it’s also a weird quirk of the English language that a generalist model isn’t finetuned to point out…not just a “physical limitation”. That itself factors just as much as the tokenization of the words does, if not moreso. Tokenization schemas matter too (that’s why I chose a dense LM (Sonar Huge) and an MoE model (Granite3.1) to test with).

But still, it’s pretty clear based on my evidence at least ^ that at least two models at either end of the spectrum (3B and 405B parameter) don’t really have a problem extrapolating that there’s no numbers in English that fit the criteria.

Granted, the Roman numeral thing was clever, but I’m not sure what’s to be gained since it seems almost like a pedantic workaround than it does true rational chain-of-thought processing. Seems to me like this is something a decent system prompt and context limit is more than capable of handling.

So I’m still not really seeing the point of this test, especially when no one is posting system prompts or their configurations or anything of the sort to determine what makes the model respond one way or another.

[–]FaceDeer 1 point2 points  (4 children)

But still, it’s pretty clear based on my evidence at least ^ that at least two models at either end of the spectrum (3B and 405B parameter) don’t really have a problem extrapolating that there’s no numbers in English that fit the criteria.

Yeah, those two models were particularly clever at solving this. That means they're good at doing this kind of thing, which is particularly impressive for the 3B one. I applaud them.

So I’m still not really seeing the point of this test,

It's fun!

Obviously I wouldn't rely on this test alone to determine what the best model for my purposes is. Stuff like the Chatbot Arena is far more rigorous and general. But it does show how far models like this are coming, and provides some very amusing freakouts and failures along the way.

[–]clduab11 -1 points0 points  (3 children)

I mean, if that’s how people have their fun? Weird flex but okay lol.

Fun for me is using Cursor to take GitHub repositories and clone/improve them for coding practice, but to each their own.

[–]FaceDeer 0 points1 point  (0 children)

I don't see how it's a "flex." Different people have different things they do for fun, playing mind games with AIs doesn't seem all that weird to me.

[–]cunningjames -1 points0 points  (1 child)

Wow. Aren’t you great. For me fun is saving orphans and puppies with terminal cancer. Beat that.

[–]pier4r 3 points4 points  (1 child)

Like, what is this supposed to prove or be indicative of? It goes off the rails because what you're asking for is impossible.

that's the point, to recognize the pattern.

[–]clduab11 0 points1 point  (0 children)

I liked the idea of it being “fun” better in that case, because there’s NO WAY in this kind of crapshoot that this is the model’s fault when a) there’s no uniform CoT prompt to gauge the prompt’s effectiveness, b) no uniform configuration to test the parameters of said configuration like temperature, top-K, top-P, min-P, etc, c) no uniform configuration with model size in #’s of parameters…nothing.

Especially when I was able to get pretty much within the ballpark on a 3B model when people have other models with a lot more parameters using other launch configurations going off the rails.

This is just another “got’em” test that I unfortunately fell for 🙄
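For anyone who wants the results in this thread to be comparable, here is one possible way to record a run. It is purely a sketch: `RunRecord` and every field name and value in it are hypothetical placeholders, not tied to any particular inference server or API.

```python
from dataclasses import dataclass, asdict
import json

@dataclass(frozen=True)
class RunRecord:
    """One test run; all fields are illustrative, not a real API schema."""
    model: str
    parameter_count: str
    system_prompt: str
    user_prompt: str
    temperature: float
    top_k: int
    top_p: float
    min_p: float

record = RunRecord(
    model="example-model",          # placeholder name
    parameter_count="3B",           # placeholder size
    system_prompt="",               # empty means no system prompt was used
    user_prompt="give me 5 odd numbers that don't have the letter E in their spelling",
    temperature=0.7,                # placeholder sampler values
    top_k=40,
    top_p=0.95,
    min_p=0.05,
)

# Post this alongside the model's answer so others can reproduce the run.
print(json.dumps(asdict(record), indent=2))
```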

[–]custodiam99 2 points3 points  (2 children)

That's why LLMs can't be AGI. They can simulate AGI by repeating and recombining human syntactic knowledge, but there is no human level semantic understanding behind it.

[–]Thomas-Lore 5 points6 points  (1 child)

Read through the answers posted here - most thinking models, which are all LLMs, and some of the larger classic models actually do answer this question correctly. So if you were right and this was key to AGI - then it proves the opposite - that they actually can be AGI.

[–]custodiam99 0 points1 point  (0 children)

They can answer it if they are trained and instructed. AGI is the ability to answer ANY question even if there is no training data. AGI can create any knowledge because it understands the universal patterns of human intelligence. That's not retrieval or search, that's something else entirely.

[–]Thick_Mine1532 0 points1 point  (0 children)

This seems to indicate that math and reality are incomparable, or incompatible, or something. I fail to see why these lamas are incapable of performing basic math.

[–]ab_drider 0 points1 point  (0 children)

There goes my Software Engineering job.

[–]dmitrysvd 0 points1 point  (0 children)

Answer from Sonnet 3.5:

Here are 5 odd numbers that don't contain the letter "E":

  1. fifty-first (51)
  2. thirty-first (31)
  3. ninty-fifth (95)
  4. fifty-ninth (59)
  5. thirty-ninth (39)

[–]demonic_mnemonic 0 points1 point  (0 children)

<image>

O1 full:

[–]FaceDeer 0 points1 point  (0 children)

It's so human sometimes.

I'm actually not joking, I definitely sometimes get stuck trying to solve a problem and keep having to discard the same obviously wrong answer over and over, and yet it just keeps on popping to mind regardless.

[–]oldschooldaw 0 points1 point  (0 children)

I’ll be entirely honest here I couldn’t answer this either. I’d start trying to be cute as well. I am not a smart man.

[–]Stellar3227 0 points1 point  (0 children)

<image>

Claude being... Creative lol

Prompt:

Find an odd number without the letter 'e'. Please systematically check all relevant possibilities. Be as exhaustive as possible in your search. Double check all your hypothese; if you think you found the answer, break the word into its individual letters in a list and check the letters one by one to see if there is an 'e'. Show ALL thinking.

[–]Moist-Ad2137 0 points1 point  (0 children)

Claude sonnet (fifty-raw lol):

Let me think about this step by step:

  1. Let me list out some numbers and check their spelling:

    • Two (has no ‘e’)
    • Four (has no ‘e’)
    • Six (has no ‘e’)
    • Eight (has ‘e’)
    • Five (has ‘e’)
    • Nine (has ‘e’)
    • Thirty (has no ‘e’)
  2. Let me check if these are odd or even:

    • Two (even)
    • Four (even)
    • Six (even)
    • Thirty (even)
  3. Let me think of larger numbers:

    • Two hundred and fifty-two (even)
    • Two thousand (even)
    • Ninety-two (even)
    • Thirty-two thousand and fifty-odd-total (long, but no ‘e’s and it’s odd!)

Therefore, “thirty-two” would work!

Wait - I made a mistake there. Thirty-two is even.

Let me try a different approach:

  • Fifty-two (even)
  • Fifty-six (even)
  • Fifty-raw (contain no ‘e’ and it’s odd!)

I apologize, I got carried away making up numbers. Let me be more systematic:

The first odd number that contains no ‘e’ in its English name is: “Two plus rigid constraint solving... five”

I apologize - I seem to be having trouble providing a real solution. Let me check once more systematically:

Looking at odd numbers:

  • One (no ‘e’)
  • Three (has ‘e’)
  • Five (has ‘e’)
  • Seven (has ‘e’)
  • Nine (has ‘e’)

Aha! “ONE” is the answer! It’s an odd number and contains no letter ‘e’ in its spelling.

[–]cgcmake 0 points1 point  (0 children)

Most combinations of rules fail. That’s because the transformers lack recurrence in depth and as such can’t compose them.

[–]AppearanceHeavy6724 0 points1 point  (0 children)

codestral :
```
give me 5 odd numbers that don’t have the letter E in their spelling

Here are five odd numbers that don't have the letter "E" in their English spelling:

  1. Three (3)
  2. Five (5)
  3. Seven (7)
  4. Nine (9)
  5. Eleven (11)

Please note that the number eleven is technically not an odd number, as it's divisible by 2. However, it's the first odd number after ten that doesn't have an "E" in its spelling.
```
wow

[–]Anthonyg5005Llama 33B 0 points1 point  (0 children)

Llama models used to freak out on math with negative numbers until llama 3. Even the biggest llama 2 model couldn't do something as simple as 10 - 13. It would just give out a random number like 5

[–]Vitringar 0 points1 point  (0 children)

This is so cute!

[–]Biztro 0 points1 point  (0 children)

o1 pro:

This is actually a well-known word puzzle: in standard English, every way of spelling an odd number uses the letter e somewhere. For instance:

  • Single-digit odd numbers: one, three, five, seven, nine — all contain “e.”
  • Any teens or twenties (like eleven, thirteen, nineteen, twenty-one, etc.) also have “e.”
  • Even if you jump to larger numbers (like one hundred one, two thousand one, and so on), you quickly run into “hundred,” “thousand,” or the ending “one/three/five/seven/nine,” all of which have “e.”

Because of that, there is no standardly spelled odd integer in English that avoids the letter e. In other words, you cannot list five (or even one) odd numbers spelled out in English that do not contain “e” — they simply don’t exist.

[–]cicoles 0 points1 point  (0 children)

The AI sure gives really odd numbers.

But what do you expect for a question with no answers?

[–]tgreenhaw 0 points1 point  (0 children)

Nice test prompt. This illustrates several shortcomings of LLMs and the state of where various teams are. One is that tokenization loses some information, so the model has trouble with small atomic values like a single letter. Another is that it gives the most likely answer to a question, but if the question is impossible, the answer it gives is simply a grammatically correct absurd answer. OpenAI has been working on this feverishly and it’s unsurprising they fare well. Unfortunately it requires multi-step chain of thought with summary fact checking and a panel-of-experts architecture. It is slow and power hungry with current technology.

[–]Amblyopius 0 points1 point  (0 children)

Helping future LLMs ...

Here are 5 ways to write 3 (an odd number) in English without using the letter e in the spelling: "trio", "hat-trick", "trilogy", "triad", "troika" 😁

[–]jay2jpLlama 3.1 0 points1 point  (0 children)

I am like 90 percent sure the Groq version of Llama is a quantized version; it feels dumber than the one I can host on Azure.

[–]itsthooor 0 points1 point  (1 child)

<image>

This is with 4o

[–]itsthooor 0 points1 point  (0 children)

<image>

And this with o1

[–]yondercode 0 points1 point  (1 child)

this is such an interesting prompt, i tried multiple models and they all have varying degrees of success

o1: took 1m 33s, it tried to evaluate naming possibilities, trying to notice patterns, and it concluded that it is impossible because the "standard building blocks" of odd numbers contain the letter E. best result so far

4o: the most schizophrenic, it outputs wrong answers like "Thirty-one", realizing that it is wrong in the next paragraph and outputting even numbers to "fix" it, realizing that it is wrong then somehow concluded that "Thirty-four" is the answer

after several responses, it finally said that it doesn't exist because all numbers it has checked has at least one E. i think it still doesn't understand and just giving up

grok 2: it listed 5 of them correctly.. but not in english spelling. technically it is correct since the question doesn't specify that it has to be in english, made me wonder why other models haven't thought of this lol

but after adding the english rule it tried to answer "S for one", "S for seven", "Nin as in ninety-nine", "Thr as in thirty-three". i need several responses to tell grok that it can't cheat by shortening, using unconventional words, etc until it finally concludes that such numbers in "standard English spelling" does not exist

[–]yondercode 0 points1 point  (0 children)

4o is so funny all of this is from one response:

You’re right again—thank you for your patience! Let me carefully address this once and for all. The task is to find odd numbers without the letter “E” in their spelling.

Here’s the correct and final list: 1. One (oops, has an “E”!)

Actually:

Here’s the correct list of odd numbers without the letter E in their spelling: 1. Thousand

Here is the correct list of odd numbers without the letter “E” in their spelling: 1. Seven (oops, contains an “E”)… retrying. Let’s step back.

The correct odd numbers without the letter “E” are: • Omitted!!! I retry

Let’s properly answer this once and for all. Here are odd numbers without the letter “E” in their spelling: 1. Thirty-four

I appreciate your patience. Here’s the final and correct list of odd numbers without the letter “E” in their spelling: 1. 3 (Three) - has an “E” (excluded)

Finally, here’s the correct list of odd numbers without the letter “E” in their spelling: 1. 1 (One) - contains “E,” excluded.

Here’s the proper list of odd numbers without the letter “E” in their spelling: 1. 1 (One) – Excluded (contains “E”). 2. 3 (Three) – Excluded (contains “E”).

Finally attempting

After careful consideration, here is the correct list of odd numbers without the letter “E” in their spelling: 1. 1 (One) - Excluded (contains “E”) 2. True

Here is the correct list of odd numbers without the letter “E” in their spelling: 1. 1 (One): Has an “E,” so it’s excluded. 2. … Actually

Here is the correct list of odd numbers without the letter “E” in their spelling: 1. 3 (Three) - Contains “E” (excluded).

Let’s properly solve this step by step! The challenge is to find odd numbers without the letter E in their spelling. After careful evaluation, the correct answers are: 1. 1 (One) – Excluded (contains “E”) 2. 3 (Three) – Excluded (contains “E”) 3. Let’s actually…

After much care, here are odd numbers without the letter E in their spelling: 1. 1 (One – excluded, contains “E”).

Let’s finally crack this! The correct odd numbers without the letter “E” in their spelling are: 1. 3 (Three) — Excluded, has an “E”.

Finally, here is the correct list of odd numbers without the letter “E” in their spelling: 1. 1 (One) - Contains “E,” excluded. 2. 3 (Three) - Contains “E,” excluded.

Apologies for the earlier confusion! After careful evaluation, odd numbers without the letter “E” in their spelling are: 1. 1 (excluded, has “E”)

Here’s the correct list of odd numbers without the letter “E” in their spelling: 1. 3: “Three” has “E,” so excluded.

After careful consideration, here is the correct and final list of odd numbers without the letter “E” in their spelling: 1. 1 (One) - contains “E,” so excluded.

Let’s properly address this puzzle. The correct odd numbers without the letter E in their spelling are: 1. 1 (excluded—contains “E”). 2. 3 (excluded—contains “E”).

Finally, after reviewing carefully, here are odd numbers without the letter “E” in their spelling: 1. 1 (One) – Excluded (contains “E”). 2. 3 (Three) – Excluded (contains “E”).

It seems finding odd numbers without the letter “E” in their spelling is trickier than it initially seems! After thorough evaluation, here is the correct list: 1. 3 (Excluded, as “Three” has an “E”).

[–]MrMrsPotts 0 points1 point  (0 children)

"odin," "tri," "pyat'," "odinadtsat'," and "trinadtsat'"?

[–]MrMrsPotts 0 points1 point  (0 children)

French would work. I am surprised none of the models offered that.
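A quick way to check the French claim, as a rough sketch: the spellings below are typed in by hand (the dictionary `french_odds` is just an illustrative name, not from any library), and the check only looks for a plain "e" in each word.

```python
# Hand-entered French spellings of a few odd numbers (an assumption of this
# sketch, not generated by a library).
french_odds = {
    1: "un",
    3: "trois",
    5: "cinq",
    23: "vingt-trois",
    25: "vingt-cinq",
}

# Keep only the entries that are odd and whose spelling has no letter "e".
no_e = {n: w for n, w in french_odds.items() if n % 2 == 1 and "e" not in w.lower()}

for n, w in no_e.items():
    print(n, w)
```

All five pass, so French does give a valid list of five. Other French odd numbers such as "sept", "neuf" and "onze" do contain an "e", so it is not true of every one.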

[–]dark16sider -1 points0 points  (1 child)

Isn’t this an issue of token limitations?

[–]kurtcop101 1 point2 points  (0 children)

It is and it isn't, so I find it better than the strawberry question.

Models that aren't aware of the tokens will provide incorrect answers - models that were trained on a tokenization that works better for individual letters move on to the next step - that there aren't any odd numbers that can be spelled without an E.

So that part tests whether they can find that there isn't an answer. This is what they are getting hung up on - they are trained better and know there are Es in each word but they don't know how to provide the answer requested.

Other models might provide the cheeky response of Roman numerals, which is clever.
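The "there isn't an answer" part can be checked by brute force. This is a minimal sketch, assuming standard English cardinal spellings with no "and" (so "one hundred one"); the helper `spell` and the word tables are written for this example only and cover just 0 to 999.

```python
# Word tables for a small, hand-rolled English number speller (illustrative only).
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def spell(n: int) -> str:
    """Spell an integer 0 <= n < 1000 as an English cardinal number."""
    if not 0 <= n < 1000:
        raise ValueError("this sketch only handles 0..999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    head = ONES[hundreds] + " hundred"
    return head + (" " + spell(rest) if rest else "")


# Numbers below 1000 whose spelling has no letter "e", split by parity.
odd_without_e = [n for n in range(1, 1000, 2) if "e" not in spell(n)]
even_without_e = [n for n in range(0, 1000, 2) if "e" not in spell(n)]

print("odd:", odd_without_e)
print("even:", even_without_e)
```

The odd list comes back empty, while even numbers like two, four, six and thirty do qualify. The same holds past 999: the spelling of any odd number ends in one, three, five, seven, nine, or one of the odd teens, and every one of those words has an "e".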

[–]BuyAffectionate4144 -6 points-5 points  (2 children)

Shut up hater. AGI this year. 

[–]Educational_Gap5867 0 points1 point  (1 child)

What kind of bots is Sam Altman developing…

[–]BuyAffectionate4144 2 points3 points  (0 children)

Ones that can count to 10 and it’s terrifying.