The tokenized representation of these words doesn't contain information about the pronunciation of a word, or how many syllables it has. For this reason, the model can't know how many syllables there are in each line of the poem. In other words: it can't write haikus.
The tokenizer represents most words in a way that removes any information about how many syllables they have. A typical language model doesn't know what words sound like, although this model sometimes seems to figure it out.
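To see why this happens, here is a minimal sketch of the idea (with a hypothetical vocabulary and made-up token IDs, not any real tokenizer): words are mapped to opaque integer IDs, so spelling and sound are simply not part of what the model receives.

```python
# Hypothetical word-level vocabulary; the IDs are arbitrary.
vocab = {"cherry": 8207, "blossom": 71923, "petal": 46021}

def tokenize(text):
    """Map each word to its integer token ID."""
    return [vocab[word] for word in text.split()]

ids = tokenize("cherry blossom petal")
print(ids)  # the model only ever sees these integers

# Each word here happens to have two syllables, but nothing about
# that fact is recoverable from the IDs alone.
```

Real tokenizers split text into subword pieces rather than whole words, but the point is the same: the boundaries between pieces follow statistical frequency, not syllables, so syllable counts are invisible to the model.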
In general, it struggles with meter or anything to do with counting syllables, giving wonderfully nonsensical and contradictory answers on cross-examination.
Try asking it for a joke and then to explain the joke. You can learn something about how it relates words to each other that have no obvious connection for humans. It's really weird.