The tokenized representation of these words doesn't contain information about the pronunciation of a word, or how many syllables it has. For this reason, the model can't know how many syllables there are in each line of the poem. In other words: it can't write haikus.
The tokenizer represents most words in a way that removes any information about how many syllables they have. A typical language model doesn't know what words sound like, although this model sometimes seems to figure it out.
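To see why this happens, here is a minimal sketch of the idea (with a hypothetical vocabulary and made-up token IDs, not any real tokenizer): words are mapped to opaque integer IDs, so spelling and sound are simply not part of what the model receives.

```python
# Hypothetical word-level vocabulary; the IDs are arbitrary.
vocab = {"cherry": 8207, "blossom": 71923, "petal": 46021}

def tokenize(text):
    """Map each word to its integer token ID."""
    return [vocab[word] for word in text.split()]

ids = tokenize("cherry blossom petal")
print(ids)  # the model only ever sees these integers

# Each word here happens to have two syllables, but nothing about
# that fact is recoverable from the IDs alone.
```

Real tokenizers split text into subword pieces rather than whole words, but the point is the same: the boundaries between pieces follow statistical frequency, not syllables, so syllable counts are invisible to the model.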
In general, it struggles with meter or anything to do with counting syllables, giving wonderfully nonsensical and contradictory answers on cross-examination.
Try asking it for a joke and then to explain the joke. You can learn something about how it relates words to each other that have no obvious connection for humans. It's really weird.