Creative writing by OpenAI’s GPT-3 model, demonstrating poetry, dialogue, puns, literary parodies, and storytelling. Plus advice on effective GPT-3 prompt programming & avoiding common errors.
Compared to GPT-2, GPT-3 improves less than expected on character-level tasks such as rhyming, alliteration, punning, anagrams or permutations, acrostic poems, and arithmetic, despite being very good at many closely related kinds of writing, like satire.
Why? A plausible explanation is an obscure technical detail: as a performance optimization, GPT does not see characters but ~51k word or sub-word-chunks called “byte-pair encodings” (BPEs). A BPE can range from an individual letter like “e”, to words like “nine” (BPE #30,888 in the OA GPT-2 BPE vocab), to horrifying things like “rawdownloadcloneembedreportprint” (BPE #30,906). The number “10” might be encoded as just “10” (BPE #940), or it might be encoded as the token “1” (#16) followed by “0” (#15); the number 70710 (no commas!) might be encoded as “70710” (BPE #42,877) or… as quite a lot of different possible sequences of BPEs.
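To make the "many possible sequences" point concrete, here is a minimal sketch that enumerates every way a string can be split into tokens from a vocabulary. The vocabulary `TOY_VOCAB` and the function `segmentations` are hypothetical illustrations, not the real GPT-2 BPE vocabulary or encoder; a real BPE encoder applies learned merge rules and emits a single sequence for a given input, but which sequence it emits depends on the surrounding text.

```python
from functools import lru_cache

# Hypothetical toy vocabulary for illustration only; NOT the real GPT-2 BPE vocab.
TOY_VOCAB = frozenset({"0", "1", "7", "10", "70", "71", "707", "710", "70710"})


def segmentations(text, vocab=TOY_VOCAB):
    """Return every way to split `text` into consecutive tokens drawn from `vocab`."""

    @lru_cache(maxsize=None)
    def split_from(start):
        # Reaching the end of the string yields exactly one (empty) continuation.
        if start == len(text):
            return [[]]
        results = []
        for end in range(start + 1, len(text) + 1):
            piece = text[start:end]
            if piece in vocab:
                # Prepend this token to every valid split of the remainder.
                for rest in split_from(end):
                    results.append([piece] + rest)
        return results

    return split_from(0)


if __name__ == "__main__":
    for seg in segmentations("10"):
        print(seg)
    print(len(segmentations("70710")), "candidate token sequences for '70710'")
```

Even with nine toy tokens, a five-digit number already admits several distinct token sequences, none of which exposes the individual digits to the model unless the single-digit split happens to be chosen.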
Because GPTs never see characters but opaque partial-words, which vary chaotically based on the specific word and even the surrounding context, they are unable to easily learn about character-level aspects of language, like similar spellings or sounds, and are forced to learn relationships much more indirectly, like by brute-force memorizing of pairs of words.
Some experiments with reformatting GPT-3’s poorest-performing tasks to avoid inconsistent BPE encodings of strings show small to large performance gains, consistent with this theory.
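The summary above does not say which reformatting was used. As one possibility, assumed here purely for illustration, a prompt can be rewritten with a space between every character so that no multi-character chunk can form. The sketch below uses a toy longest-match tokenizer as a stand-in for BPE (the names `greedy_tokenize`, `space_out`, and `TOY_CHUNKS` are hypothetical, and real BPE works differently in detail).

```python
# Hypothetical multi-character chunks; NOT taken from the real GPT-2 BPE vocab.
TOY_CHUNKS = frozenset({"rh", "yme", "ti", "me"})


def greedy_tokenize(text, vocab=TOY_CHUNKS):
    """Toy longest-match tokenizer: a stand-in for BPE, not the real algorithm."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for end in range(len(text), i, -1):
            if text[i:end] in vocab or end == i + 1:
                tokens.append(text[i:end])
                i = end
                break
    return tokens


def space_out(text):
    """Insert a space between every character (assumed reformatting)."""
    return " ".join(text)


if __name__ == "__main__":
    for word in ("rhyme", "time"):
        print(word, "->", greedy_tokenize(word))
        print(space_out(word), "->", greedy_tokenize(space_out(word)))
```

In the unspaced form, the two rhyming words share no token, so their similarity is invisible at the token level. In the spaced form, each letter is its own token and the shared ending is directly visible.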