“Imitations of Immortality: Learning from Human Imitative Examples in Transformer Poetry Generation”, RAY LC, 2021-09-24:

Learning to generate poetry in the style of a particular poet can make a model an expert in that style, but humans who create imitative works take a more general approach, incorporating knowledge from outside the poet’s style. Instead of learning from a large corpus of one poet’s works, can machines imitate deep style using only one example of her work?

To explore generating poetic variations for a web-based installation artwork, I wrote 8 poems imitating the structures of 8 poets, and used them to fine-tune a transformer model that had seen only one poem by each author.

The poems presented show structures borrowed from the human imitations in addition to content prompted from the originals, suggesting the model has learned aspects of how humans write variations on content by imitating style.

Audience evaluation reveals that machine-generated text can reproduce the nuance of the original text as well as the human variations do, despite being rated less expressive.

[Keywords: machine learning poetry, generative text, machine perception, machine creativity]

…To evaluate the efficacy of the generated poetic works in capturing the essential nuance of the original poems, I gave naïve online audiences a corresponding selection of text from my own variation and from GPT-2, and asked them to identify whether each was generated by machine or human, to rate each text’s level of expressiveness and structure, and to rate how well each captured the essence of a reference text by an unidentified original author. I found that participants produced similar error rates for machine- and human-generated text, indicating that they could not distinguish between the two. Moreover, they were equally likely to choose the human and machine texts as most representative of the nuanced style of the original, showing that the two served as equally valid variations. While the texts were perceived to have the same level of structure, the human texts were perceived to be more expressive than the machine texts.

…Next I fine-tuned the GPT-2 355M and 124M models on the 8 original poems only (excluding my own poems) for 5,000 epochs (5,480 tokens; learning rate 0.0001; average loss 0.01–0.02). Then I prompted these models with the beginning (first stanza or equivalent) of each of the 8 original poems at temperatures ranging from 0.8 to 1.8 to see how the models created new content from the prefixes. I noticed quite a bit of overfitting, as many runs simply repeated the entire poem verbatim given the first stanza. I failed to find many deviations for the Wallace Stevens poem “Thirteen Ways of Looking at a Blackbird”, possibly because the form is strictly bound by the Roman numerals. I also saw no noticeable difference between 355M and 124M, and decided to work thereafter with the 124M model for its smaller number of parameters and greater variation in the text generated.
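The temperature range quoted above (0.8–1.8) controls how far sampling deviates from the model’s most-probable continuation. A minimal sketch of the underlying mechanism, in plain Python (the function name and example logits are illustrative, not from the paper): logits are divided by the temperature before the softmax, so temperatures above 1 flatten the next-token distribution and make verbatim repetition less likely, while lower temperatures sharpen it toward the memorized poem.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities, scaled by 1/temperature.

    T < 1 sharpens the distribution (closer to greedy decoding);
    T > 1 flattens it (more surprising, less repetitive samples).
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)                            # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, 0.8)    # sharper: top token dominates
high = softmax_with_temperature(logits, 1.8)   # flatter: rare tokens gain mass
```

At temperature 0.8 the top token takes most of the probability mass, which is consistent with the overfitted runs that reproduced the training poem verbatim; at 1.8 the tail tokens gain mass, producing the greater variation observed with the 124M model.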