“Gpt-2-Poetry”, 2019-03-04:
I used `download-urls.py` to quickly download the HTML from poetryfoundation.org based on the URLs in `romantic-urls.txt`.

Then I used `Parse Poetry.ipynb` to parse the HTML and extract the title, author, and poem. There are some glitches here with newlines being rendered in some places they shouldn’t, and not being rendered in places where they should. This notebook saves a bunch of text files to output/ that include metadata as the first few lines.

Then I used `Generate GPT-2.ipynb` to generate poems based on random chunks from the poems and the seed words. This notebook saves files to `poems.json` and `generated.json`. To run this notebook, first get GPT-2 running, and drop the notebook in the `gpt-2/src/` directory.

Both Python notebooks import from `utils` which I have separately pushed here.

Finally, I load `generated.json` and `poems.json` with JavaScript in `index.html` and display the results.
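
As a rough illustration of the intermediate steps (text files with metadata on the first lines, then random chunks used as prompts), here is a minimal Python sketch. It is not the code from the notebooks or from `utils`: the function names `save_poem`, `load_poem`, and `random_chunk`, the exact header layout (title, author, blank line, poem), and the chunk size are all assumptions made for the example.

```python
import random
from pathlib import Path


def save_poem(out_dir, slug, title, author, poem):
    """Write one poem as a text file with metadata on the first lines.

    Assumed layout (hypothetical): title, author, blank line, poem body.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{slug}.txt"
    path.write_text(f"{title}\n{author}\n\n{poem}\n", encoding="utf-8")
    return path


def load_poem(path):
    """Read a file written by save_poem back into a dict."""
    text = Path(path).read_text(encoding="utf-8")
    # The first blank line separates the metadata header from the poem.
    header, _, body = text.partition("\n\n")
    title, author = header.split("\n", 1)
    return {"title": title, "author": author, "lines": body.strip().split("\n")}


def random_chunk(lines, size, rng=random):
    """Return `size` consecutive lines starting at a random offset."""
    if len(lines) <= size:
        return list(lines)
    start = rng.randrange(len(lines) - size + 1)
    return lines[start:start + size]
```

A chunk returned by `random_chunk` could then be joined with the seed words to form the prompt passed to the text generator; how the notebook actually combines them is not specified here.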