“This Word Does Not Exist”, 2020-05-13:
[GPT-2-117M-distilled samples generated after training on a dictionary and heavily filtered to try to remove existing words (source). Example:
pellum (noun)
the highest or most important point or position
“he never shied from the pellum or the right to preach”
Discussion: HN, /r/ML, Github]
…Most of the project was spent applying a number of rejection tricks to get good samples, e.g.:
Rejecting samples that contain words that are in the training set / blacklist, to force generation of completely novel words
Rejecting samples whose example usage does not use the word
Running a part-of-speech tagger on the example usage to ensure it uses the word with the correct part of speech
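The three rejection tricks above can be sketched as a single filter. This is a minimal illustration, not the project's actual code: the names `accept_sample`, `tokenize`, `Tagger`, and `toy_tagger` are hypothetical, the first rule is read as applying to the generated headword, the inflection check is a crude assumption, and `toy_tagger` is only a stand-in for a real part-of-speech tagger.

```python
import re
from typing import Callable, List, Optional, Set, Tuple

# Hypothetical interface: a tagger maps tokens to (token, part-of-speech) pairs.
Tagger = Callable[[List[str]], List[Tuple[str, str]]]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep only alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def accept_sample(
    word: str,
    pos: str,
    example: str,
    known_words: Set[str],
    tagger: Optional[Tagger] = None,
) -> bool:
    """Return True only if the generated entry survives every rejection rule."""
    word = word.lower()

    # Rule 1: reject words already in the training set / blacklist.
    if word in known_words:
        return False

    # Rule 2: reject entries whose example usage never uses the word.
    # Assumption: a few regular suffixes count as a use of the word.
    forms = {word, word + "s", word + "es", word + "ed", word + "ing"}
    tokens = tokenize(example)
    hits = [i for i, tok in enumerate(tokens) if tok in forms]
    if not hits:
        return False

    # Rule 3: if a tagger is supplied, the word must appear with the claimed POS.
    if tagger is not None:
        tagged = tagger(tokens)
        if not any(tagged[i][1] == pos for i in hits):
            return False

    return True


def toy_tagger(tokens: List[str]) -> List[Tuple[str, str]]:
    """Stand-in tagger (assumption): a token right after an article is a noun."""
    articles = {"the", "a", "an"}
    return [
        (tok, "noun" if i > 0 and tokens[i - 1] in articles else "other")
        for i, tok in enumerate(tokens)
    ]


known = {"summit", "peak"}
usage = "he never shied from the pellum or the right to preach"
print(accept_sample("pellum", "noun", usage, known, toy_tagger))
print(accept_sample("summit", "noun", "they reached the summit", known, toy_tagger))
```

In practice the tagger argument would wrap a real POS tagger, and the filter would run inside a generate-and-retry loop that discards rejected samples until enough novel words are collected.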