Evaluating Large Language Model Creativity from a Literary Perspective
Prompting AI Art: An Investigation into the Creative Skill of Prompt Engineering
Creativity Has Left the Chat: The Price of Debiasing Language Models
LLMs Still Can’t Plan; Can LRMs? A Preliminary Evaluation of OpenAI’s o1 on PlanBench
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
LMentry: A Language Model Benchmark of Elementary Language Tasks