Commentary on weaknesses in Midjourney’s new ranking-based personalization feature
Hidden Persuaders: LLMs’ Political Leaning and Their Influence on Voters
Do LLMs estimate uncertainty well in instruction-following?
SimpleStrat: Diversifying Language Model Generation with Stratification
Are Large Language Models Consistent over Value-laden Questions?
Pron vs Prompt: Can Large Language Models already Challenge a World-Class Fiction Author at Creative Text Writing?
Sonnet or Not, Bot? Poetry Evaluation for Large Models and Datasets
AI Doesn’t Kill Jobs? Tell That to Freelancers: There’s now data to back up what freelancers have been saying for months
What Are the Odds? Language Models Are Capable of Probabilistic Reasoning
Consistency-diversity-realism Pareto fronts of conditional image generative models
Self-Consuming Generative Models with Curated Data Provably Optimize Human Preferences
Creativity Has Left the Chat: The Price of Debiasing Language Models
Enhancing Confidence Expression in Large Language Models Through Learning from Past Experience
A Tale of Tails: Model Collapse as a Change of Scaling Laws
The Non-Effect of Sampling Temperature on Problem Solving in GPT-3.5/GPT-4
Does Using ChatGPT Result in Human Cognitive Augmentation?
Experimental Narratives: A Comparison of Human Crowdsourced Storytelling and AI Storytelling
Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking
EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models
Generative artificial intelligence enhances creativity but reduces the diversity of novel content
When ‘A Helpful Assistant’ Is Not Really Helpful: Personas in System Prompts Do Not Improve Performances of Large Language Models
The Impact of Large Language Models on Scientific Discovery: a Preliminary Study using GPT-4
A Coder Considers the Waning Days of the Craft: Coding has always felt to me like an endlessly deep and rich domain. Now I find myself wanting to write a eulogy for it
Large language models can replicate cross-cultural differences in personality
Assessing the nature of large language models: A caution against anthropocentrism
Simple synthetic data reduces sycophancy in large language models
Can a chatbot preach a good sermon? Hundreds attend church service generated by ChatGPT to find out
ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models
Bits of Grass: Does GPT already know how to write like Whitman?
Inducing anxiety in GPT-3.5 increases exploration and bias
Rewarding Chatbots for Real-World Engagement with Millions of Users
Discovering Language Model Behaviors with Model-Written Evaluations
RL with KL penalties is better viewed as Bayesian inference
Situational Awareness and Out-Of-Context Reasoning § GPT-4-Base Has Non-Zero Longform Performance
Here Are 120K 𝑤 Samples from @AydaoAI’s Large Anime Model (aka TADNE) Clustered into a Set of 256 Centroids. 𝘸𝘢𝘵𝘤𝘩 𝘪𝘵 𝘴𝘩𝘪𝘯𝘦
2024-astolfi-figure1-paretofrontierofqualityvsdiversitytradeoffshowsnoconsistentgaininldmimagegenmodelsovertime.jpg
2024-astolfi-figure2-mscocoexamplesdemonstrationcollapseofdiversityinldmtunedimagegenmodels.png
https://nostalgebraist.tumblr.com/post/706390430653267968/weve-been-talking-about-the-blandness-of
https://nostalgebraist.tumblr.com/post/706441900479152128/novel-writing-chatgpt-vs-code-davinci-002
https://nostalgebraist.tumblr.com/post/728556535745232896/claude-is-insufferable
https://thezvi.wordpress.com/2024/02/27/the-gemini-incident-continues/
https://www.astralcodexten.com/p/constitutional-ai-rlhf-on-steroids
https://www.frontiersin.org/journals/robotics-and-ai/articles/10.3389/frobt.2017.00071/full
https://www.lesswrong.com/posts/3ou8DayvDXxufkjHD/openai-api-base-models-are-not-sycophantic-at-any-size
https://www.lesswrong.com/posts/DfqcyGXcFcukYbWZ5/i-measure-google-s-musiclm-over-3-months-as-it-appears-to-go
https://www.lesswrong.com/posts/Fgzh2wLmvsBDmiFcN/sheikh-abdur-raheem-ali-s-shortform?commentId=ZtLC5dTTKrwLJxCBf
https://www.lesswrong.com/posts/MJyud5Qs6MheDemfE/artifex0-s-shortform?commentId=DzQapZEhTHxtjgbxh
https://www.lesswrong.com/posts/t9svvNPNmFf5Qa3TA/mysteries-of-mode-collapse#pfHTedu4GKaWoxD5K
https://www.lesswrong.com/posts/tbJdxJMAiehewGpq2/impressions-from-base-gpt-4
https://www.reddit.com/r/ApplyingToCollege/comments/1h0vhlq/in_the_past_three_days_ive_reviewed_over_100/
https://www.reddit.com/r/LocalLLaMA/comments/1ftn6s1/all_llms_are_converging_towards_the_same_point/
https://www.reddit.com/r/LocalLLaMA/comments/1fuxw8d/just_for_kicks_i_looked_at_the_newly_released/
https://www.reddit.com/r/mlscaling/comments/1gyb54z/the_fate_of_gpt4o/
https://www.reddit.com/r/reinforcementlearning/comments/1dhkn9o/creativity_has_left_the_chat_the_price_of/l8xisr7/
https://www.wired.com/story/confessions-viral-ai-writer-chatgpt/
https://time.com/7026050/chatgpt-quit-teaching-ai-essay/
https://www.newyorker.com/culture/the-weekend-essay/why-ai-isnt-going-to-make-art
https://www.thisamericanlife.org/832/transcript#act2
https://www.nytimes.com/interactive/2023/11/12/magazine/andrew-wylie-interview.html
https://arxiv.org/abs/2308.03958#deepmind
https://time.com/6301288/the-ai-jokes-that-give-me-nightmares/
https://arxiv.org/pdf/2303.08774#page=12&org=openai
https://www.lesswrong.com/posts/t9svvNPNmFf5Qa3TA/mysteries-of-mode-collapse-due-to-rlhf#Inescapable_wedding_parties
Wikipedia Bibliography: