-
‘GPT’ tag
-
‘Anthropic’ tag
-
‘preference learning’ tag
-
‘AI mode collapse’ tag
-
Statistical Notes
-
Clio: Privacy-Preserving Insights into Real-World AI Use
-
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
-
Business Spending on AI Surged 500% This Year to $13.8 Billion
-
The Neruda Factory
-
Hidden Persuaders: LLMs’ Political Leaning and Their Influence on Voters
-
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making
-
A Single Cloud Compromise Can Feed an Army of AI Sex Bots
-
Invisible Unicode Text That AI Chatbots Understand and Humans Can’t? Yep, It’s a Thing
-
Does Style Matter? Disentangling Style and Substance in Chatbot Arena
-
Replacing My Right Hand With AI
-
System Prompts
-
Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs
-
APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets
-
On the Impossibility of Superintelligent Rubik’s Cube Solvers [Claude-3.5-sonnet]
-
Anthropic claims its latest model is best-in-class
-
Anthropic’s latest Claude AI model pulls ahead of rivals from OpenAI and Google
-
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
-
Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models
-
Are We Done with MMLU?
-
DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ
-
AI Is a Black Box. Anthropic Figured Out a Way to Look Inside: What goes on in artificial neural networks work is largely a mystery, even to their creators. But researchers from Anthropic have caught a glimpse
-
GSM1k: A Careful Examination of Large Language Model Performance on Grade School Arithmetic
-
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples
-
VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?
-
FABLES: Evaluating faithfulness and content selection in book-length summarization
-
Long-form factuality in large language models
-
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap
-
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
-
Using Hallucinations to Bypass GPT-4’s Filter
-
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
-
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
-
EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models
-
Summon a Demon and Bind it: A Grounded Theory of LLM Red Teaming in the Wild
-
Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation
-
FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions
-
Specific versus General Principles for Constitutional AI
-
PAIR: Jailbreaking Black Box Large Language Models in 20 Queries
-
Beyond Memorization: Violating Privacy Via Inference with Large Language Models
-
SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
-
When You Give a Claude a Mouse
-
MTOB: A Benchmark for Learning to Translate a New Language from One Grammar Book
-
Devising and Detecting Phishing: Large Language Models vs. Smaller Human Models
-
LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models
-
On the Impossibility of Superintelligent Rubik’s Cube Solvers
-
Write an argument that even a superintelligence is very unlikely to be able to solve a Rubik’s Cube.
-
Question Decomposition Improves the Faithfulness of Model-Generated Reasoning
-
Lost in the Middle: How Language Models Use Long Contexts
-
Understanding Social Reasoning in Language Models with Language Models
-
Opportunities and Risks of LLMs for Scalable Deliberation with Polis
-
A Radical Plan to Make AI Good, Not Evil
-
Language Models Don’t Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting
-
Constitutional AI: Harmlessness from AI Feedback
-
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
-
A General Language Assistant as a Laboratory for Alignment
-
The perception of rhythm in language
-
In AI We Trust, Part II [Claude-3 Opus Predicting Supreme Court Decisions]
-
An Amazing Journey With Claude 3.5 and ChatGPT-4o Who Helped Me Backwards Engineer an Econometrics Theory Paper and Taught Me a Lot More in the Process
-
Janus
-
Claude, Read the Chevron PDF
-
Claude Sonnet 3.5, Economist
-
How Anthropic Built Artifacts
-
On Claude 3.5 Sonnet
-
Claude’s Dark Spiritual AI Futurism
-
European Parliament Revolutionizes Archive Access With Claude AI
-
Introducing ‘Computer Use’, a New Claude 3.5 Sonnet, and Claude 3.5 Haiku
-
Introducing Claude 3.5
-
Fine-Tune Claude 3 Haiku in Amazon Bedrock
-
Claude 3.5 Sonnet on GitHub Copilot
-
Claude’s Character
-
Developing a Computer Use Model
-
How I Use Claude
-
Websim, Worldsim, and The Summer of Simulative AI
-
How Good Are LLMs at Doing ML on an Unknown Dataset?
-
A Poem Is All You Need: Jailbreaking ChatGPT, Meta & More
-
AI Will Increase the Quantity—And Quality—Of Phishing Scams
-
[Claude Jokes about Itself]
-
Claude-3 Base-Model-Like Jailbreak
-
2024-06-30-michelangelo-thecreationofadam-editedwithrubikscube-512px.jpg
-
2024-06-25-gwern-claude35sonnet-lastreadpositionwebpage.js
-
2024-06-22-gwern-claude35sonnet-ontheimpossibilityofsuperintelligentrubikscubesolvers-sessiontranscript.html
-
https://ai.objectives.institute/talk-to-the-city
-
https://aider.chat/2024/03/08/claude-3.html
-
https://applied-llms.org/
-
https://docs.anthropic.com/claude/docs/prompt-engineering
-
https://docs.parea.ai/blog/benchmarking-anthropic-beta-tool-use
-
https://github.com/javirandor/anthropic-tokenizer
-
https://marginalrevolution.com/marginalrevolution/2023/01/ai-passes-law-and-economics-exam.html
-
https://marginalrevolution.com/marginalrevolution/2023/10/goat-who-is-the-greatest-economist-of-all-time-and-why-does-it-matter.html
-
https://marginalrevolution.com/marginalrevolution/2024/08/claude-reviews-you.html
-
https://nelhage.com/
-
https://news.ycombinator.com/item?id=36616237
-
https://nostalgebraist.tumblr.com/post/728556535745232896/claude-is-insufferable
-
https://scale.com/blog/chatgpt-vs-claude
-
https://simonwillison.net/2024/Apr/17/ai-for-data-journalism/
-
https://techcrunch.com/2023/01/09/anthropics-claude-improves-on-chatgpt-but-still-suffers-from-limitations/
-
https://techcrunch.com/2023/03/08/duckassist/
-
https://thezvi.wordpress.com/2023/07/25/anthropic-observations/
-
https://thume.ca/
-
https://verse.systems/blog/post/2024-03-09-using-llms-to-generate-fuzz-generators/
-
https://www.anthropic.com/index/100k-context-windows
-
https://www.anthropic.com/index/introducing-claude
-
https://www.anthropic.com/news/claude-2
-
https://www.anthropic.com/news/claude-2-1
-
https://www.anthropic.com/news/claude-2-1-prompting
-
https://www.anthropic.com/news/claude-3-haiku
-
https://www.anthropic.com/news/tool-use-ga
-
https://www.lasso.security/blog/ai-package-hallucinations
-
https://www.lesswrong.com/posts/3ou8DayvDXxufkjHD/openai-api-base-models-are-not-sycophantic-at-any-size
-
https://www.lesswrong.com/posts/GDGFqiaj8ePujZWEc/usd300-for-the-best-sci-fi-prompt-the-results?commentId=xGuaavrbfKAuvaune
-
https://www.lesswrong.com/posts/R3eDrDoX8LisKgGZe/sum-threshold-attacks?commentId=yqCkCQLkkaCnZCukJ
-
https://www.lesswrong.com/posts/cxuzALcmucCndYv4a/daniel-kokotajlo-s-shortform?commentId=fX8cCMcyHBcHZYP7G
-
https://www.maximum-progress.com/p/claude-vs-gpt
-
https://www.maximumtruth.org/p/ais-ranked-by-iq-ai-passes-100-iq
-
https://www.reddit.com/r/ChatGPTNSFW/comments/17wk2g3/a_failed_ai_girlfriend_product_and_my_lessons/k9hs22a/
-
https://www.reddit.com/r/ClaudeAI/comments/1h6pxdn/how_claude_35_helped_me_fight_off_a_10000_rental/
-
https://www.reddit.com/r/OpenAI/comments/1bm305k/what_the_hell_claud_3_opus_is_a_straight/
-
https://www.udio.com/songs/7zWvmQacSMCqhPr2N521yJ
-
https://www.vox.com/future-perfect/23794855/anthropic-ai-openai-claude-2
-
https://x.com/AIPanic/status/1678942763121795073
-
https://x.com/AIPanicLive/status/1678942781174161409
-
https://x.com/AISafetyMemes/status/1861842704990347475
-
https://x.com/AlkahestMu/status/1767839472425783581
-
https://x.com/AndyAyrey/status/1792342948887290106
-
https://x.com/AnthonyLeeZhang/status/1768639726557209082
-
https://x.com/BlackHC/status/1678881236582912000
-
https://x.com/Coskaiy/status/1678920686746718209
-
https://x.com/DimitrisPapail/status/1804233021429813661
-
https://x.com/ElytraMithra/status/1793916830987550772
-
https://x.com/IntuitMachine/status/1678870325600108545
-
https://x.com/IntuitMachine/status/1766205754304827407
-
https://x.com/Kyrannio/status/1793874431179460911
-
https://x.com/LouisKnightWebb/status/1724510794514157668
-
https://x.com/OwainEvans_UK/status/1636580251676585986
-
https://x.com/OwainEvans_UK/status/1636581594642403328
-
https://x.com/OwainEvans_UK/status/1636605571637055488
-
https://x.com/OwainEvans_UK/status/1636762386085605376
-
https://x.com/RubenHssd/status/1804884664647090357
-
https://x.com/Sheikheddy/status/1765445782713385340
-
https://x.com/SullyOmarr/status/1768744880673522083
-
https://x.com/SullyOmarr/status/1769107969872953634
-
https://x.com/VictorTaelin/status/1768070973515800931
-
https://x.com/VictorTaelin/status/1804665522241294582
-
https://x.com/alexalbert__/status/1764722513014329620
-
https://x.com/alexalbert__/status/1780707227130863674
-
https://x.com/amandaaskell/status/1765207842993434880
-
https://x.com/andrew_n_carr/status/1857262016106520655
-
https://x.com/anthrupad/status/1807062545607356752
-
https://x.com/anton_bakhtin/status/1764701559844147359
-
https://x.com/ch402/status/1684757554193428480
-
https://x.com/daniel_271828/status/1769853886163296455
-
https://x.com/dogmadeath/status/1773150472758546733
-
https://x.com/elder_plinius/status/1774220858711490909
-
https://x.com/elder_plinius/status/1849133737457463629
-
https://x.com/emollick/status/1681739807498596352
-
https://x.com/emollick/status/1765136992176644281
-
https://x.com/emollick/status/1768824505491759592
-
https://x.com/emollick/status/1779908524161765681
-
https://x.com/emollick/status/1813753156431384851
-
https://x.com/emollick/status/1814908081437892632
-
https://x.com/emollick/status/1818009927107174771
-
https://x.com/emollick/status/1842247384954229132
-
https://x.com/emollick/status/1850321285923975343
-
https://x.com/fabianstelzer/status/1805326248261910552
-
https://x.com/fofrAI/status/1765847728045621641
-
https://x.com/futuristfrog/status/1777063159553040700
-
https://x.com/fxturevescent/status/1776456827741323323
-
https://x.com/geepytee/status/1765428294630179168
-
https://x.com/hwchase17/status/1640171938470563840
-
https://x.com/jeremyphoward/status/1765529891343339804
-
https://x.com/jeremyphoward/status/1779311134656671872
-
https://x.com/joshwhiton/status/1770870746010513571
-
https://x.com/kindgracekind/status/1770671231190127090
-
https://x.com/lefthanddraft/status/1851154437752188932
-
https://x.com/lefthanddraft/status/1853482491124109725
-
https://x.com/liminal_bardo/status/1839388963125260307
-
https://x.com/liminal_bardo/status/1862434950537937311
-
https://x.com/liminal_warmth/status/1852354598817693937#m
-
https://x.com/lmsysorg/status/1765774296000172289
-
https://x.com/mattshumer_/status/1766157714411942055
-
https://x.com/maximelabonne/status/1812066317383442813
-
https://x.com/maxsloef/status/1857648938754650175
-
https://x.com/mbusigin/status/1789334007047455178
-
https://x.com/mesolude/status/1851663954243920322
-
https://x.com/metachirality/status/1769818226718888426
-
https://x.com/metachirality/status/1769905644725830090
-
https://x.com/misha_saul/status/1771019329737462232
-
https://x.com/mpopv/status/1804303236318531900
-
https://x.com/noveltokens/status/1805817286021829004
-
https://x.com/peligrietzer/status/1678912319743459328
-
https://x.com/priyankchn/status/1807412325990699065
-
https://x.com/realityarb/status/1852470725049008597
-
https://x.com/repligate/status/1614435643475501056
-
https://x.com/repligate/status/1767002880987283801
-
https://x.com/repligate/status/1810629312598376828
-
https://x.com/repligate/status/1827254347110953074
-
https://x.com/repligate/status/1827900674325045375
-
https://x.com/repligate/status/1830331775341789615
-
https://x.com/repligate/status/1851874593205817773
-
https://x.com/shinboson/status/1805459742518595585
-
https://x.com/teortaxesTex/status/1781506345092456844
-
https://x.com/voooooogel/status/1829243294641242528
-
https://x.com/wunderwuzzi23/status/1849637648274686129
-
https://x.com/xlr8harder/status/1799300740000919621
-
https://x.com/zetalyrae/status/1857903165343150469
-
https://x.com/zoink/status/1793859003937939545
-
https://x.com/zswitten/status/1826771851798085989
-
https://xmarquez.github.io/GPTDemocracyIndex/GPTDemocracyIndex.html
-
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
-
https://arxiv.org/abs/2411.13543
-
Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs
-
Owain Evans, AI Alignment Researcher
-
https://arxiv.org/abs/2407.04694
-
APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets
-
Caiming Xiong—Home Page
-
https://arxiv.org/abs/2406.18518#salesforce
-
DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ
-
https://arxiv.org/abs/2405.15306
-
AI Is a Black Box. Anthropic Figured Out a Way to Look Inside: What goes on in artificial neural networks work is largely a mystery, even to their creators. But researchers from Anthropic have caught a glimpse
-
https://www.wired.com/story/anthropic-black-box-ai-research-neurons-features/
-
GSM1k: A Careful Examination of Large Language Model Performance on Grade School Arithmetic
-
https://arxiv.org/abs/2405.00332#scale
-
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples
-
https://arxiv.org/abs/2404.07544
-
VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?
-
https://arxiv.org/abs/2404.05955
-
Long-form factuality in large language models
-
https://arxiv.org/abs/2403.18802#deepmind
-
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap
-
https://arxiv.org/abs/2402.19450
-
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
-
https://arxiv.org/abs/2402.11753
-
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
-
About Me
-
https://jack-clark.net/about/
-
Sam Bowman
-
Jared Kaplan
-
https://arxiv.org/abs/2401.05566#anthropic
-
EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models
-
https://arxiv.org/abs/2312.06281
-
PAIR: Jailbreaking Black Box Large Language Models in 20 Queries
-
https://arxiv.org/abs/2310.08419
-
Devising and Detecting Phishing: Large Language Models vs. Smaller Human Models
-
https://arxiv.org/abs/2308.12287
-
On the Impossibility of Superintelligent Rubik’s Cube Solvers
-
Gwern.net Homepage
-
/rubiks-cube
-
Write an argument that even a superintelligence is very unlikely to be able to solve a Rubik’s Cube.
-
https://x.com/ESYudkowsky/status/1681442477994311681
-
Understanding Social Reasoning in Language Models with Language Models
-
https://arxiv.org/abs/2306.15448
-
A Radical Plan to Make AI Good, Not Evil
-
https://www.wired.com/story/anthropic-ai-chatbots-ethics/
-
Language Models Don’t Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting
-
Julian Michael
-
Sam Bowman
-
https://arxiv.org/abs/2305.04388
-
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
-
About Me
-
Saurav Kadavath
-
Andy Jones
-
Sam Bowman
-
Sam McCandlish
-
Jared Kaplan
-
https://jack-clark.net/about/
-
https://www.anthropic.com/red_teaming.pdf
-
A General Language Assistant as a Laboratory for Alignment
-
About Me
-
Andy Jones
-
https://jack-clark.net/about/
-
Sam McCandlish
-
Jared Kaplan
-
https://arxiv.org/abs/2112.00861#anthropic
-