See Also
Gwern
“Gwern.net 404 Error Page”, Gwern 2012
“Site Help”, Gwern 2024
“Why So Few Matt Levines?”, Gwern 2024
“Research Ideas”, Gwern 2017
“Why To Not Write A Book”, Gwern 2024
“Design Graveyard”, Gwern 2010
“About This Website”, Gwern 2010
“What Is an ‘AI Warning Shot’?”, Gwern 2024
“Hardware Hedging Against Scaling Regime Shifts”, Gwern 2024
“Number Search Engine via NN Embeddings”, Gwern 2024
Links
“Funding Safe AGI”, Legg 2009
“Looks and Longevity: Do Prettier People Live Longer?”, Sheehan & Hamermesh 2024
“Revisiting the Relationship between Economic Freedom and Development to Account for Statistical Deception by Autocratic Regimes”, Alvarez et al 2024
“Magika: AI-Powered Content-Type Detection”, Fratantonio et al 2024
“Łukasz Kaiser”
“Jakob Uszkoreit”
“Using Static Websites for Tiny Archives”
“Forebruary Perpetual Calendar”
“Embodying Addiction: A Predictive Processing Account”
“Cerner Real-World Data (CRWD): A De-Identified Multicenter Electronic Health Records Database”, Ehwerhemuepha et al 2022
“Traveller Cover”, Workshop 1977
“Using Grocery Data for Credit Decisions”, Lee et al 2024b
“The Temperature of Heaven and Hell [Retrospective]”, Pérez 2001
“Heaven Is Hotter Than Hell & A Refutation”, Simanek 2014
“The Temperature of Heaven and Hell”, Foote 1920
“The Association between Glucose-Dependent Insulinotropic Polypeptide and/or Glucagon-Like Peptide-1 Receptor Agonist Prescriptions and Substance-Related Outcomes in Patients With Opioid and Alcohol Use Disorders: A Real-World Data Analysis”, Qeadan et al 2024
“Paul Darwin Foote (1888–1971) § The Temperature of Heaven & Hell”, Astin 1979 (page 12)
“Theological Engineering Exam”, Anonymous 2024
“GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints”, Ainslie et al 2023
“Language Model Alignment With Elastic Reset”, Noukhovitch et al 2023
“Super(ficial)-Alignment: Strong Models May Deceive Weak Models in Weak-To-Strong Generalization”, Yang et al 2024
“Gemma 2: Improving Open Language Models at a Practical Size”, Riviere et al 2024
“Strategic Insights from Simulation Gaming of AI Race Dynamics”, Gruetzemacher et al 2024
“Inference Scaling for Long-Context Retrieval Augmented Generation”, Yue et al 2024
“MLE-Bench: Evaluating Machine Learning Agents on Machine Learning Engineering”, Chan et al 2024
“The Rise of AI-Generated Content in Wikipedia”, Brooks et al 2024
“CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation”, Xu et al 2024
“SANA: Efficient High-Resolution Image Synthesis With Linear Diffusion Transformers”, Xie et al 2024
“Thinking LLMs: General Instruction Following With Thought Generation”, Wu et al 2024
“Always Measure One Level Deeper: Performance Measurements Often Go Wrong, Reporting Surface-Level Results That Are More Marketing Than Science”, Ousterhout 2018
https://cacm.acm.org/research/always-measure-one-level-deeper/
“Colin Palmer”
“How Not to Bomb Your Offer Negotiation”
“Google DeepMind’s Grandmaster-Level Chess Without Search”
“Imprompter”
“An Estimation of the Absolute Number of Axons Indicates That Human Cortical Areas Are Sparsely Connected”
“The Best RPG Cover of All Time [Traveller 1977]”, Dwiz 2024
“Industrious Dice [Minimizing Pip Counts on Still-Functional Dice]”
https://mathenchant.wordpress.com/2024/10/17/industrious-dice/
“Tian Ge, PhD”
“Furu Wei”
“Thomas Wang”
“Dan Rujescu”
“Using Dictionary Learning Features As Classifiers”
“Cats Are (almost) Liquid!—Cats Selectively Rely on Body Size Awareness When Negotiating Short Openings”, Pongrácz 2024
“A Blog Post Is a Very Long and Complex Search Query to Find Fascinating People and Make Them Route Interesting Stuff to Your Inbox”, Karlsson 2024
“The Early Days of Peer Review: 5 Insights from Historic Reports”
“Robotic Microinjection Enables Large-Scale Transgenic Studies of Caenorhabditis elegans”
“Causal Effect of Video Gaming on Mental Well-Being in Japan 2020–2022”
“Behnam Neyshabur”
“Parachutes Made of Mucus Change How Some Scientists See the Ocean [Microbiome Harvesting?]”
“Sperm Can’t Unlock an Egg Without This Ancient Molecular Key”
“Abraham Lincoln and the First-Person Plural: A Study in Language and Leadership”, Field 2011
“Do Rodents Smell With Sound?”, Mercado & Zhuo 2024
“The Hydra Effect: Emergent Self-Repair in Language Model Computations”, McGrath et al 2023
“Nemotron-4 340B Technical Report”, Adler et al 2024
“Upcycling Large Language Models into Mixture of Experts”, He et al 2024
“SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning”, Lee et al 2024
“‘King of the Geeks’: How Alex Gerko Built a British Trading Titan: XTX Markets Conquered Foreign Exchange Trading and Made Its Russian-Born Founder a Multibillion-Pound Fortune”, Asgari 2024
“The Baby Factory: Difficult Research Objects, Disciplinary Standards, and the Production of Statistical-Significance”, Peterson 2016
“Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback”, Ivison et al 2024
“Resolving Discrepancies in Compute-Optimal Scaling of Language Models”, Porian et al 2024
“Tackling the Abstraction and Reasoning Corpus With Vision Transformers: the Importance of 2D Representation, Positions, and Objects”, Li et al 2024
“FIMO: A Challenge Formal Dataset for Automated Theorem Proving”, Liu et al 2023
“Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach”, Ma et al 2023
“Teaching Large Language Models an Unseen Language on the Fly”, Zhang et al 2024
“DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data”, Xin et al 2024
“Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making”, Li et al 2024
“Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-To-Image Synthesis”, Bai et al 2024
“The Structure of the Token Space for Large Language Models”, Robinson et al 2024
“Three-Dimension Animation Character Design Based on Probability Genetic Algorithm”, Gao 2024
“nGPT: Normalized Transformer With Representation Learning on the Hypersphere”, Loshchilov et al 2024
“The Great New England Vampire Panic: 200 Years After the Salem Witch Trials, Farmers Became Convinced That Their Relatives Were Returning from the Grave to Feed on the Living”, Tucker 2012
“The Economic Way of Thinking in a Pandemic”, Tabarrok 2024
“Language Encodes Geographical Information”, Louwerse & Zwaan 2009
“Grounding the Ungrounded: Estimating Locations of Unknown Place Names from Linguistic Associations and Grounded Representations”, Recchia & Louwerse 2014
“Testosterone Facilitates the Sense of Agency”, Westhuizen et al 2017
“Evaluating the World Model Implicit in a Generative Model”, Vafa et al 2024
“The Spontaneous Emergence of ‘A Sense of Beauty’ in Untrained Deep Neural Networks”, Shu et al 2024
“Computer-Aided Colorization State-Of-The-Science: A Survey”, Cao et al 2024
“Conservative Shift Among High-Exposure Survivors of the September 11th Terrorist Attacks”, Bonanno & Jost 2006
“QuALITY: Question Answering With Long Input Texts, Yes!”, Pang et al 2021
“Language Models Learn to Mislead Humans via RLHF”, Wen et al 2024
“Silicon Valley, the New Lobbying Monster: From Crypto to AI, the Tech Sector Is Pouring Millions into Super PACS That Intimidate Politicians into Supporting Its Agenda”, Duhigg 2024
Wikipedia
Miscellaneous
Bibliography
- 1979-astin.pdf#page=12: “Paul Darwin Foote (1888–1971) § The Temperature of Heaven & Hell”, Astin 1979
- https://arxiv.org/abs/2312.07551: “Language Model Alignment With Elastic Reset”, Noukhovitch et al 2023
- https://arxiv.org/abs/2408.00118#google: “Gemma 2: Improving Open Language Models at a Practical Size”, Riviere et al 2024
- https://arxiv.org/abs/2410.10629#nvidia: “SANA: Efficient High-Resolution Image Synthesis With Linear Diffusion Transformers”, Xie et al 2024
- https://arxiv.org/abs/2406.09279: “Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback”, Ivison et al 2024
- https://arxiv.org/abs/2406.19146: “Resolving Discrepancies in Compute-Optimal Scaling of Language Models”, Porian et al 2024
- https://arxiv.org/abs/2410.06405: “Tackling the Abstraction and Reasoning Corpus With Vision Transformers: the Importance of 2D Representation, Positions, and Objects”, Li et al 2024
- https://arxiv.org/abs/2410.08993: “The Structure of the Token Space for Large Language Models”, Robinson et al 2024
- 2024-tabarrok.pdf: “The Economic Way of Thinking in a Pandemic”, Tabarrok 2024
- 2014-recchia.pdf: “Grounding the Ungrounded: Estimating Locations of Unknown Place Names from Linguistic Associations and Grounded Representations”, Recchia & Louwerse 2014
- https://arxiv.org/abs/2406.03689: “Evaluating the World Model Implicit in a Generative Model”, Vafa et al 2024
- https://www.newyorker.com/magazine/2024/10/14/silicon-valley-the-new-lobbying-monster: “Silicon Valley, the New Lobbying Monster: From Crypto to AI, the Tech Sector Is Pouring Millions into Super PACS That Intimidate Politicians into Supporting Its Agenda”, Duhigg 2024