‘Nethack AI’ directory

Links

“Zork-Bench: An LLM Reasoning Eval Based on Text Adventure Games; a Tale As Old As Time, or at Least As Old As Computers”, Aiken 2026

“Playing With AI: How Do State-Of-The-Art Large Language Models Perform in the 1977 Text-Based Adventure Game Zork?”, Gerrits 2026

“My First NetHack Ascension, and Insights into the AI Capabilities It Requires”, Henaff 2025

“BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games”, Paglieri et al 2024

“Playing NetHack With LLMs: Potential & Limitations As Zero-Shot Agents (NetPlay)”, Jeurissen et al 2024

“Diff History for Neural Language Agents”, Piterbarg et al 2023

“Motif: Intrinsic Motivation from Artificial Intelligence Feedback”, Klissarov et al 2023

“Dungeons and Data: A Large-Scale NetHack Dataset”, Hambro et al 2022

“E3B: Exploration via Elliptical Episodic Bonuses”, Henaff et al 2022

“MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research”, Samvelyan et al 2021

“The NetHack Learning Environment”, Küttler et al 2020

“The Tactical Amulet Extraction Bot: Predicting and Controlling NetHack’s Randomness”

“BALROG”

“You Have a Sad Feeling for a Moment, Then It Passes”

“SWAGGINZZZ”

Wikipedia (1)

NetHack

Bibliography

https://www.lowimpactfruit.com/p/zork-bench-an-llm-reasoning-eval: “Zork-Bench: An LLM Reasoning Eval Based on Text Adventure Games; a Tale As Old As Time, or at Least As Old As Computers”, John Aiken

https://arxiv.org/abs/2411.13543: “BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games”, Davide Paglieri, Bartłomiej Cupiał, Samuel Coward, Ulyana Piterbarg, Maciej Wolczyk, Akbir Khan, Eduardo Pignatelli, Łukasz Kuciński, Lerrel Pinto, Rob Fergus, Jakob Nicolaus Foerster, Jack Parker-Holder, Tim Rocktäschel
