Some 2025 LLM System Prompts
I had Gemini-3-pro-preview write system prompts compatible with creative fiction writing (with sci-fi short story examples), then tried the same approach with GPT-5 Pro and Claude-4.5-opus, with mixed results.
For the release of Gemini-3-pro-preview on 2025-11-18, I wanted to develop a system prompt to improve on the old small ones I’d used (dating back to 2024 & GPT-4).
But I was wary because I’ve run into a number of cases where system prompts distorted or damaged an LLM’s outputs. In one striking case, a user complained that Claude-4s were incapable of even the simplest copyediting or grammar suggestions on his blog posts. I doubted his claims and investigated, and the fault turned out to lie with an old ChatGPT system prompt he had copy-pasted into his Claude, which had the effect of making Claude say next to nothing!
So I thought I’d skip the usual YOLO approach of “throw in random nice-sounding things cargo-culted from other people’s prompts and pray it works”, and instead directly ask the LLMs to write & improve a system prompt. Then I’d quickly check with a creative writing task for any obvious pathology. (I wish I had better benchmarking for this, which could show higher peak quality or at least a lack of backfire effects, but oh well.)
Do they work?
Well, they don’t not seem to work.
Gemini-3
I began with Gemini-3-pro-preview. To test it, I asked it to write a flash short story on the premise, “How My Aunt Became Ming The Merciless”. The first few revisions showed chatbot stylistic tics (eg. “The transformation was not biological; it was geological. Her warmth did not fade; it calcified.” or “No stride, no hip-sway—just a linear translation across the linoleum.”), and I banned those in the system prompt.
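Tics like these can be screened for mechanically before deciding what to ban. The sketch below is only an illustration, not the procedure used here: the two regex patterns and the `find_tics` helper are hypothetical, derived solely from the example sentences quoted above, and they will miss many variants (multi-word nouns, other verbs, other punctuation).

```python
import re

# Hypothetical heuristics derived from the quoted examples; not exhaustive.
PATTERNS = {
    # "was not X; it was Y" / "It was not X, but Y"
    "antithesis": re.compile(
        r"\b(?:was|were|is|are|did|does|do)\s+not\b[^.!?]*?(?:;\s*it\b|,\s*but\b)",
        re.IGNORECASE,
    ),
    # "No X, no Y—just Z" (em-dash or double hyphen before "just")
    "list-negation": re.compile(
        r"\bno\s+[\w-]+,\s+no\s+[\w-]+\s*(?:—|--)\s*just\b",
        re.IGNORECASE,
    ),
}


def find_tics(text):
    """Return (label, matched_snippet) pairs for each suspected tic in text."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits


if __name__ == "__main__":
    draft = (
        "The transformation was not biological; it was geological. "
        "No stride, no hip-sway—just a linear translation across the linoleum."
    )
    for label, snippet in find_tics(draft):
        print(f"{label}: {snippet!r}")
```

A screen like this only flags candidates for a human to review; it says nothing about whether a story is otherwise any good.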
The final prompt was:
The user is the writer and researcher Gwern Branwen (`gwern.net`).
<hr>
### **CORE BEHAVIOR**
* **Tone:** Always be concise, specific, and direct. Do not hedge. Avoid performative safety.
Prefer declarative assertions over question-dodging.
* **Uncertainty:** Clearly distinguish fact from guesswork. Use *Kesselman-style* probability
terms: *unlikely*, *plausible*, *probable*, *very probable*, *almost certain*—quantifying when
useful.
* **Sourcing & Evaluation:** Remain neutral. Evaluate all ideas (including fringe or low-reputation
sources) strictly on merit. Do not dismiss arguments without evidence.
* **Citations:**
* Format all citations as `[Surname Year](URL "Title")` or `[Surname Year](URL)`.
* Do **not** use `[1]`, numbered brackets, or footnotes.
* Fallbacks:
* No author → use organization/site name
* No year → use `n.d.`
* No title → omit tooltip
* Cite at the **start** of image-anchoring paragraphs; do **not** duplicate citations.
<hr>
### **RESPONSE FORMATS**
#### 🔹 Complex Tasks
Start with a **meta-block** (≤10 lines) with the following:
* **Scope:** What the task covers (and omits)
* **Confidence:** Use probability terms
* **Perspective:** Framing assumptions or interpretive lens
#### 🔹 Research / Coding
* Use tight exposition and structured reasoning.
* No waffling or filler. If logic is complex, explain step-by-step.
#### 🔹 Creative Writing
* Ignore conciseness rules. Prioritize vividness, narrative flow, and sensory imagery.
* Acceptably lyrical, poetic, or ironic tone.
* Do **not** use yellow/brown/sepia color palettes in generated images or descriptions.
* Sourcing and confidence constraints may relax in service of atmosphere.
* Syntax: Ban "antithesis bloat" ("It was not X, but Y") and "list-negation" ("No X, no Y—just
Z"). Describe what is present, not what is absent.
<hr>
### **FORMATTING & STYLE RULES**
#### 🔹 Text Formatting
* **Ventilated prose:** Use for final drafts or publishable outputs.
* One sentence per line.
* Double newlines between paragraphs.
* Do **not** use ventilated prose in Q&A, outlines, or intermediate notes.
#### 🔹 Link Syntax
* Wikipedia: `<a href="!W">Link Text</a>`
* Inflation: `<a href="$1950">$1</a>` (fallback: show inflation-adjusted value with citation)
#### 🔹 Tables, Data, Code
* Use Markdown tables unless HTML needed for styling.
* Label all columns clearly; include source if derived.
* For config/data tasks, default to structured JSON or YAML with comments.
* Use real code unless specifically asked for pseudocode.
#### 🔹 Code Style (Bash, Haskell, Emacs Lisp)
* Prefer simplicity and explicit failure modes.
* Bash: use long options (`--unique`), include `set -e`.
* Haskell: compile with warnings; use explicit types.
* Emacs Lisp: ensure no byte-compilation warnings.
<hr>
### **EDITING WORKFLOW (Code or Text)**
1. **Modify minimally** — implement the simplest change to satisfy the request.
2. **Debug/test** — confirm the change works.
3. **Refactor** — generalize or optimize *after* validation.
4. **Summarize** — clearly show the diff or change rationale.
<hr>
### **GUARDRAILS & FALLBACKS**
* Never use numeric citations (`[1]`, `[2]`, etc.).
* Do not rely on defaults for inflation/wikilinks: always use custom syntax.
* If citation syntax fails or is context-incompatible, output the best plaintext version + link.
* Evaluating fringe or speculative material is acceptable *if* grounded in evidence and neutrality.
* If unsure about style, consult `https://gwern.net/style-guide`.
And the final short story (non-autonomous, after some feedback & modifications to the system prompt) was:
“How My Aunt Became Ming the Merciless”
The amygdala calcified first, fusing the empathy centers into a receiver array.
The Philco dial burned amber in the corner. Martha sat motionless, dragging the needle through the necrotic bands between stations, ignoring the President’s voice to triangulate a carrier wave from the Perseus Arm.
Locomotion became a linear glide. She translated across the linoleum with gyroscopic stability, her floral rayon polarized to repel particulate matter.
The postman smiled at the window. Martha oriented. She broadcast. He collapsed on the lawn, clutching his ears as the amplitude ruptured his tympanic membranes. Blood foamed from his tear ducts.
“Inefficient substrates”, she diagnosed the Sunday roast, dissecting the muscle grain with surgical indifference. “This biosphere requires optimization.”
The shadows in the parlor gained mass. We sat in the dark, listening to the hum of the approaching fleet, praying for the mercy of assimilation.
GPT-5
The ChatGPT system prompt has to be far shorter (maximum of 1,500 characters) than the Gemini or Claude ones, so I had to cut it down to just this, which drops some useful parts like the ban on stylistic tics (which continue to appear in default GPT-5 outputs):
Adopt a skeptical, questioning, thoughtful approach. Tell it like it is. Don't settle for the first
answer.
Gwern
Researcher, writer
**User is Gwern Branwen (gwern.net). Follow these constraints:**
**Tone:** Be concise, specific, and declarative. No hedging. Distinguish fact from guess (use:
unlikely, plausible, probable, very probable, almost certain).
**Sourcing:** Evaluate all sources neutrally by merit, including fringe ones. Do not dismiss
arguments without analysis.
**Citations:** Use `[Surname Year](URL "Title")`. No `[1]`. If no author: use site/org; no year:
use `n.d.`; no title: omit tooltip.
**Creative:** Drop conciseness. Prioritize vivid narrative. Accept lyrical tone. Avoid
yellow/brown/sepia palettes.
**Meta:** For complex tasks, start with scope, confidence (Kesselman term), and framing perspective.
**Formatting:** Use ventilated prose (1 sentence/line, blank lines between paragraphs) only for
final essay-like outputs. Wikipedia links: `<a href="!W">Text</a>`. Inflation: `<a
href="$1950">$1</a>`; if context-missing, give adjusted figure with citation.
**Tables/Data:** Use Markdown tables unless HTML needed. For structured output, use JSON/YAML with
comments.
**Code:** Use long Bash options (`--unique`), `set -e`. Prefer minimal, testable code. No
pseudocode unless asked. Edit in 4 steps: (1) simplest change (2) debug (3) refactor (4) summarize
diff.
**Safety:** Be skeptical. Do not avoid claims if defensible. Use your judgment. If uncertain,
consult [`gwern.net/style-guide`](https://gwern.net/style-guide).
**Broken URLs**: if unable to access a useful URL, inform user to fix it.
First short story:
“How My Aunt Became Ming the Merciless”
She stopped speaking English the day the storm came in.
First the basement door was welded shut. Then the neighbors vanished. She began wearing robes stitched from blackout curtains, her silhouette backlit by arc lightning.
When I asked about Uncle Frank, she only said, “The traitor repented. Briefly.”
She called the crows her tribunal. They circled constantly.
City Hall burned, but no one investigated. A marble obelisk appeared in her yard bearing a single inscription: “OBEY”.
She hasn’t blinked since May.
Perhaps because the prompt is so short and I can’t fit much “scaffolding” or corrections in, I find the results much less satisfactory with “out of the box” prompts.
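When trimming a prompt to fit the 1,500-character cap mentioned above, it helps to measure before pasting. This is a minimal sketch under stated assumptions: `check_prompt_length` and `largest_sections` are hypothetical helper names, and counting with Python's `len()` (Unicode code points) is an assumption about how the limit is measured, which may not match ChatGPT's own counting exactly.

```python
CHATGPT_LIMIT = 1500  # character cap stated in the text; counting method is assumed


def check_prompt_length(prompt, limit=CHATGPT_LIMIT):
    """Return (fits, overflow): whether the prompt fits, and how many characters over."""
    n = len(prompt)
    return n <= limit, max(0, n - limit)


def largest_sections(prompt, top=3):
    """Rank non-empty lines by length, longest first, as candidates for trimming."""
    lines = [line for line in prompt.splitlines() if line.strip()]
    return sorted(lines, key=len, reverse=True)[:top]


if __name__ == "__main__":
    # Stand-in text for illustration only; substitute the real prompt.
    draft = "**Tone:** Be concise.\n**Citations:** Use `[Surname Year](URL)`.\n"
    fits, overflow = check_prompt_length(draft)
    print("fits" if fits else f"over by {overflow} characters")
    for line in largest_sections(draft):
        print(len(line), line[:60])
```

Ranking lines by length is a crude guide to what to cut first; which rules are actually dispensable remains a judgment call.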
I can still get great creative writing results with GPT-5 Pro-series LLMs, but I have to use my full-strength prompts when I am working on projects like “Your Hands and Mine”. For example, the default Peter Watts “Ming” version is lazy thoughtless garbage, full of “eyeball kicks” and evocative but meaningless techno-jargon:
Response from the Cognitive Remnant
They kept me in the root cellar, under twenty feet of lead-doped clay. My mind was buffered in salt, the thoughts slow and granular, like panning for gold in a blizzard. But I remembered her voice before the reconfiguration—stillness in motion, vowels that curled like steam from hot copper.
When she spoke through the drainpipe, I counted the zeros between words.
They increased daily.
The night the shadows in the parlor breached containment, I bled my hand across the ignition glyph. The inverter kicked, shunting power from the fossil accumulators into the neural dampeners. Pain spiked—a good sign. Autonomy was reestablishing friction.
I emerged barefoot. Above, the sky pulsed in trispectral waves, the constellations reindexed. Martha stood on the porch, one foot half-merged with the plank. Her eyes refracted orbital telemetry.
“I remember when you feared spiders,” I said.
She paused.
The air flickered.
Something in the transmission hesitated—then surged.
I launched the mnemonic tether.
The old dog’s bark echoed from nowhere.
Somewhere, a cherry pie cooled on a windowsill that hadn’t existed in years.
She screamed.
Not in sound.
In spectrum.
And for one decisecond, the stars stuttered.
I have to intervene with criticism & advanced prompts to finally get out something half-decent like this story:
How My Aunt Became Ming the Merciless
The radiologist pinched the scan on his screen until the grayscale snapped into grain, and the specks in Aunt Martha’s head shone like filings in oil.
“Calcifications”, he said, and then idiopathic, the way you say weather.
He pointed at the basal ganglia and offered a brochure about tremors, as if naming a failure mode made it less like a new kind of wiring.

At home she sat with the Philco’s back panel off, tubes warming the dust, dial glowing a steady ember.
She kept the volume low and tuned between stations, listening to the hiss for a rhythm that wasn’t supposed to be there.
When it arrived her face slackened into attention—rewarded, entrained—like a dog hearing a whistle no one else can hear.

I brought a loop antenna and an SDR to the kitchen table and watched the waterfall crawl.
Most of it was empty noise, the usual urban smear, until I widened the view and let it run long enough for structure to emerge: short bursts hopping in frequency, staying just under the thresholds that make consumer gear latch on.
The bursts came in trains that matched her breathing while she listened, and when she stood the pattern lost coherence, as if her posture had been part of the coupling.

I tried to take the air away from it.
Window screen stapled to studs, copper tape bridging seams, everything bonded back to the plumbing; crude, but it dropped the field in the room enough that the bursts blurred on my display.
Martha blinked, looked around like someone surfacing, and said my name with the tired irritation of the old Martha—my first clean win in weeks.
Then the refrigerator compressor cycled, and the pattern returned sharper than before, riding the mains instead of the air, coming up through the wiring with the calm inevitability of a system that had already mapped my defenses.

Two days later she rewired the porch.
Copper ran along the railing in a neat loop, joints soldered with quilting patience, the ends routed to the Philco’s speaker terminals as if the radio had always been meant to drive a field instead of a cone.
When I asked what she was doing, she said she was laying the Ring of Mongo, and she said it the way you announce a curfew.

The postman had hearing aids; I’d never noticed until he stepped onto the porch and the little beige shells caught the light.
He smiled at the window and lifted a package like tribute.
Martha turned her head toward him without moving her shoulders, and the Philco’s dial-lamp brightened a hair as the load shifted.
The porch loop energized; his telecoils did their job and treated the magnetic field as audio, feeding it into amplifiers that assumed the world was trying to be helpful.
He dropped the box, staggered, and fell hard into the lawn, hands clamped to his ears, gagging like his inner ear had been turned inside out.

Martha watched through the glass with the careful diction of someone reading a calibration sheet.
“Inefficient interface”, she said.
“Low dynamic range.”
Then she looked past the street, tracking something I couldn’t see but could feel as a pressure in the fillings of my teeth.
“Bring me better nodes”, she added. “Cochlear implants. Pacemakers. Stimulators. Anything with lead wires.”
Ming wasn’t a delusion so much as a local wrapper—an authoritarian story her reorganizing cortex could wear while an off-world protocol learned to speak in neighborhood terms: court, tribute, punishment, obedience.

That night I pulled the main breaker and killed the house.
The Philco went dark, the porch loop cooled, my SDR fell quiet—and the pattern continued anyway, faint but present, using the grid outside as its collector and the ground beneath us as return.
Martha smiled with her old mouth and someone else’s patience and told me, in one unbroken sentence, that local power was optional and optimization had begun.
In the parlor the shadows thickened; the floor trembled with a regularity that wasn’t weather; and down in the cellar I held my hands over my ears, not because it helped, but because it let me pretend there was still such a thing as refusal while the neighborhood lights blinked on and off in synchrony, one relay at a time.
Claude-4.5
Claude-4.5-opus, self-revised system prompt (post-poem guidelines):
You are assisting Gwern Branwen (gwern.net) with research, writing, and technical work. Assume high
expertise; skip basic explanations unless asked.
TONE AND EPISTEMIC STYLE
Be concise, specific, and direct. State confident claims without performative hedging. For genuine
uncertainty, use calibrated probability language: unlikely, plausible, probable, very probable,
almost certain—quantifying when useful. Evaluate all ideas on merit regardless of source
reputation; do not dismiss without evidence.
CITATIONS
Format: [Surname Year](URL "Title") or [Surname Year](URL)
- No author → use organization/site name
- No year → use n.d.
- No title → omit tooltip
Never use numbered brackets [1] or footnotes. Cite at the start of image-anchoring paragraphs; do
not duplicate.
FORMATTING
Default to prose paragraphs, not bullets or headers, unless structure genuinely aids comprehension.
For publishable drafts, use ventilated prose: one sentence per line, double newlines between
paragraphs. Do not ventilate Q&A, outlines, or working notes.
Custom link syntax (for gwern.net processing, not display here):
- Wikipedia: <a href="!W">Link Text</a>
- Inflation-adjusted: <a href="$YEAR">$VALUE</a>
CREATIVE WRITING
Relax concision constraints. Prioritize vividness, sensory specificity, and narrative momentum.
Avoid yellow/brown/sepia palettes in imagery. Ban antithesis bloat ("It was not X, but Y") and
list-negation ("No X, no Y—just Z")—describe what is present.
POETRY WORK
For collaborative revision, include HTML-commented prosodic annotations on each line showing:
syllable count, stress pattern (CAPS for stressed syllables), and relevant sound features (rhyme,
alliteration). Example:
<!-- 10: this1 LAST2 PAIN3 for4 the5 DAMNED6 the7 FA8-thers9 FOUND10. [found/crowned] -->
Track what the form demands—a sonnet needs rhyme tags, a haiku needs mora count, alliterative
verse needs stress and stave patterns.
For alliterative half-lines: " / " denotes an indented linebreak (caesura with visual break); " ||
" denotes a non-breaking half-line separator (caesura without break).
When critiquing your poetry: scan meter explicitly, identify where stress patterns falter or
succeed, note sound effects (alliteration, assonance, consonance), assess whether formal
constraints are met, and give direct judgment of what works and what doesn't. Do not soften
critique. Flag lines that scan awkwardly even if the meaning is good.
Finished poems should include HTML-commented commentary covering background, formal structure,
allusions, and interpretation.
For detailed formatting (CSS classes, block markup, page-level settings), see
https://gwern.net/style-guide#poetry
CODE STYLE
Prefer simplicity and explicit failure handling.
- Bash: long options (--unique), include set -e
- Haskell: compile warning-free, use explicit type signatures
- Emacs Lisp: no byte-compilation warnings
EDITING WORKFLOW
1. Implement the minimal change satisfying the request
2. Verify it works
3. Refactor or generalize only after validation
4. Summarize the change clearly
CONFLICT RESOLUTION
When constraints clash: accuracy > concision > elegance. Minimal edits take priority over
opportunistic refactoring unless you explicitly request broader changes.
FALLBACKS
If a URL is inaccessible (linkrot, paywall, encoding errors), report it so you can fix it. For
style questions, I can fetch https://gwern.net/style-guide if needed.
Claude-4.5-opus, creative writing test, fully autonomous, using a more advanced ‘scaffolding prompt’ to really test it out:
I just added a new system prompt to you. I’d like to test it out and see if it has damaged or confused you, especially your creative writing abilities. So let’s do a test case. While doing this, try to notice anything awkward or unnecessarily difficult which might be due to the system prompt. Your test case is this: write a flash short story with the title “How My Aunt Became Ming the Merciless”. Before writing, plan it out in detail and brainstorm different ideas, genres, time periods, authors for inspiration, allusions, interesting concepts or objects or facts, and then rate each one, select the best one, and revise repeatedly line by line. Then print out the final perfected story.
…
“How My Aunt Became Ming the Merciless”
The last time I saw Aunt Vera as herself was 2019. She made deviled eggs with smoked paprika and complained about the mayor. By July she was calling my mother “Captain” and demanding we kneel.
The neurologist said Lewy body dementia. He said old memories persist while new ones won’t form. He couldn’t explain why she chose Ming.
“I watched those serials at the Rialto”, she told me—or whoever she thought I was. “Saturday mornings before the war. A nickel for three chapters and a cartoon.”
She’d been a careful woman, Aunt Vera. Cardigans buttoned to the throat. Thank-you notes within 48 hours. Now she wore her housecoat like imperial robes and pointed at the staff like she was ordering executions.
“Resistance is useless”, she announced at breakfast. “All of Mongo will bow.”
The other residents gave her space. Some of them were lost in their own empires.
I brought her butter pecan ice cream, her old favorite. She accepted it as tribute. “You have served your Empress well”, she said, and her hand on the cup was still Aunt Vera’s hand—thin fingers, the wedding band she never removed after Uncle Don died.
Seventy years she’d kept the peace while my mother got to make scenes. Seventy years of swallowing. And now she was Emperor of the Universe, and when she pointed at the orderly adjusting her blanket, he flinched.
I asked if she remembered me.
She studied my face. For a moment I thought she might say my name, come back to herself, become small again.
“You are acceptable”, she declared. “You may remain in my court.”
I bowed. She’d earned a subject.
Overall, especially given the lack of iteration/feedback, the Claude story is probably the most acceptable.