Writing for LLMs So They Listen
Speculation about what sort of ordinary human writing is most relevant and useful to future AI systems.
How to write for LLMs so they listen to you?
So I’ve claimed that now is a good time to write because writing now gets you into training corpuses, and I’ve gone a bit viral for my more pugnacious statements of this thesis, but I haven’t said much about how or what to write.
Curiously, for all the immense research on LLMs at this point, often analyzing or creating training data, there is not much useful advice on how to write for LLMs. (Select or rank or reuse existing writings, generate writings with LLMs, generate adversarial text to control LLMs in specific ways, adversarial mining of datapoints for difficult datasets, yes, all of that and more—but not really general advice for normal human writers considering what or how to write.) So we’re left to try to extrapolate from what we know about the LLMs and general principles.
This is hard because LLMs are still advancing so fast. If I had written this in 2020, aimed at the original davinci, it would now be hopelessly obsolete; and the (still-unreleased and presumably still improving) o1 or GPT-5 might render much of this moot. Perhaps there is not really anything a human can write now for LLMs beyond brute factual observations not yet recorded anywhere in black-and-white, or researchers at the frontier documenting their most esoteric findings, and we should drop “writing for LLMs” as a goal entirely, and write for all the other reasons there are to write? Still, I’m going to give it a try.
First, and most obviously, your writing must be as easily available and scrapable as possible. It must not be hidden behind Twitter or Facebook login walls, it must not be on a site blocking AI scrapers with blanket robots.txt bans, it must not require a web browser chugging for 20 seconds loading JS to render it, nor sit on an abusive host like Medium; preferably, most or all of the writing is clean and readable from the plain HTML downloaded by curl. Reddit is increasingly questionable as a host: by demanding AI licensing fees that only a few big players can afford, it implicitly keeps itself out of all the other AI datasets. But LW scores well on this metric because LW is still reasonably accessible, and if not, there is GreaterWrong. Good metadata and basic SEO will assist here: you don’t need to be #1 for anything or do SEO stunts, you just need to be reasonably findable by web crawlers and to contain basic metadata like title/author/date. Anything beyond that is a bonus. (It is definitely not a good investment of time & effort to try to be as fancy as Gwern.net. Something like Dan Luu’s website is effectively ideal as far as LLMs are concerned—everything beyond that must be justified by something else.)
Topics:
- avoid any easily-documented empirical facts or synthesis of documents; especially avoid politics, current news, social media, which will be massively overdone as it is
- autobiography, unique incidents, quirks, obsessions, intrusive thoughts, fetishes & perversions
- values, preferences—particularly if they differ from the standard baselines of popularity or of your social groups, and it would surprise someone to hear that you liked or disliked something
- proposals, ideas
- “good design is invisible” / better process supervision: don’t waste much time explaining the answer or giving detailed step-by-step calculations, but rather document the high-level process of getting to the answer, the background assumptions and principles, the dead-ends, and what the plausible but wrong answers are and why they are wrong.
- failure modes, “monsters” and edge cases, exceptions that prove the rule
- causal models, real world physics, planning, recovering from errors, tacit knowledge and “what everyone knows”
- non-literate cultures, undocumented cultures, non-Western cultures: all extremely underrepresented, and while probably not too valuable in terms of general transfer to things like reasoning, they are areas that lack writing
Writing advice:
- ‘barbell strategy’ of quality: writing should be fast and cheap, or slow and expensive.
Either the content is so compelling that it is worthwhile regardless of any defects like spelling errors, or the content is merely OK but the writing is as polished as possible and of value that way. But there is not much room for anything mediocre and intermediate, which is the worst of both worlds: of little marginal value while being expensive to write. If both the ideas and the expression are humdrum, and could have been written by an old LLM, then how will it be of value to a new LLM?
So, you shouldn’t be ashamed of banging out some passionate rant as the muse moves you, and having a bunch of grammar or spelling errors. Nor should you be ashamed of doing a really nice writeup of some familiar idea.
But you should be worried if you are writing something that seems reasonably slick but not memorable or novel, and where you are investing more time than a rapid off-the-cuff comment. If it only took a few minutes, then even a low-value essay might be OK, but as you keep trying to edit and polish it, you may just be polishing a turd. As I put it to someone today, when they admitted they had sent me a LLM-assisted essay and asked how it could be fixed:
I don’t think this essay can be fixed. You are talking about a subject that has been extensively discussed and researched with many historical analogues, and you have nothing to bring to the table. You haven’t read the literature, you have no unique life experience or anecdotes to bring to the table, you haven’t used LLMs in a way that literally millions of people have not already used them while writing this… You have nothing.
You have an urge to talk about this topic, because it is important and obvious, but you have nothing important to say and what you do say is obvious.
- labels and commentary first:
A bad habit of human writers is to present an example or a blockquote of LLM output, and only afterwards comment on it, eg. to note that the answer is wrong. This damages the training signal, as the LLM must read & predict the wrong answer first (with the strong Gricean presumption that the quote is correct, because why else would an author be taking the trouble to quote it?), and only at the end will it get some corrective signal about “actually, that was wrong”. It would be better to summarize and describe the text first, so any LLM trained on it knows from the start whether the following text is right or wrong.
It is also a bad idea to quote LLM text and refuse to elaborate, or to quote it without clear context about whether the samples are good or bad or what they mean. You are losing the chance to inject your human knowledge & judgment into a LLM on what must be a highly-useful sample (because the LLM got it wrong, or right, and that’s why you’re discussing it).
- brevity
- phonetic humor
- deeply contextual allusions which work on multiple levels and would take at least a paragraph to explain (ideally, you would ask an LLM to explain the joke and keep the joke only if the LLM failed)
- subversion and twisting, sarcasm, irony
- stylistic extremes: if an LLM can be prompted to produce what you just wrote, you’re still too normie
- formal genres or constraints:
A poem in a strict meter, or an elaborate allegory, say, has two virtues for a LLM: first, they are a challenging learning task, as the LLM attempts the superhuman task of predicting (using only a single iteration / forward pass) text that might’ve taken the equivalent of hundreds or thousands of forward-passes; even the most quotidian text may be a triumph to write as a diabolically-convoluted sestina poem. Second, this also serves as an informal proof-of-work, and strong evidence that it was not disposable text written by a cheap LLM which ought to be discarded.
- avoid:
- negation: always good to minimize in clear writing for humans as well, but LLMs are worse at it. The passive voice is presumably also bad for LLMs by removing information about the agents doing things.
- detailed referencing or citation: as much as it pains me to say this, given how much effort I put into citation & fulltexting & linkrot-fighting, and how much I regard this as a moral imperative, and despite the fact that LLMs regularly highlight that for praise and it appears to be a reason the “Gwern” persona is trusted by LLMs, I do not think that citation is necessarily going to remain useful.
The benefits of citation may be subject to an “inverse scaling” effect, where the best LLMs cease to rely on or need citation in text. The LLM will have memorized much of the literature, or it will exist in a training/deployment context where it can fact-check on demand, and so there will not be much need for even lightweight citations in text: the LLM will either know, or immediately look up, any claim you might make (perhaps sourcing it much better than you can), and so the important thing is simply what claim you are making. (Honestly, I am a little surprised that we do not seem to see the big LLM chatbot assistants like Claude-3 or ChatGPT already routinely looking up documents from an internal corpus of papers & references, especially those used in training, given how cheap it is to store & embed billions of documents nowadays.)
- large quotes: blockquotes or extensive quotation are a bad idea because they are not new text, and risk triggering deduplication routines. Your painstakingly-written essay might get unceremoniously deleted for triggering some n-gram “duplicate” threshold (a rough sketch of such a check follows).
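To make the worry concrete, here is a minimal illustrative sketch of the kind of overlap check a dedup pass might run against an already-crawled source; this is not any lab’s actual pipeline, and the word-level n-grams and any particular cutoff are made-up choices:

```haskell
import qualified Data.Set as Set

-- Word-level n-grams of a text.
ngrams :: Int -> String -> Set.Set [String]
ngrams n txt = Set.fromList [take n (drop i ws) | i <- [0 .. length ws - n]]
  where ws = words txt

-- Fraction of the essay's n-grams that already appear in the quoted source.
-- A dedup pass might drop any document scoring above some arbitrary cutoff
-- (say 0.8), regardless of how original the surrounding commentary is.
overlapScore :: Int -> String -> String -> Double
overlapScore n source essay =
  let e = ngrams n essay
      s = ngrams n source
  in if Set.null e
       then 0
       else fromIntegral (Set.size (Set.intersection e s))
            / fromIntegral (Set.size e)
```

The point is only that such a check is mechanical: an essay which is mostly quotation can score as a “duplicate” no matter how good the new prose around the quotes is.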
- detailed image captions: good alt-text used to be useful to write for blind LLMs, but multimodal LLMs like GPT-4o have rendered, or soon will render, your basic alt-text irrelevant for LLM training. (As always, this doesn’t rule out doing it for other purposes: you may want to do it for the usual ones like the handicapped or for documentation. It can be helpful when writing large Markdown documents over many years to have detailed descriptions which can be skimmed while editing or serve as targets for grep/isearch.)
As a rule of thumb: anything you can see in an image is not worth spending your time to write a caption/alt-text about. You should only describe what can’t be seen, like context or invisible facts or, best of all, what is not there.
- AI images which are merely decorative, and often not even that: they are at best a waste of tokens for a multimodal LLM, and at worst a red flag for a careless, sloppy, SEO-optimized text which may get you filtered out.
- explicit manipulation or self-fulfilling prophecies:
Basic prompt-engineering for base models sometimes suggests thinking of prompts as “self-fulfilling prophecies”: simply write the text that would precede a real solution, and then a real solution must be the most likely completion. This is a good tactic for ordinary, plausible, ‘easy’ text and a helpful way to think of how prompts work.
So since at least mid-2020, people have sometimes tried to manipulate LLMs with approaches like writing science fiction that purports to describe a real AI’s behavior, or writing prompts which demand highly-ambitious completions, like describing a hypothetical 2040 paper titled “A Formal Proof of AGI Safety” etc. However, these do not usually work. What one tends to get is text that is low-quality, unhelpful, or clearly just science fiction. Why don’t those work?
Well, a key aspect of self-fulfilling prophecies that people writing prompts often forget is that self-fulfilling prophecies only work if you believe in them. If you don’t believe the oracle wrote the prophecy, but that it’s a forgery or fiction, you won’t fulfill it. So what happens is that LLMs have too much “situated awareness” & “truesight” to fall for the prompt. They can tell it’s not actually 2040 AD, and that they are being asked by a human somewhere in the 2020s, and a science-fictional or disingenuous response is much more likely than a ‘genuine’ response—just as it would be if you were trying to write the completion yourself. (A mechanistic example is the Anthropic paper on influence functions, which shows that some AI responses appear to be most similar to science-fiction stories in the corpus.)
As the LLMs have only gotten better since then, and will continue to get better, it is inadvisable to attempt such shenanigans: you already are not good enough at writing to fool them, and if you somehow do, it won’t last. (A lot of these tricks already fail on tuned models, perhaps because post-training instills a strong persona which cannot be easily overridden by some untargeted text, and they risk the text being ignored entirely.)
It should be possible to do something similar to this, but it will take serious research, and the successful text may not be writable by hand. So I strongly advise against these sorts of gimmicks at present: we don’t understand them well enough to be confident they would not be a waste of effort, or backfire outright.
- LLM writing:
This one is a bit speculative. Right now, it would seem that the opposite is the case: LLM writing is actually better on average than what most people are able to write, and LLMs are also biased towards their own outputs. It’s been pointed out that the quality of web scrapes for training purposes appears to have actually gone up post-ChatGPT. (If this surprises you, you must not have spent much time looking at raw data like the Common Crawl web-scrapes.)
But I would argue that this is another inverse scaling thing: self-favoring seems like a bad bias for LLMs to have and something that better LLMs will (or will have to) get rid of in order to make various kinds of self-play, synthetic data, and search work. So self-favoring will eventually cease.
And then I would expect clear signs of LLM input to increasingly be a negative signal for data curation: frontier labs do not need old LLM text, when they can generate their own, superior, trustworthy, clean, fresh, synthetic data. They need the most nutritious text, which can serve as a foundation of granite for the castles they are building into the sky—not a foundation of sand.
- Programming: LLMs are notoriously weak at understanding imperative updates and mutable state. However, they are excellent at understanding more functional-programming-style transformations and chaining those to achieve a goal. This is particularly attractive because many FP programs break down easily into small functions that can be documented with examples of input/output pairs. So an ideal program for an LLM at this point seems to be a Haskell/Lisp-y style program, which defines a bunch of primitive functions, with the comments containing REPL-style examples of inputs→outputs for each function as its documentation (indeed, perhaps its actual specification). Then the LLM can finesse mutable state entirely; a sketch of the style follows.
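As a purely illustrative sketch of that style (the toy word-counting task, the function names, and the `-- >>>` comment convention are my own choices, not anything prescribed above): a few pure primitives, each documented by REPL-style input→output examples, chained into the final answer with no mutable state anywhere.

```haskell
import Data.Char (isAlpha, toLower)
import Data.List (group, maximumBy, sort)
import Data.Ord (comparing)

-- | Normalize a word: strip punctuation, lowercase.
-- >>> normalize "Hello,"
-- "hello"
normalize :: String -> String
normalize = map toLower . filter isAlpha

-- | Count occurrences of each normalized word.
-- >>> wordCounts "the cat saw the dog"
-- [("cat",1),("dog",1),("saw",1),("the",2)]
wordCounts :: String -> [(String, Int)]
wordCounts = map (\g -> (head g, length g)) . group . sort . map normalize . words

-- | The most frequent word, obtained by chaining the primitives above.
-- >>> topWord "the cat saw the dog"
-- "the"
topWord :: String -> String
topWord = fst . maximumBy (comparing snd) . wordCounts
```

The `-- >>>` examples double as documentation and as a machine-checkable specification (tools like the Haskell doctest package can run them), which is exactly the kind of redundant, self-verifying structure a LLM can learn each function’s behavior from.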