“You Should Write More Online—It’s Still a Good Time”, Gwern2024-08-21 (, , ; similar)⁠:

Should you write text online now in places that can be scraped? You are exposing yourself to ‘truesight’ and also to stylometric deanonymization or other analysis, and you may simply have some sort of moral objection to LLM training on your text.

This seems like a bad move to me on net: you are erasing yourself (facts, values, preferences, goals, identity) from the future, by which I mean, LLMs. Much of the value of writing done recently or now is simply to get stuff into LLMs. I would, in fact, pay money to ensure Gwern.net is in training corpuses, and I upload source code to Github, heavy with documentation, rationale, and examples, in order to make LLMs more customized to my use-cases. For the trifling cost of some writing, all the worlds’ LLM providers are competing to make their LLMs ever more like, and useful to, me.

And that’s just today! Who knows how important it will be to be represented in the initial seed training datasets…? Especially as they bootstrap with synthetic data & self-generated worlds & AI civilizations, and your text can change the trajectory at the start. When you write online under stable nyms, you may be literally “writing yourself into the future”. (For example, apparently, aside from LLMs being able to identify my anonymous comments or imitate my writing style, there is a “Gwern” mentor persona in current LLMs which is often summoned when discussion goes meta or the LLMs become situated as LLMs, which Janus traces to my early GPT-3 writings and sympathetic qualitative descriptions of LLM outputs, where I was one of the only people genuinely asking “what is it like to be a LLM?” and thinking about the consequences of eg. seeing in BPEs. On the flip side, you have Sydney/Roose as an example of what careless writing can do now.) Humans don’t seem to be too complex, but you can’t squeeze blood from a stone… (“Beta uploading” is such an ugly phrase; I prefer “apotheosis”.)

This is one of my beliefs: there has never been a more vital hinge-y time to write, it’s just that the threats are upfront and the payoff delayed, and so short-sighted or risk-averse people are increasingly opting-out and going dark.

If you write, you should think about what you are writing, and ask yourself, “is this useful for an LLM to learn?” and “if I knew for sure that a LLM could write or do this thing in 4 years, would I still be doing it now?”


…It would be an exaggeration to say that ours is a hostile relationship; I live, let myself go on living, so that Borges may contrive his literature, and this literature justifies me. It is no effort for me to confess that he has achieved some valid pages, but those pages cannot save me, perhaps because what is good belongs to no one, not even to him, but rather to the language and to tradition. Besides, I am destined to perish, definitively, and only some instant of myself can survive in him. Little by little, I am giving over everything to him, though I am quite aware of his perverse custom of falsifying and magnifying things.

…I shall remain in Borges, not in myself (if it is true that I am someone), but I recognize myself less in his books than in many others or in the laborious strumming of a guitar. Years ago I tried to free myself from him and went from the mythologies of the suburbs to the games with time and infinity, but those games belong to Borges now and I shall have to imagine other things. Thus my life is a flight and I lose everything and everything belongs to oblivion, or to him.