
The State of Chinese AI


[continued from diminishing returns discussion] [Scott Sumner:] …In case you read this reply, I was very interested in your tweet about the low price of some advanced computer chips in wholesale Chinese markets. Is your sense that this mostly reflects low demand, or the widespread evasion of sanctions?

My sense is that it’s a mix of factors, but mostly the former: an issue on the demand side.

So for the sake of argument, let me sketch out an extreme bear case on Chinese AI, as a counterpoint to the more common “they’re just 6 months behind and will leapfrog Western AI at any moment thanks to the failure of the chip embargo and Western decadence” alarmism.

It is entirely possible that the sanctions hurt, but counterfactually their removal would not change the big picture here. There is plenty of sanctions evasion—Nvidia has sabotaged the embargo as much as it could, and H100 GPUs can be exported to or bought in many places—but the chip embargo mostly works by making it hard to create the big, tightly-integrated, high-quality GPU datacenters owned by a single player who will devote them to a 3-month+ run to create a cutting-edge model at the frontier of capabilities. You don’t build that datacenter with smurfs smuggling a few H100s in their luggage. There are probably hundreds of thousands of H100s in mainland China now, in total, scattered penny-packet, a dozen here, a thousand there, 128 over there—but as long as they are not all in one place, fully integrated and debugged and able to train a single model flawlessly, then for our purposes in thinking about AI risk and the frontier, they are not that important. Meanwhile in the USA, if Elon Musk wants to create a datacenter with 100k+ GPUs to train a GPT-5-killer, he can do so within a year or so, and it’s fine. He doesn’t have to worry about GPU supply—Huang is happy to give him the GPUs, for divide-and-conquer, commoditize-your-complement reasons.

With the compute supply shattered and usable only for small models or inference, it’s just a pure commodity race-to-the-bottom play with commoditized open-source models and near-zero profits. The R&D is shortsightedly focused on hyper-optimizing existing model checkpoints, borrowing or cheating off others’ model capabilities rather than figuring out how to do things the right, scalable way, and not on competing with GPT-5, and definitely not on finding the next big thing which could leapfrog Western AI. No exciting new models or breakthroughs, mostly just chasing Western taillights, because that’s derisked and requires no leaps of faith. (Now they’re trying to clone GPT-4 coding skills! Now they’re trying to clone Sora! Now they’re trying to clone Midjourney v6!) The open-source models like DeepSeek or Llama are good for some things… but only some things. They are very cheap at those things, granted, but there’s nothing there to really stir the animal spirits. So demand is highly constrained. Even if those models were free, it would be hard to find many transformative, economy-wide uses for them right away.

And would you be allowed to transform or bestir the animal spirits? The animal spirits in China need a lot of stirring these days. Who wants to splurge on AI subscriptions? Who wants to splurge on AI R&D? Who wants to splurge on big datacenters groaning with smuggled GPUs? Who wants to pay high salaries for anything? Who wants to start a startup where if it fails you will be held personally liable and forced to pay back investors with your life savings or apartment? Who wants to be Jack Ma? Who wants to preserve old Internet content which becomes ever more politically risky as the party line inevitably changes? Generative models are not “high-quality development”, really, nor do they line up nicely with CCP priorities like Taiwan reunification. Who wants to go overseas and try to learn there, and become suspect? Who wants to say that ‘maybe Chairman Xi blew it on AI’? And so on.

Put it all together, and you get an AI ecosystem which has lots of native potential, but which isn’t being realized for deep, hard-to-fix structural reasons, and which will keep consistently underperforming and ‘somehow’ always being “just 6 months behind” Western AI, and which will mostly keep doing so even if obvious barriers like sanctions are dropped. They will catch up to any given achievement, but by that point the leading edge will have moved on, and the obstacles may get more daunting with each scaleup. It is not hard to catch up to a new model which was trained on 128 GPUs with a modest effort by one or two enthusiastic research groups at a company like Baidu or at Tsinghua. It may be a lot harder to catch up with the leading-edge model in 4 years, trained however models are being trained then, like some wild self-play bootstrap on a million new GPUs consuming multiple nuclear power plants’ outputs. Where is the will at Baidu or Alibaba or Tencent for that? I don’t see it… Everything I’ve said here is public information you can find in Sixth Tone or the New York Times or the Financial Times, etc. So keep these points in mind as you watch events unfold. 6 months from now, are you reading research papers written in Mandarin or in English, and where did the latest and greatest research result everyone is rushing to imitate come from? 12 months from now, is the best GPU/AI datacenter in the world in mainland China, or somewhere else (like in America)? 18 months from now, are you using a Chinese LLM for the most difficult and demanding tasks because it’s substantially, undeniably better than any tired Western LLM? As time passes, just ask yourself, “Do I live in the world according to Gwern’s narrative, or do I instead live in the ‘accelerate or die’ world of an Alexandr Wang or Beff Jezos type? What did I think back in November 2024, and would what I see, and don’t see, surprise me now?” If you go back and read articles in Wired or discussions on Reddit in 2019 about scaling and the Chinese threat, which arguments predicted 2024 better?


I don’t necessarily believe all of this too strongly, because China is far away and I don’t know any Mandarin. But until I see the China hawks make better arguments and explain things like why it’s 2024 and we’re still arguing about this with the same imminent-China narratives from 2019 or earlier, and where all the indigenous Chinese AI breakthroughs are which should impress the hell out of me and make me wish I knew Mandarin so I could read the research papers, I’ll keep staking out this position and reminding people that it is far from obvious that there is a real AI arms race with China right now, or that Chinese AI is in rude health.

[Also worth noting: where is EU, Japanese, or South Korean DL?]