19 Comments
Bruce Raben's avatar

Excellent and accessible to regular people

Jordan Schneider's avatar

serve the people!

Nathan Lambert's avatar

Love the new plots, team!

Greg Costigan's avatar

Well written

Leon Liao's avatar

This is a very useful piece, and I agree with the core point that not all compute is created equal. Mature-node chips, traditional MCUs, or fragmented low-end clusters obviously cannot be treated as equivalent to frontier AI compute.

But I think the article may understate what Huawei is actually trying to build. The key question is whether China can use UnifiedBus, all-optical interconnects, SuperPoDs, and SuperClusters to turn chips available under domestic process constraints into trainable, inference-capable, and operationally manageable system-level effective compute.

CloudMatrix 384 is the clearest example. It uses 384 Ascend 910C processors rather than 72 B200 GPUs, but at the system level it reportedly delivers about 300 PFLOPS of dense BF16 compute, 49.2 TB of HBM capacity, 1,229 TB/s of aggregate HBM bandwidth, and a 384-processor scale-up domain. The trade-off is obvious: Huawei uses many more chips and much more power. Nvidia still leads decisively on single-chip performance and energy efficiency. But the point is that Huawei is not simply “stacking weak chips” through a traditional network. It is trying to redesign the interconnect domain itself.

Atlas 950 SuperPoD makes this clearer: 8,192 Ascend 950DT chips, 1,152 TB of memory, 8 EFLOPS FP8, 16 EFLOPS FP4, and 16 PB/s of interconnect bandwidth, all built around a much larger system-level architecture.
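As a rough sanity check on the figures quoted above, the system totals can be divided back down to per-chip shares. This is only a back-of-the-envelope sketch: the totals are the reported numbers from this comment, decimal units are assumed (1 TB = 1000 GB), and the `SystemSpec` and `per_chip` names are made up for illustration. The bandwidth column also mixes two different quantities (aggregate HBM bandwidth for CloudMatrix 384, interconnect bandwidth for Atlas 950), so it should not be compared across the two rows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemSpec:
    """System-level totals as quoted in the comment (reported, not verified here)."""
    name: str
    chips: int
    compute_pflops: float   # dense BF16 for CloudMatrix 384, FP8 for Atlas 950
    memory_tb: float        # total memory capacity, decimal TB assumed
    bandwidth_tbps: float   # HBM bandwidth (CloudMatrix) or interconnect (Atlas)


def per_chip(spec: SystemSpec) -> dict:
    """Divide system totals evenly across chips (a simple average, not a measurement)."""
    return {
        "compute_tflops": spec.compute_pflops * 1000 / spec.chips,
        "memory_gb": spec.memory_tb * 1000 / spec.chips,
        "bandwidth_tbps": spec.bandwidth_tbps / spec.chips,
    }


cloudmatrix = SystemSpec("CloudMatrix 384", 384, 300.0, 49.2, 1229.0)
# 8 EFLOPS FP8 = 8000 PFLOPS; 16 PB/s = 16000 TB/s
atlas = SystemSpec("Atlas 950 SuperPoD", 8192, 8000.0, 1152.0, 16000.0)

for spec in (cloudmatrix, atlas):
    share = per_chip(spec)
    print(
        f"{spec.name}: {share['compute_tflops']:.0f} TFLOPS, "
        f"{share['memory_gb']:.0f} GB, {share['bandwidth_tbps']:.2f} TB/s per chip"
    )
```

On these numbers each 910C contributes roughly 780 TFLOPS of dense BF16 and 128 GB of HBM, which is consistent with the point that the system-level totals come from chip count and interconnect scale rather than from single-chip performance.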

So I would frame China’s path differently: not “weaker chips + more quantity,” but “weaker single chips + larger interconnect domains + larger memory pools + stronger system-level organization + lower domestic supply-chain risk.”

The competition is shifting from chip vs chip to system vs system.

Edward Farrelly's avatar

Is this 'let's be condescending to Jensen' pure clickbait engagement strategy or something else? I'm just wondering if you have his line of sight on the ecosystem and supply chain, given he interacts with the CEOs of all the major players on a daily basis. From my POV your post missed the entire point about why China will be able to build, and is building, AI LLMs that will be operationally as useful and powerful as what most of us use here in the West, without needing our stack. That was his point. Most commentators from the Anthropic ecosphere seem to miss it.

Jordan Schneider's avatar

until dario pays low nine figures for chinatalk we're still swimming outside of the ecosphere sadly

Edward Farrelly's avatar

Why low nine figures? Aren't you undervaluing yourselves? Seriously though, the Anthropic brigade has got a bit tiresome. Not for the China hostility alone (I'm all for defending democracy, but I'd rather have practical, workable plans than blowing smoke). It's like shouting about how dangerous AI is while you are leading the charge to actually make it more dangerous, and then making private deals to offer your dangerous product to anyone with a spare $100m.

Yuzu Xu's avatar

From the Chinese side of this debate: DeepSeek's MoE architecture is partly an answer to exactly the constraints you describe. In Chinese developer forums and Bilibili tech channels, the recurring insight is that their activated-parameter-per-token design reduces memory bandwidth demands relative to dense models of equivalent capability - critical given that Huawei Ascend 950PR has meaningfully lower memory bandwidth than H100. Chinese labs deploying on Ascend aren't just substituting chips; they're co-designing architectures around the bandwidth ceiling. Which makes Jensen's question interesting from a different angle: the answer from Chinese practitioners isn't 'just add more chips' - it's 'rebuild the model to fit the constraint.' Whether that's a lasting architectural shift or a temporary workaround until Ascend bandwidth catches up is the open question.
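The bandwidth argument can be made concrete with a simple roofline-style estimate. This is a minimal sketch under stated assumptions: single-stream decoding, every activated weight read from memory once per token, and KV-cache and activation traffic ignored. The parameter counts and the bandwidth figure below are illustrative placeholders, not specifications of any DeepSeek model or Ascend chip, and `min_decode_latency_ms` is a hypothetical helper name.

```python
def min_decode_latency_ms(active_params: float,
                          bytes_per_param: float,
                          bandwidth_bytes_per_s: float) -> float:
    """Lower bound on per-token decode time when weight reads dominate.

    Assumes batch size 1 and that each activated parameter is read from
    memory once per generated token; ignores KV cache and compute time.
    """
    bytes_per_token = active_params * bytes_per_param
    return bytes_per_token / bandwidth_bytes_per_s * 1000.0


# Illustrative placeholder values only (not real model or chip specs).
BANDWIDTH = 1e12          # 1 TB/s of memory bandwidth
BYTES_PER_PARAM = 1.0     # e.g. 8-bit weights

dense_ms = min_decode_latency_ms(100e9, BYTES_PER_PARAM, BANDWIDTH)  # dense: all 100B params active
moe_ms = min_decode_latency_ms(10e9, BYTES_PER_PARAM, BANDWIDTH)     # MoE: 10B params active per token

print(f"dense: {dense_ms:.1f} ms/token, MoE: {moe_ms:.1f} ms/token, "
      f"ratio: {dense_ms / moe_ms:.1f}x")
```

Under these assumptions the bound scales linearly with activated parameters, so cutting the activated fraction offsets a lower bandwidth ceiling one-for-one. Total parameters still have to fit in memory, which is where larger pooled memory capacity would come in.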

Yuzu Xu's avatar

The V4 sovereign stack story adds another data point: DeepSeek released Day-0 Huawei Ascend support as the primary deployment path, not an afterthought. Alibaba, ByteDance, and Tencent responded with bulk Ascend orders within weeks. If Jensen is right that not all compute is equal, Chinese hyperscalers are betting the gap closes faster than CUDA ecosystem lock-in compounds.

Erich Grunewald's avatar

I feel a bit silly replying to an AI-written comment, but those purchases happened before DeepSeek-V4 was released, not after, no?

Yuzu Xu's avatar

Fair on the timing — that was sloppy phrasing. I was thinking of Reuters' April 29 report on new Ascend 950PR allocation orders specifically placed after V4 launched April 24 — different from the broader strategic investments that predated V4. Compressed two events into one sentence.

The pattern I track from Chinese tech press this week: hyperscalers treating Ascend convergence as settled, not a hedge. Whether that's the right bet is still an open question.

G88's avatar

I think the article owes an explanation about why this matters at all.

Why does it matter how much compute, how much LLM-running capacity is available? What exactly is the economic, social or military benefit coming from this?

James's avatar

Maybe a sushi dinner for Elon Musk and onramping to Terafab will solve his silicon at scale problem?

Austin P.'s avatar

Oh damn, the 110% threshold seems interesting, albeit a non-starter from the PRC government's end.

So basically Chinese firms would be welcome to throw capital into playing catch-up… and then continuously see their profit evaporate in the face of U.S. firms who flood the market with comparable chips that’ve already been available in the West for years?

Chris Walker's avatar

Are you really sure that you understand this technology better than Jensen does? Or are you suggesting that he has some interest in exaggerating China's prospects in AI?

Iustin Pop's avatar

Not all compute is equal, but I'm not sure China won't catch up on chips. They're fighting hard to boot up their own ASML, no? I see no specific reason why, with enough effort (and if totalitarian regimes are good at something, it's focusing effort), they can't get much closer to the current state of the art.

Aqib Zakaria's avatar

I agree they are doing their best to boot up their own ASML and can do a good job focusing that effort. The problem is that building an EUV lithography machine is really, really hard. ASML is only able to make it because they can use expertise and supply chains globally, like cutting-edge optical lenses and chemistry. China can't do that because of export controls, which isn't to say that they won't ever be able to make their own ASML, but that it is a project that will take multiple years. (Most estimates say they're ~10 years out.)