
Global GPU Computing News【20260524】

2026-05-24

1. Huawei Ascend 910C Fully Rolls Out, Accelerating Domestic Chip Substitution

At the Kunpeng Ascend Developer Conference this week, Huawei announced a full-stack AI data center (AI DC) infrastructure solution to support mass production and delivery of the Ascend 910C and 950PR chips. NVIDIA CEO Jensen Huang recently acknowledged publicly that Huawei has become an effective substitute for NVIDIA in the Chinese market. Huawei is negotiating potential orders with giants such as ByteDance, Baidu, and China Mobile that could total $2 billion, and it is targeting shipments of 100,000 Ascend 910C units in 2026.

2. First 10,000-Card All-Domestic Exascale AI Computing Cluster Activated, Reaching 14,000 PFlops

China's first 10,000-card fully self-controllable exascale AI computing cluster has been officially activated, equipped with Huawei's advanced chips. The new cluster provides 11,000 PFlops of computing power; together with 3,000 PFlops deployed previously, the total reaches 14,000 PFlops. This marks a key step toward self-reliance in domestic computing infrastructure.

3. NVIDIA H100 Rental Prices Surge, Global Computing "Inflation" Persists

Due to sustained high demand for AI GPUs, Nebius announced a price increase for the entire GPU series (H100, H200, B200) effective June 1. The H100 price will rise from $2.95 to $3.85 per hour (approx. 31% increase), and the B200 from $5.50 to $7.15. Alongside the price hike, leading North American cloud service providers are aggressively procuring NVIDIA GB and Rubin rack systems, which is expected to drive an explosive 122% increase in AI inference computing demand in 2026.
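The quoted percentage can be checked with a short calculation. The sketch below is illustrative only: the helper name `percent_increase` is hypothetical, the prices are the hourly figures quoted in this item, and the currency is assumed to be U.S. dollars.

```python
def percent_increase(old: float, new: float) -> float:
    """Return the percentage change from old to new."""
    return (new - old) / old * 100.0


# Hourly prices (before, after) as quoted in the item above.
# Assumption: figures are in U.S. dollars per GPU-hour.
prices = {"H100": (2.95, 3.85), "B200": (5.50, 7.15)}

# Percentage increase per GPU model, rounded to one decimal place.
increases = {
    gpu: round(percent_increase(old, new), 1)
    for gpu, (old, new) in prices.items()
}

for gpu, pct in increases.items():
    print(f"{gpu}: {pct}% increase")
```

Under these figures the H100 increase works out to about 30.5%, which the report rounds to 31%, and the B200 increase to 30%.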

4. NVIDIA Vera Rubin Platform Enters Mass Production, Trillion-Level Computing Orders in Sight

At the GTC 2026 conference, NVIDIA officially announced that the next-generation Vera Rubin platform has entered full mass production. The platform integrates seven new chips and spans five rack systems, and its AI-focused supercomputing performance puts trillion-level computing orders within reach. Notably, the B200's single-GPU performance is 2.2 times that of the H200 in MLPerf tests, while the HBM4-equipped Vera Rubin offers 5 times the inference computing power of Blackwell.

5. H200 Exports to China in Deadlock: U.S. Eases Restrictions, Yet Chinese Companies Place Zero Orders

Although the U.S. has eased restrictions and allowed exports of NVIDIA's high-end H200 AI chips to China, Chinese companies holding purchase permits have collectively placed zero orders. The reason is the stringent conditions attached: 25% of revenue must be paid to the U.S. government, export volume cannot exceed 50% of U.S. domestic sales, and security testing by a U.S. third-party lab is required. Jensen Huang admitted that NVIDIA's share of China's AI accelerator market is currently zero.

6. China's NDRC Pushes Domestic Computing Adaptation, Green Power Becomes a Hard Threshold for New Computing Centers

On May 22, China's National Development and Reform Commission (NDRC) issued a clear directive that large domestic AI models should step up adaptation to domestic computing chips to ensure self-reliance. Meanwhile, many regions have begun raising admission standards for new computing projects: large-scale computing centers must be backed by green power supply and energy storage, and non-compliant projects will not be registered or connected to the power grid.

7. ByteDance, Alibaba, Tencent Launch Hundreds of Billions in Computing Procurement Spree

ByteDance has raised its 2026 AI capital expenditure to approximately RMB 200 billion (25% above its original plan), of which about RMB 85 billion is for chip procurement, and it has already pre-ordered over $5 billion worth of domestic computing products. Alibaba's investment in cloud and AI infrastructure over the next five years will exceed the previously guided RMB 380 billion, with its Pingtouge computing cards accelerating shipments. According to a TrendForce report, nine major cloud service providers in North America and China have collectively raised their full-year capital expenditure guidance, signaling that the computing industry chain has formally entered a "full-chain inflation" cycle.
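The ByteDance figures imply two derived numbers: the size of the original plan and the share of spending going to chips. A minimal sketch of that arithmetic follows, using the approximate values reported above; the variable names are illustrative, and the results are only as precise as the rounded inputs.

```python
# Approximate figures from the item above, in RMB billions.
capex_2026 = 200.0   # raised 2026 AI capital expenditure
uplift = 0.25        # reported increase over the original plan
chip_spend = 85.0    # portion earmarked for chip procurement

# If the raised budget is 25% above the original plan,
# the original plan is the raised budget divided by 1.25.
implied_original_plan = capex_2026 / (1 + uplift)

# Share of the raised budget going to chip procurement, in percent.
chip_share_pct = chip_spend * 100 / capex_2026

print(f"Implied original plan: about RMB {implied_original_plan:.0f} billion")
print(f"Chip procurement share: about {chip_share_pct:.1f}%")
```

On these inputs the original plan would have been roughly RMB 160 billion, with chips accounting for about 42.5% of the raised budget.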

8. AMD Invests $10 Billion in Taiwan, China's AI Ecosystem; MI450 Series Uses 2nm Process

On May 21, AMD announced an investment of over $10 billion in the AI ecosystem of Taiwan, China, expanding supply chain partnerships and enhancing advanced packaging manufacturing capacity for next-generation AI infrastructure. On the GPU front, the Instinct MI450 series, expected to enter mass production in the second half of 2026, will adopt TSMC's 2nm advanced process, equipped with up to 432 GB of HBM4 memory and approximately 19.6 TB/s of bandwidth.

9. Domestic Computing Industry Chain Expands Intensively: Token Factories, Server Manufacturing, AI Computing Centers

Based on public information, from April to mid-May there were at least 67 domestic computing power tenders and bid awards at the billion-yuan level, covering key areas such as domestic server procurement, GPU cluster construction, computing leasing, token generation services, and AI computing center EPC. Notable developments include: Rongxin Zhiyuan signing a strategic cooperation agreement with Yingtech to jointly advance the R&D and manufacturing of domestic computing servers; Maifushi signing an agreement with Muxi shares to collaborate on overseas AI computing center construction and a token economy ecosystem; and some computing firms planning an initial deployment of four super-node servers, connecting clusters of over 1,500 GPUs to build a "Token Factory".

10. Jensen Huang Explores New FPGA Computing Power, Inference Market Becomes the Next Battlefield

NVIDIA is stepping out of its CUDA comfort zone and focusing on developing heterogeneous computing solutions that combine FPGA and GPU. As the Vera Rubin platform ships in volume in Q3, this solution will rapidly penetrate niche scenarios such as edge computing, industrial agents, and space computing. Meanwhile, Jensen Huang stated that SRAM-based LPX AI chips will remain a niche market for the long term, focused on low-latency and token-rate designs for inference, but less capable than GPUs for agentic tasks. Overall, the AI computing competition is shifting from training acceleration to inference scenarios, and Intel, with its 71% share of the server CPU market, is also expected to become a major winner in the AI inference era.
