In this post:
Jack Ma-backed Ant Group Co. announced that it was developing cost-effective AI training models using Chinese-made semiconductors.
The AI models are based on the Mixture of Experts approach and allegedly reduce costs by 20%, bringing stiff competition to U.S. companies like Nvidia.
This month, the company published a research paper claiming its models at times outperformed Meta Platforms Inc. in certain yet-to-be-verified benchmarks.
Ant Group revealed that it has developed new techniques for training artificial intelligence models using Chinese-made semiconductors from Alibaba and Huawei. The training models rely on the Mixture of Experts (MoE) machine learning approach to achieve results similar to those from Nvidia’s H800 chips while cutting costs by at least 20%.
Ant Group said it still uses Nvidia hardware for some AI development but now relies mostly on alternatives, including chips from Advanced Micro Devices Inc. and Chinese suppliers, for its latest models. The firm disclosed that training 1 trillion tokens cost about 6.35 million yuan (roughly $880,000) using high-performance hardware, while its optimized approach cuts that cost to about 5.1 million yuan using lower-specification hardware.
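The roughly 20% figure follows directly from those numbers. A quick back-of-the-envelope check (the yuan amounts are the ones Ant disclosed; the percentage is derived here):

```python
# Rough check of Ant Group's reported training-cost savings per 1 trillion tokens.
baseline_cost_yuan = 6_350_000   # high-performance hardware, per Ant's disclosure
optimized_cost_yuan = 5_100_000  # lower-specification hardware, per Ant's disclosure

savings = baseline_cost_yuan - optimized_cost_yuan
savings_pct = savings / baseline_cost_yuan * 100
print(f"Savings: {savings:,} yuan (~{savings_pct:.1f}%)")  # ~19.7%, consistent with "about 20%"
```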
Robert Lea, a senior analyst at Bloomberg Intelligence, said Ant Group’s claim, if confirmed, highlighted that China was well on its way to becoming self-sufficient in AI as the country turned to lower-cost, computationally efficient models to work around export controls on Nvidia chips. Nvidia CEO Jensen Huang has argued that demand for computation will keep growing even with the rise of more efficient models such as DeepSeek’s R1, positing that companies will need better chips to generate more revenue, not cheaper ones to cut costs.
Ant Group leverages China-made chips for its latest AI innovation
Ant Group Co. used Chinese-made chips from Alibaba and Huawei to develop MoE-based techniques for training AI models that would cut costs by 20%, Minmin Low reported on Bloomberg TV’s ‘The China Show’. Low explained that the MoE approach breaks tasks down into smaller sets of data to make training more efficient, ‘similar to employing a team of experts, each focusing on a specific part of the problem to enhance the overall efficiency.’
Minmin Low on The China Show. Source: Bloomberg.
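Ant has not published the internals of its approach beyond the research paper, but the MoE idea Low describes can be sketched in a few lines: a small gating network routes each token to only a couple of specialist sub-networks, so most of the model’s weights sit idle on any given input. The sizes and routing rule below are illustrative assumptions, not Ant’s actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
D_MODEL, N_EXPERTS, TOP_K = 64, 8, 2   # illustrative sizes, not Ant's

# Each "expert" is a small feed-forward block; a gate scores them per token.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.02 for _ in range(N_EXPERTS)]
gate_w = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.02

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ gate_w                              # (tokens, experts) gating scores
    top_k = np.argsort(logits, axis=-1)[:, -TOP_K:]  # indices of the chosen experts
    out = np.zeros_like(x)
    for t, token in enumerate(x):
        chosen = top_k[t]
        weights = np.exp(logits[t, chosen])
        weights /= weights.sum()                     # softmax over the chosen experts only
        for w, e in zip(weights, chosen):
            out[t] += w * (token @ experts[e])       # only k of n experts do any work
    return out

tokens = rng.standard_normal((4, D_MODEL))
print(moe_layer(tokens).shape)   # (4, 64): same output shape, ~k/n of the expert compute
```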
According to Bloomberg, the AI training models marked Ant’s entry into a race between Chinese and U.S. companies that has accelerated since DeepSeek demonstrated how capable models could be trained for far less than the billions invested by OpenAI and Alphabet Inc.’s Google. Ant Group’s latest AI innovation underscores how Chinese companies are trying to use local alternatives to advanced Nvidia chips such as the H800, which the U.S. currently bars from export to China.
“If you find one point of attack to beat the world’s best kung fu master, you can still say you beat them, which is why real-world application is important.” – Robin Yu, chief technology officer at Shengshang Tech Co.
Ant Group published a research paper this month claiming that its models at times outperformed those of Meta Platforms Inc. in certain unverified benchmarks. If the models work as advertised, Ant’s platforms could mark another step forward for Chinese AI development.
MoE AI training gains recognition for its use by Google and DeepSeek
Bloomberg reported that MoE AI training models are a popular option that gained recognition through their use by Google and the Hangzhou startup DeepSeek. Ant plans to apply the large language models it has developed with the approach, Ling-Plus and Ling-Lite, to industrial AI solutions, including healthcare and finance.
Ant said in its research paper that the Ling-Lite model did better in a key benchmark compared with one of Meta’s Llama models. Both Ling-Lite and Ling-Plus models outperformed DeepSeek’s equivalents on Chinese-language benchmarks. Ling-Lite contains 16.8 billion parameters, which are adjustable settings that work like knobs and dials to direct the model’s performance. Ling-Plus has 290 billion parameters, which is considered relatively large in the realm of language models. For comparison, the MIT Technology Review estimated that ChatGPT’s GPT-4.5 had 1.8 trillion parameters, while DeepSeek-R1 had 671 billion.
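For a sense of how figures like 16.8 billion or 290 billion are arrived at, a parameter count simply sums the elements of every weight matrix and bias vector in the network. The layer sizes below are hypothetical, chosen only to show the arithmetic, and do not correspond to the Ling models:

```python
def dense_layer_params(d_in: int, d_out: int, bias: bool = True) -> int:
    """Weights plus optional bias for a single fully connected layer."""
    return d_in * d_out + (d_out if bias else 0)

d_model, d_ff, n_layers = 4096, 16384, 32          # hypothetical transformer sizes
per_layer = (dense_layer_params(d_model, d_ff)     # up-projection of one feed-forward block
             + dense_layer_params(d_ff, d_model))  # down-projection back to d_model
total = n_layers * per_layer
print(f"~{total / 1e9:.2f}B parameters in the feed-forward blocks alone")  # ~4.30B
```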
Ant also disclosed that it faced challenges in some areas of the training, including stability: even small changes in the hardware or the model’s structure led to problems, including jumps in the models’ error rate.
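Ant’s paper does not spell out how it handled those jumps, but in practice such instability typically shows up as sudden spikes in the training loss, which teams often flag automatically. A minimal illustrative check, not Ant’s method:

```python
# Illustrative only: flag sudden jumps in a training-loss series, the kind of
# instability Ant says small hardware or architecture changes could trigger.
def find_loss_spikes(losses: list[float], window: int = 20, factor: float = 1.5) -> list[int]:
    spikes = []
    for i in range(window, len(losses)):
        recent_avg = sum(losses[i - window:i]) / window
        if losses[i] > factor * recent_avg:
            spikes.append(i)   # step where loss jumped well above its recent average
    return spikes

losses = [2.0 - 0.001 * i for i in range(200)]
losses[150] = 5.0                 # simulated spike
print(find_loss_spikes(losses))   # [150]
```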