Bugs, performance issues hinder Huawei’s AI chips

China’s efforts to match US computing power in artificial intelligence are being hampered by bug-ridden software, with customers of leading AI chipmaker Huawei complaining about performance issues and the difficulty of switching from Nvidia products.

The Chinese technology giant has emerged as the frontrunner in the race to develop a domestic alternative to industry leader Nvidia, after Washington further tightened export controls on high-performance silicon last October.

Its Ascend series has become an increasingly popular option for Chinese AI groups to run inference, a process that applications such as OpenAI’s ChatGPT use to generate responses to queries.

But multiple industry insiders, including an AI engineer at a partner company, said the chips still lagged far behind Nvidia’s for the initial training of models. They blamed stability issues, slower inter-chip connectivity, and inferior software developed by Huawei called Cann.

Nvidia’s software platform, Cuda, is renowned as the company’s “secret sauce” for being easy for developers to use and capable of vastly accelerating data processing. Huawei is one of many companies trying to break Nvidia’s stranglehold on AI chips by creating alternative software.

Huawei’s own employees are among those complaining about Cann. One researcher, who declined to be named, said it made the Ascend product “difficult and unstable to use” and that work on testing it was being hampered.

“When random errors occur, it is very difficult to find out where it comes from due to poor documentation. You need talented developers to read the source code to see what the issue is, which slows everything down. The coding is imperfect,” they said.

Another Chinese engineer briefed on Baidu’s use of the Huawei processors said the chips crashed frequently, complicating AI development work.

The Huawei researcher said crashes happened because it was difficult to use the hardware. “It is easy to get bad results because people don’t know much about the hardware itself,” they said.

To tackle the problem, Huawei has been sending engineers to help customers on site with transferring training code previously written on Cuda into Cann, according to multiple people familiar with the matter. Baidu, iFlytek, and Tencent are among the tech companies that have received teams of engineers, these people said.

Huawei declined to comment. Baidu, iFlytek, and Tencent did not respond to requests for comment.

A former Baidu employee said: “Huawei excels at customer service, so of course they have engineers on site at their big customers, helping them to use their chips.”

Huawei can leverage a huge workforce to accelerate the shift. According to the company, more than 50 percent of its 207,000 employees work in research and development, including the engineers dispatched to install technology for customers.

“Huawei’s advantage over Nvidia is it can work closely with its customers,” said technology analyst Tilly Zhang at consultancy Gavekal. “Unlike Nvidia, it has a large team of engineers to help solve clients’ problems and get them to transition to their hardware.”

Huawei has also set up an online portal for developers to give feedback on how its software can be improved.

After the US tightened export controls in October, Huawei raised the price of the Ascend 910B, its chip used for training, by 20 to 30 percent, according to people familiar with the matter.

Huawei’s customers have also expressed concern about supply constraints for the Ascend chip, likely due to manufacturing difficulties, with Chinese companies prevented from buying state-of-the-art chipmaking machinery from the Dutch company ASML.

Huawei has seen strong demand for its AI chips. It reported a 34 percent increase in first-half revenues on Thursday, without providing a breakdown of sales for its different businesses.

More than 50 foundational models have “been trained and iterated” on the Ascend chip, Huawei executive director Zhang Ping’an said at the World Artificial Intelligence Conference in Shanghai in July.

iFlytek has said its large language model has been trained exclusively on Huawei chips after Huawei sent a group of engineers to its headquarters in Hefei, eastern China, last year to integrate the technology.

© 2024 The Financial Times Ltd. All rights reserved. Not to be redistributed, copied, or modified in any way.
