CANN (Huawei)

Updated Jun 27, 2026

CANN, short for Compute Architecture for Neural Networks, is the heterogeneous computing architecture and software stack that Huawei provides for its Ascend line of AI processors. It is the layer that sits between high-level AI frameworks and Ascend hardware, turning the computational graphs and operators that a framework emits into instructions that an Ascend neural processing unit (NPU) can execute. In the Ascend ecosystem CANN occupies the role that CUDA occupies for NVIDIA GPUs: it is the full-stack runtime and toolkit that makes the chips programmable, and its maturity is central to whether Huawei can offer a credible alternative to NVIDIA's software ecosystem.^[1]^[2] In August 2025 Huawei announced that it would open-source CANN to grow an independent developer ecosystem around Ascend and compete directly with NVIDIA's proprietary CUDA platform.^[8]

What is CANN?

Huawei's own documentation describes CANN as "a heterogeneous computing architecture launched by Ascend for AI scenarios" and "a crucial platform for improving the computing efficiency of the Ascend AI processors."^[1] It exposes a set of hierarchical programming interfaces so that developers can work at different levels of abstraction, from calling pre-built operators to writing custom kernels.^[1] The stack supports three main training and inference frameworks: Huawei's own MindSpore, PyTorch through an adapter, and TensorFlow through an adapter.^[1] Underneath those frameworks CANN distributes the work across heterogeneous hardware, assigning subgraphs to the CPU or the NPU and parallelizing the computation so that the available compute is kept busy.^[1]
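
As a rough illustration of the subgraph assignment described above, the following Python sketch groups a linear sequence of operators into contiguous CPU and NPU subgraphs. The operator names, the `NPU_SUPPORTED` set, and the `partition` function are hypothetical and are not part of any CANN API; the real partitioning logic is internal to CANN and is not described in the cited documentation.

```python
from itertools import groupby

# Hypothetical set of operators assumed to have NPU kernels (illustration only).
NPU_SUPPORTED = {"MatMul", "Conv2D", "Relu", "Softmax"}


def partition(ops):
    """Group a linear op sequence into contiguous (device, [ops]) subgraphs."""

    def device_for(op):
        # Anything without an assumed NPU kernel falls back to the CPU.
        return "npu" if op in NPU_SUPPORTED else "cpu"

    # groupby merges adjacent ops that map to the same device into one run.
    return [(dev, list(run)) for dev, run in groupby(ops, key=device_for)]


model_ops = ["Decode", "Conv2D", "Relu", "MatMul", "TopK", "Softmax"]
subgraphs = partition(model_ops)
for dev, run in subgraphs:
    print(dev, run)
```

In this toy model the three adjacent NPU-capable operators become one subgraph, so fewer hand-offs between devices are needed than if each operator were dispatched on its own.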

CANN targets Huawei's Ascend silicon, which is built on the in-house Da Vinci architecture. The core of Da Vinci is a 3D Cube matrix-multiply engine optimized for the dense linear algebra at the heart of deep learning, and CANN is the software that lowers a model down onto these cores.^[1]^[2] The stack is released in versioned editions; the CANN Commercial Edition documentation covers the 8.x series (for example versions 8.0.0 and 8.5.0), with new releases tracking the capabilities of successive Ascend chips such as the 910B and the 910C.^[1]^[3]

How is CANN's architecture layered?

CANN is organized as a set of layers that progressively lower a model from a framework graph down to hardware instructions. The main pieces, drawn from Huawei's documentation, are summarized below.

Layer	Component	Role
Application interface	AscendCL (Ascend Computing Language)	Unified C and Python style APIs for system configuration, runtime management, and model execution
Graph	Graph Engine (GE)	Builds and optimizes the computational graph into a form executable on Ascend chips
Operator development	Ascend C, TBE, AI CPU	Languages and frameworks for writing custom operators
Operator libraries	AOL, ATB	Pre-tuned high-performance operators, including a Transformer-focused library
Communication	HCCL	Collective communication across multiple Ascend processors for distributed training
Media	DVPP	Hardware-accelerated digital vision pre-processing for image and video data
Compiler and tuning	BiSheng compiler, AOE	Compiles operator code to binaries and auto-tunes performance
Runtime and driver	Runtime, driver	Task scheduling, memory management, and the kernel-mode interface to the hardware

At the top, AscendCL provides the programming surface that applications and frameworks call into. The Graph Engine (GE) is the module that takes a whole-model graph and adapts it into a more efficient graph that can run directly on Ascend chips; in the MindSpore source tree GE is the linkage between the framework front end and the hardware.^[1]^[4] At the operator-library level, the Ascend Operator Library (AOL) supplies high-performance operators tuned for the hardware, and the Ascend Transformer Boost (ATB) library targets the attention and matrix patterns common in large language models. HCCL (Huawei Collective Communication Library) handles cross-device data exchange such as all-reduce in distributed training, and DVPP (Digital Vision Pre-Processing) offloads image decoding and resizing to dedicated hardware. The BiSheng compiler lowers operator code to binary, and the AOE (Ascend Optimization Engine) tool performs automatic performance tuning. The runtime schedules tasks onto the NPU and manages memory, while the driver provides the kernel-level interface to the silicon.^[1]
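
To make the all-reduce operation mentioned above concrete, the sketch below simulates its semantics in plain Python: every device contributes a buffer, and every device ends up holding the element-wise sum. This is only a model of the result that a collective library produces; it does not call HCCL, and the `all_reduce_sum` function is a hypothetical name chosen for illustration.

```python
def all_reduce_sum(buffers):
    """Return what each rank holds after a sum all-reduce.

    buffers: one equal-length list of numbers per simulated device (rank).
    """
    if not buffers:
        return []
    length = len(buffers[0])
    if any(len(b) != length for b in buffers):
        raise ValueError("all ranks must contribute buffers of equal length")
    # Element-wise sum across ranks.
    reduced = [sum(vals) for vals in zip(*buffers)]
    # Every rank receives its own copy of the reduced result.
    return [list(reduced) for _ in buffers]


# Three simulated devices, each holding a local gradient of two elements.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
result = all_reduce_sum(grads)
print(result[0])  # [9.0, 12.0]
```

In data-parallel training this is the step that lets every device apply the same combined gradient, which is why a tuned collective library matters at scale.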

How do you write custom operators with Ascend C?

Ascend C is the operator programming language that Huawei offers for writing custom kernels on Ascend hardware. It is the rough counterpart to writing CUDA C++ kernels, and it is the layer developers reach for when an existing library operator does not cover their needs. An Ascend C operator is typically split into two parts: a host-side tiling program that decides how to partition data and orchestrate movement, and a device-side kernel program that schedules and pipelines the actual compute instructions on the NPU.^[5] Older operator development paths, including TBE (Tensor Boost Engine) and AI CPU operators, also exist within CANN, but Ascend C is the path Huawei has promoted for newer development. The associated toolchain includes utilities for generating, testing, debugging, and profiling operators.^[1]
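
The host-side tiling idea can be sketched in plain Python. The example below assumes a one-dimensional element-wise operator and decides how many elements each core handles and how each core's share is cut into chunks that fit a local buffer. The names `CoreTiling`, `compute_tiling`, and `max_tile_len` are hypothetical and do not correspond to Ascend C APIs; a real tiling program is written against the Ascend C toolchain and passes its result to the device-side kernel.

```python
from dataclasses import dataclass


@dataclass
class CoreTiling:
    core_id: int
    offset: int          # index of the first element this core handles
    length: int          # number of elements this core handles
    tile_lengths: list   # sizes of the chunks the core processes in turn


def compute_tiling(total_len, num_cores, max_tile_len):
    """Split total_len elements across cores, then into per-core tiles."""
    if num_cores <= 0 or max_tile_len <= 0:
        raise ValueError("num_cores and max_tile_len must be positive")
    base, extra = divmod(total_len, num_cores)
    plan, offset = [], 0
    for core_id in range(num_cores):
        # The first `extra` cores take one additional element each.
        length = base + (1 if core_id < extra else 0)
        full, tail = divmod(length, max_tile_len)
        tiles = [max_tile_len] * full + ([tail] if tail else [])
        plan.append(CoreTiling(core_id, offset, length, tiles))
        offset += length
    return plan


plan = compute_tiling(total_len=1000, num_cores=3, max_tile_len=128)
for core in plan:
    print(core)
```

Keeping this arithmetic on the host means the device-side kernel only has to loop over the tiles it was given, which is the division of labor the two-part structure is meant to achieve.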

How does CANN integrate with PyTorch, TensorFlow, and MindSpore?

Most practitioners do not call CANN directly; they reach it through a framework. PyTorch support is provided by torch_npu, the Ascend Extension for PyTorch. It registers an "npu" device through PyTorch's PrivateUse1 backend mechanism and bridges PyTorch's front end to the CANN runtime and libraries underneath. CANN must be installed first, because torch_npu depends on its runtime and operator libraries; the extension's version numbers follow a "{PyTorch version}-{Ascend version}" convention so that a given build is matched to a specific CANN release (for example a torch_npu build aligned with PyTorch 2.1.0 and a particular CANN 8.x version).^[6] MindSpore, Huawei's first-party framework, integrates with CANN natively through GE rather than through a separate adapter, and a TensorFlow adapter is also available.^[1] Beyond training frameworks, CANN is used as a backend by inference and serving software in the Ascend ecosystem, and there is a community-maintained CANN execution provider for ONNX Runtime.^[2]
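
A minimal sketch of how a script might select the "npu" device when torch_npu is present, and fall back to the CPU otherwise, is shown below. It assumes that importing `torch_npu` makes `torch.npu.is_available()` callable; `pick_device` and `split_version_tag` are hypothetical helper names, the decision to treat any import failure as "no NPU" is an assumption, and the tag string `2.1.0-x.y.z` is a placeholder rather than a real release.

```python
import importlib.util


def pick_device():
    """Return "npu:0" if torch_npu reports a usable NPU, else "cpu"."""
    have_torch = importlib.util.find_spec("torch") is not None
    have_npu_ext = importlib.util.find_spec("torch_npu") is not None
    if not (have_torch and have_npu_ext):
        return "cpu"
    try:
        import torch
        import torch_npu  # noqa: F401  (importing registers the "npu" device)
        return "npu:0" if torch.npu.is_available() else "cpu"
    except Exception:
        # Assumption: any import or runtime failure (for example a missing
        # CANN installation) is treated as "no NPU available".
        return "cpu"


def split_version_tag(tag):
    """Split a "{PyTorch version}-{Ascend version}" tag into its two parts."""
    if tag.startswith("v"):
        tag = tag[1:]
    pytorch_version, sep, ascend_version = tag.partition("-")
    if not sep or not pytorch_version or not ascend_version:
        raise ValueError("expected '{PyTorch version}-{Ascend version}'")
    return pytorch_version, ascend_version


device = pick_device()
print("selected device:", device)
print(split_version_tag("2.1.0-x.y.z"))
```

On a machine with neither package installed the helper simply returns "cpu", so the same script can run on a development laptop and on an Ascend host.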

How does CANN compare to CUDA?

The clearest way to understand CANN is by analogy to NVIDIA's stack, with the caveat that the analogy is approximate rather than a one-to-one mapping.

NVIDIA / CUDA	Huawei / CANN	Purpose
CUDA C++ kernels	Ascend C	Writing custom device kernels
CUDA Runtime API	AscendCL and runtime	Application-level device and execution control
cuDNN, cuBLAS	AOL, ATB	Pre-optimized math and neural-network operators
NCCL	HCCL	Multi-device collective communication
NVCC	BiSheng compiler	Compiling kernels to device binaries
nvJPEG and related	DVPP	Hardware media pre-processing

The functional pieces line up reasonably well, but the ecosystems are not equivalent in maturity. Press coverage of Huawei's plans noted that CUDA has been developed and refined for close to two decades, and that CANN would likely take years to approach that level of polish, tooling, and third-party library coverage.^[2] Huawei has itself acknowledged friction: at Huawei Connect 2025, Eric Xu said customer feedback after the open-sourcing of DeepSeek's models surfaced "many issues and expectations they've had with Ascend," which was part of the motivation for opening the stack.^[7]

Is CANN open source?

Huawei first announced on 5 August 2025, at the Ascend Computing Industry Development Summit, that it would open-source CANN, with rotating chairman Eric Xu framing the move as a way to "speed up innovation from developers" and "make Ascend easier to use," and positioning it as an alternative to NVIDIA's proprietary platform.^[8] The plan was laid out in more detail at Huawei Connect 2025, held in Shanghai from 18 to 20 September 2025.^[7]^[9]

Xu described a tiered approach for CANN: Huawei would open the interfaces for the compiler and the virtual instruction set, and fully open-source the rest of the software, based on the existing Ascend 910B and 910C designs, by 31 December 2025. He paired this with commitments to fully open-source the Mind series of application enablement kits and toolchains and the openPangu foundation models by the same deadline, and said future versions would have their open-source plans synchronized with product launches.^[7] Zhang Dixuan, president of Huawei's Ascend Computing Business, gave the staged schedule: all CANN operators were to be open-sourced on the GitCode platform by late September 2025, with core components including domain-specific libraries, GE, Ascend C, and the MindIE inference engine to follow by December 2025.^[9] A CANN Technical Steering Committee was set up to govern contributions and roadmaps, and Huawei pledged supporting resources reported as 1,500 PFLOPS of computing power and 30,000 development boards per year for the open-source community.^[9]

The open-sourcing is partial rather than total. Keeping the compiler and the virtual instruction set as opened interfaces, rather than fully open implementations, lets developers see and target the compilation path while Huawei retains some lower-level detail. Commentators also noted at the time that licensing terms, governance specifics, documentation quality, and the durability of community support remained to be proven.^[10]

What challenges does CANN's ecosystem face?

CANN is the focal point of a broader question about whether Chinese AI accelerators can build a durable software ecosystem against CUDA. The hardware side has advanced, with Ascend parts and the CloudMatrix systems built around them positioned against NVIDIA's high-end accelerators, but the software experience has lagged.^[2]^[8] The practical gaps are familiar ones for any challenger to an entrenched platform: a smaller body of tuned operators and third-party libraries, fewer developers fluent in the tools, and a documentation and debugging experience that reviewers describe as still behind CUDA's. Opening the stack is Huawei's attempt to close those gaps by letting external developers, Chinese AI companies, universities, and research institutions contribute operators and fixes directly, and by leaning on adapters such as torch_npu and compatibility with widely used serving software to lower the cost of moving existing PyTorch workloads onto Ascend. Whether that converts into the kind of self-sustaining ecosystem CUDA enjoys is the open question that CANN's trajectory will answer.^[2]^[7]^[8]

References

1. "Overview, Quick Start, CANN Commercial Edition 8.0.0," Huawei Ascend Community. https://www.hiascend.com/document/detail/en/canncommercial/800/quickstart/index/index.html
2. "Huawei is making its Ascend AI GPU software toolkit open-source to better compete against CUDA," Tom's Hardware. https://www.tomshardware.com/tech-industry/artificial-intelligence/huawei-is-making-its-ascend-ai-gpu-software-toolkit-open-source-to-better-compete-against-cuda
3. "Before You Start, CANN Commercial Edition 8.5.0," Huawei Ascend Community. https://www.hiascend.com/document/detail/en/canncommercial/850/index/index.html
4. "Ascend/graphengine (GraphEngine, GE)," GitHub. https://github.com/Ascend/graphengine
5. "Kernel Launch Operator Development, Ascend C Operator Development, CANN Commercial Edition 8.0.0," Huawei Ascend Community. https://www.hiascend.com/document/detail/en/canncommercial/800/opdevg/Ascendcopdevg/atlas_ascendc_10_0056.html
6. "Ascend/pytorch: Ascend PyTorch adapter (torch_npu)," GitHub. https://github.com/Ascend/pytorch
7. "Groundbreaking SuperPoD Interconnect: Leading a New Paradigm for AI Infrastructure (Eric Xu keynote)," Huawei. https://www.huawei.com/en/news/2025/9/hc-xu-keynote-speech
8. "Tech war: Huawei to open-source AI chip toolkit to take on Nvidia's proprietary platform," South China Morning Post. https://www.scmp.com/tech/tech-war/article/3320852/tech-war-huawei-open-source-ai-chip-toolkit-take-nvidias-proprietary-platform
9. "Ascend: Open for All to Build a Vibrant Ecosystem," Huawei. https://www.huawei.com/en/news/2025/9/hc-shengten-opensource
10. "Huawei details open-source AI development roadmap at Huawei Connect 2025," AI News. https://www.artificialintelligence-news.com/news/huawei-open-source-ai-development-platform-technical-specs/
