PaddlePaddle

Updated Jun 27, 2026

PaddlePaddle (Chinese name Feijiang, 飞桨) is an open-source deep learning framework developed by the Chinese technology company Baidu, and it is generally described as the first deep learning platform developed independently in China ^[1]^[2]^[3]. Baidu released the code publicly in 2016. The name is a backronym for "PArallel Distributed Deep LEarning," and the official tagline, "PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice," reflects the framework's emphasis on industrial, production-scale deployment ^[3]. Baidu's ERNIE family of language models is built and trained with PaddlePaddle, and according to Baidu the combined PaddlePaddle-ERNIE ecosystem served 23.33 million developers and 760,000 enterprises as of September 2025 ^[3]^[12]. Beyond the core training and inference framework, PaddlePaddle has grown into a large ecosystem of development kits and model libraries, most notably PaddleOCR, an optical character recognition toolkit that is one of the most popular OCR projects on GitHub. The framework is released under the Apache License 2.0 ^[3].

What is PaddlePaddle?

PaddlePaddle is a general-purpose deep learning framework: software for building, training, and deploying neural networks, comparable in role to PyTorch and TensorFlow. It exposes a Python application programming interface (API) and supports both dynamic graphs (define-by-run, where the computation graph is built as code executes) and static graphs (define-and-run, where a graph is compiled before execution). Since the version 2.0 release the default is the dynamic-graph mode, which is easier to debug, while a static graph can be recovered for deployment and optimization ^[7]^[8]. The project README describes it as "the first independent R&D deep learning platform in China" that has been "officially open-sourced to professional communities since 2016" ^[3].
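The difference between the two execution styles can be illustrated with a toy sketch in plain Python. It does not use PaddlePaddle's API; the `Node` class and the `eager_forward`, `build_graph`, and `run_graph` functions are hypothetical names chosen only for this illustration.

```python
# Toy contrast between the two execution styles; this is NOT PaddlePaddle code.

# Define-by-run (dynamic graph): each operation executes immediately,
# so intermediate values can be inspected with ordinary Python tools.
def eager_forward(x, w, b):
    h = x * w          # computed right now
    y = h + b          # computed right now
    return y

# Define-and-run (static graph): first describe the computation as data,
# then hand the whole description to an executor.
class Node:
    def __init__(self, op, inputs=(), name=None):
        self.op = op            # "input", "mul", or "add"
        self.inputs = inputs    # upstream Node objects
        self.name = name        # placeholder name, used by "input" nodes

def build_graph():
    x = Node("input", name="x")
    w = Node("input", name="w")
    b = Node("input", name="b")
    h = Node("mul", (x, w))
    return Node("add", (h, b))  # nothing has been computed yet

def run_graph(node, feed):
    """Evaluate the graph recursively, reading inputs from the feed dict."""
    if node.op == "input":
        return feed[node.name]
    args = [run_graph(n, feed) for n in node.inputs]
    return args[0] * args[1] if node.op == "mul" else args[0] + args[1]

graph = build_graph()
eager_result = eager_forward(2.0, 3.0, 1.0)
static_result = run_graph(graph, {"x": 2.0, "w": 3.0, "b": 1.0})
print(eager_result, static_result)
```

Both paths produce the same value. In the static version, the whole graph exists before any input arrives, which is what gives a framework the opportunity to optimize or export it; the dynamic version trades that for easier step-by-step debugging.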

Distributed and parallel training has been a focus from the beginning, consistent with the "parallel distributed" in the name. The framework supports data parallelism, model parallelism, pipeline parallelism, and combinations of these, which Baidu markets for training very large language models. With the 3.0 release, PaddlePaddle introduced what it calls unified dynamic and static automatic parallelism: a developer annotates how tensors should be split based on a single-device program, and the framework automatically derives an efficient distributed strategy, which is meant to lower the engineering cost of large-model training ^[9]^[10].
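Data parallelism, the simplest of these strategies, can be sketched in a few lines of plain Python. This is a conceptual illustration only, not PaddlePaddle code: the "workers" are list slices, the function names are hypothetical, and the batch size is assumed to divide evenly among the workers.

```python
# Toy data parallelism for a 1-D linear model y = w * x with squared error.
# Plain Python, no framework; each "worker" is just a slice of the batch.

def local_gradient(w, xs, ys):
    """Mean gradient of (w*x - y)^2 with respect to w over one shard."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def data_parallel_gradient(w, xs, ys, num_workers):
    """Split the batch evenly, compute one gradient per worker, then average.

    The final averaging step plays the role of an all-reduce.
    Assumes len(xs) is divisible by num_workers.
    """
    shard = len(xs) // num_workers
    grads = [
        local_gradient(w, xs[i * shard:(i + 1) * shard], ys[i * shard:(i + 1) * shard])
        for i in range(num_workers)
    ]
    return sum(grads) / num_workers

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # generated by the true weight w = 2
w = 0.0

full = local_gradient(w, xs, ys)
parallel = data_parallel_gradient(w, xs, ys, num_workers=2)
print(full, parallel)
```

With equal-sized shards, the averaged per-worker gradient matches the full-batch gradient, which is why data parallelism can scale the batch across devices without changing the optimization step. Model and pipeline parallelism split the network itself rather than the data and are not shown here.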

Version 3.0 also added a deep-learning compiler called CINN (Compiler Infrastructure for Neural Networks). Baidu reported that on NVIDIA A100 hardware, compiled operators ran several times faster than uncompiled ones and that a majority of tested models saw end-to-end speedups, with an average improvement around 27 percent ^[9]^[10]. The 3.0 generation further emphasized hardware portability through a unified adaptation layer that Baidu said supported more than 60 chip series, so that the same model code can run across different accelerators ^[9]^[10]. The framework runs on CPUs and on GPUs from multiple vendors, and Baidu has worked with partners to support additional accelerators.
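The cited sources do not explain how CINN obtains these speedups. One optimization that deep-learning compilers commonly apply is operator fusion, in which several elementwise operations are merged into a single pass over the data. The sketch below is a plain-Python illustration of that general idea under that assumption, not a description of CINN's internals; the function names are hypothetical.

```python
# Toy operator fusion for y = relu(x * scale + shift) over a list of floats.

def unfused(xs, scale, shift):
    """Three separate 'kernels'; each makes a full pass and a temporary list."""
    t1 = [x * scale for x in xs]
    t2 = [t + shift for t in t1]
    return [max(t, 0.0) for t in t2]

def fused(xs, scale, shift):
    """One 'kernel' doing the same arithmetic in a single pass, no temporaries."""
    return [max(x * scale + shift, 0.0) for x in xs]

xs = [-2.0, -0.5, 0.0, 1.5]
unfused_out = unfused(xs, 2.0, 1.0)
fused_out = fused(xs, 2.0, 1.0)
print(unfused_out, fused_out)
```

The two functions return identical results; the fused form simply avoids writing and re-reading intermediate buffers, which on real accelerators tends to matter because memory traffic often costs more than the arithmetic itself.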

Who makes PaddlePaddle?

PaddlePaddle is developed and maintained by Baidu, the Chinese search and AI company. The internal project that became PaddlePaddle began at Baidu around 2013, when the company's deep learning lab found that single-GPU training tools could not keep up with the growth of its neural network workloads in areas such as search ranking and online advertising. The early system was led by Baidu researcher Wei Xu and was known internally simply as "Paddle" ^[1]^[4]. Baidu used it to develop products in advertising, search ranking, large-scale image classification, optical character recognition, and machine translation before opening it up to outside developers ^[5].

The platform was announced as an open-source project on September 1, 2016, at the Baidu World conference by Andrew Ng, who was then Baidu's chief scientist, and the code was made publicly available later that month ^[5]^[6]. The timing placed it among the first wave of company-backed deep learning frameworks, arriving a year after Google open-sourced TensorFlow in 2015 ^[2]. Development happens in the open on GitHub under the PaddlePaddle organization, where the core repository (PaddlePaddle/Paddle) has well over 20,000 stars and more than a thousand contributors ^[3].

When was PaddlePaddle released, and how has it evolved?

The platform was open-sourced in 2016. In its first years PaddlePaddle was primarily a static-graph framework. Baidu later rebuilt large parts of it, and the redesigned core was sometimes referred to as Paddle Fluid. Support for imperative, dynamic-graph programming was added as an option, and with the version 2.0 release in early 2021 the dynamic graph became the default programming mode, alongside a reorganized API ^[7]^[8]. The version 3.0 generation, released in 2025, unified dynamic and static execution under a single automatic-parallelism model and introduced a self-developed compiler. Baidu has continued to hold an annual developer conference, Wave Summit, where it announces new framework versions and reports on the size of the developer community.

The major framework releases are summarized below.

Version	Date	Notable changes
PaddlePaddle 1.0	2018	First major stable line; static-graph (Fluid) design ^[8]
PaddlePaddle 1.6	Wave Summit+ 2019 (Nov 2019)	Large feature update reported as 21 new features, 100+ models, Paddle Lite 2.0 ^[11]
PaddlePaddle 2.0	Early 2021	Dynamic graph as default; reorganized API; hundreds of new and revised APIs ^[7]^[8]
PaddlePaddle 3.0	April 2025	Unified dynamic/static automatic parallelism; CINN compiler; 60+ chip series support ^[9]^[10]
PaddlePaddle 3.2	Wave Summit 2025 (Sep 9, 2025)	Computational optimization, parallel strategies, native fault tolerance ^[12]
PaddlePaddle 3.3	January 31, 2026	Latest release line as of early 2026 ^[3]

What toolkits and model libraries does PaddlePaddle include?

A large part of PaddlePaddle's appeal is its collection of task-specific development kits and model libraries, which package datasets, pre-trained models, training scripts, and deployment tooling for common application areas. These let developers fine-tune and deploy strong baseline models without building pipelines from scratch ^[13]^[14]. The most widely used components are listed below.

Toolkit	Purpose
PaddleNLP	Natural language processing and large language model development, training, and inference ^[15]
PaddleOCR	Optical character recognition and document parsing ^[16]
PaddleClas	Image classification and recognition
PaddleDetection	Object detection, instance segmentation, tracking, and keypoint detection ^[13]
PaddleSeg	Image segmentation, with around 20 segmentation models and 50+ pre-trained models ^[17]
PaddleSpeech	Speech recognition and speech synthesis
PaddleHelix	Biocomputing for drug discovery and molecular work ^[13]
PARL	Reinforcement learning
Paddle Lite	Lightweight inference engine for mobile, embedded, and edge devices ^[11]
Paddle Serving	Server-side model deployment, with 40+ example pipelines ^[18]
PaddleHub	Library of ready-to-use pre-trained models ^[13]
PaddleX	Unified low-code interface across PaddleOCR, PaddleDetection, PaddleClas, PaddleSeg, and others ^[14]

Alongside these, Baidu runs AI Studio, an online learning and development platform with notebooks and free compute, and lower-code tools such as EasyDL for users who want to train models without writing much code.

What is PaddleOCR?

PaddleOCR is the best-known toolkit in the ecosystem and one of the most-starred OCR projects on GitHub, with about 79,000 stars as of mid-2026 ^[16]. It became widely adopted after the 2020 release of its PP-OCR series, a family of practical, lightweight recognition models small enough to deploy on modest hardware while still handling Chinese and English text. The toolkit supports more than 100 languages ^[16].

Later versions broadened it from plain text recognition into document understanding. PP-OCRv5 is a single-model pipeline for multilingual mixed text, and PP-StructureV3 adds layout analysis, table recognition, formula recognition, chart understanding, and reading-order recovery, converting complex PDFs and images into Markdown or JSON for downstream use by language models ^[16]. A 2025 technical report described PaddleOCR 3.0 as combining PP-OCRv5 for multilingual text recognition, PP-StructureV3 for hierarchical document parsing, and PP-ChatOCRv4 for key-information extraction ^[19]. The project is released under the Apache 2.0 license ^[16].

How is PaddlePaddle connected to ERNIE?

PaddlePaddle is the foundation Baidu uses to build and train its ERNIE (Enhanced Representation through kNowledge IntEgration) language models, and the two are marketed together as a combined platform. According to the official ERNIE repository, "All ERNIE models are trained with optimal efficiency using the PaddlePaddle deep learning framework, which also enables high-performance inference and streamlined deployment" ^[15]. PaddleNLP, the natural language processing kit, provides the large-model training and inference stack, including Baidu's 4D hybrid parallel strategies, fine-tuning methods, and high-performance inference ^[15]. For the open-weights ERNIE 4.5 family, Baidu also released ERNIEKit, an industrial development toolkit built on PaddlePaddle that covers pre-training, supervised fine-tuning, LoRA, direct preference optimization, and quantization workflows ^[15]. Baidu often reports its developer and enterprise numbers for the "PaddlePaddle-ERNIE" ecosystem as a whole rather than for the framework alone.

How many developers use PaddlePaddle?

Baidu reports the size of the PaddlePaddle community at its Wave Summit events, and the figures have grown quickly. By the fourth quarter of 2022, Baidu cited more than 5.35 million developers and over 200,000 enterprises using the platform ^[20]. At Wave Summit 2023 it reported 8 million developers and around 220,000 enterprises, and by December 2023 the developer count had reached roughly 10.7 million ^[21]. Baidu's official account stated that PaddlePaddle was serving 18.08 million developers as of October 2024 ^[22]. At Wave Summit 2025, on September 9, 2025, Baidu said the combined PaddlePaddle-ERNIE ecosystem served 23.33 million developers and 760,000 enterprises, with on the order of 1.1 million models created on the platform ^[3]^[12]. These numbers come from Baidu and have not been independently audited. Taken at face value, they suggest that PaddlePaddle is the most widely used domestically developed deep learning framework in China.
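The growth implied by these figures can be worked out directly. The short script below uses only the developer counts quoted in the paragraph above (in millions); the variable and function names are illustrative.

```python
# Developer counts in millions, as reported by Baidu in the paragraph above.
reported = [
    ("Q4 2022", 5.35),
    ("Wave Summit 2023", 8.0),
    ("Dec 2023", 10.7),
    ("Oct 2024", 18.08),
    ("Sep 2025", 23.33),
]

def growth_factors(series):
    """Ratio of each reported figure to the one immediately before it."""
    return [
        (prev_label, label, count / prev_count)
        for (prev_label, prev_count), (label, count) in zip(series, series[1:])
    ]

# Growth from the first reported figure to the last.
overall = reported[-1][1] / reported[0][1]

for start, end, factor in growth_factors(reported):
    print(f"{start} -> {end}: x{factor:.2f}")
print(f"overall: x{overall:.2f}")
```

The intervals between data points are uneven, and the September 2025 figure covers the combined PaddlePaddle-ERNIE ecosystem rather than the framework alone, so the ratios are best read as rough indicators rather than a like-for-like growth rate.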

The project has also gained support outside Baidu. Hardware vendors including NVIDIA and Graphcore have added PaddlePaddle support to their stacks, and pre-trained PaddlePaddle and PaddleNLP models are distributed through the Hugging Face Hub ^[23].

How does PaddlePaddle compare to PyTorch and TensorFlow?

PaddlePaddle occupies a similar niche to PyTorch and TensorFlow: it is a general-purpose deep learning framework with a Python API, automatic differentiation, GPU acceleration, and tools for training and serving models. Like PyTorch, it now defaults to a dynamic, imperative style that is friendly for research and debugging, while retaining a static-graph path for optimized deployment ^[7]. Its main points of differentiation are a heavy emphasis on distributed and parallel training for industrial-scale models, a deep adaptation layer for a wide range of Chinese-market AI accelerators, and the unusually broad set of ready-made application toolkits that ship alongside the core library ^[9]^[13].

Adoption is the clearest difference in practice. PyTorch and TensorFlow dominate global research and industry usage, whereas PaddlePaddle's user base is concentrated in China, where it is the leading homegrown option and is closely tied to Baidu's cloud services and the ERNIE models. Its documentation and community materials are produced in both Chinese and English, but most of its momentum comes from the Chinese developer ecosystem ^[2]^[20].

Framework	Origin	First open-sourced	Default execution mode	Primary user base
PaddlePaddle	Baidu (China)	2016 ^[3]	Dynamic graph (since 2.0) ^[7]	China (industrial deployment, ERNIE) ^[2]
PyTorch	Meta AI (US)	2016	Dynamic graph	Global research and industry
TensorFlow	Google (US)	2015 ^[2]	Eager execution (since 2.0; static graph in 1.x)	Global industry and production

Is PaddlePaddle open source?

Yes. The PaddlePaddle core framework and its major toolkits, including PaddleOCR and PaddleNLP, are open source under the Apache License 2.0, a permissive license that allows commercial use, modification, and redistribution ^[3]^[16].

References

1. Andrew Ng-era background and Wei Xu's role in creating Paddle (2013), summarized in coverage of Baidu's deep learning history. Viso.ai, "PaddlePaddle: An open-source deep learning framework." https://viso.ai/deep-learning/paddlepaddle/
2. "Baidu's PaddlePaddle deep-learning platform fuels the rise of industrial AI." MIT Technology Review, June 22, 2020. https://www.technologyreview.com/2020/06/22/1004251/baidus-deep-learning-platform-fuels-the-rise-of-industrial-ai/
3. PaddlePaddle/Paddle GitHub repository (README; "first independent R&D deep learning platform in China," open-sourced since 2016, "Machine Learning Framework from Industrial Practice," 23.33M developers / 760,000 companies / 1.1M models, license, version 3.3, stars and contributors). https://github.com/PaddlePaddle/Paddle
4. "Baidu IDL scientist Xu Wei joins Horizon Robotics as chief scientist of General AI lab." Gasgoo. https://autonews.gasgoo.com/70014937.html
5. "Baidu to Open Source New Platform for Deep Learning Community." GlobeNewswire / Baidu, September 1, 2016. https://www.globenewswire.com/news-release/2016/09/01/1188805/0/en/Baidu-to-Open-Source-New-Platform-for-Deep-Learning-Community.html
6. "Baidu unveils open source deep learning platform PaddlePaddle." Open Source For You, September 2016. https://www.opensourceforu.com/2016/09/baidu-open-source-deep-learning-platform-paddlepaddle/
7. "Baidu Releases 'PaddlePaddle' 2.0, Its Deep Learning Platform, With New Features Including Dynamic Graphs, Reorganized APIs." MarkTechPost, April 7, 2021. https://www.marktechpost.com/2021/04/07/baidu-releases-paddlepaddle-2-0-its-deep-learning-platform-with-new-features-including-dynamic-graphs-reorganized-apis/
8. "Baidu Releases PaddlePaddle 2.0 with New Features." Baidu Research blog. https://research.baidu.com/Blog/index-view?id=156
9. "PaddlePaddle 3.0 Officially Released: Supports Large Models Like Wenxin 4.5, Reduces Cross-Chip Adaptation Costs by 80%." AIbase, April 2025. https://www.aibase.com/news/16811
10. "Baidu Releases PaddlePaddle Framework 3.0 to Empower Intelligent Development in the Age of Large Models." AIbase. https://www.aibase.com/news/16840
11. Baidu Research, announcement of PaddlePaddle at Wave Summit+ 2019 (21 features, 100+ models, Paddle Lite 2.0). https://x.com/baiduresearch/status/1191824365999886336
12. "Baidu Unveils ERNIE X1.1, Open-Sources ERNIE-4.5, and Updates PaddlePaddle at Wave Summit 2025." The Rift, September 2025. https://www.therift.ai/news-feed/baidu-unveils-ernie-x1-1-open-sources-ernie-4-5-and-updates-paddlepaddle-at-wave-summit-2025
13. "Unleash AI Power with Baidu's PaddlePaddle Framework" (overview of PaddleDetection, PaddleSeg, PaddleOCR, PaddleHelix, PaddleHub). Viso.ai. https://viso.ai/deep-learning/paddlepaddle/
14. PaddlePaddle/PaddleX GitHub repository (unified interface across PaddleOCR, PaddleDetection, PaddleClas, PaddleSeg). https://github.com/PaddlePaddle/PaddleX
15. PaddlePaddle/ERNIE GitHub repository ("All ERNIE models are trained with optimal efficiency using the PaddlePaddle deep learning framework"; ERNIE 4.5 and ERNIEKit; PaddleNLP large-model stack and 4D parallelism). https://github.com/PaddlePaddle/ERNIE
16. PaddlePaddle/PaddleOCR GitHub repository (100+ languages, Apache 2.0, stars, PP-OCRv5, PP-StructureV3). https://github.com/PaddlePaddle/PaddleOCR
17. "PaddleSeg: A High-Efficient Development Toolkit for Image Segmentation." arXiv:2101.06175. https://arxiv.org/pdf/2101.06175
18. PaddlePaddle/Serving GitHub repository (Paddle Serving deployment examples). https://github.com/PaddlePaddle/Serving
19. "PaddleOCR 3.0 Technical Report." arXiv:2507.05595. https://arxiv.org/html/2507.05595v1
20. "Baidu Upgrades China's Biggest AI Platform PaddlePaddle." Yicai Global. https://www.yicaiglobal.com/news/baidu-upgrades-china-biggest-ai-platform-paddlepaddle-for-nearly-41-million-developers
21. "Baidu's Cutting-Edge AI Innovations Take Center Stage at Wave Summit 2023." Metaverse Post. https://mpost.io/baidus-cutting-edge-ai-innovations-take-center-stage-at-wave-summit-2023/
22. Baidu Inc. official statement that PaddlePaddle has served 18.08 million developers as of October 2024, accompanying the PaddlePaddle 3.0 announcement. https://x.com/Baidu_Inc/status/1907720531060969521
23. "Welcome PaddlePaddle to the Hugging Face Hub." Hugging Face blog. https://huggingface.co/blog/paddlepaddle
