Baidu ERNIE
Last reviewed
Sources
No citations yet
Review status
Needs citations
Revision
v4 · 5,112 words
Baidu ERNIE (Enhanced Representation through Knowledge Integration) is the family of large language and multimodal foundation models built by the Chinese technology company Baidu. It spans the original 2019 knowledge-aware BERT-style pretraining paper, the consumer chatbot ERNIE Bot (Chinese name Wenxin Yiyan, 文心一言) launched on March 16, 2023, and the open-source ERNIE 4.5 release of June 30, 2025.[1][2][4] Over more than half a decade the lineage has scaled from a 100-million-parameter Chinese encoder to a 260-billion-parameter dense model (ERNIE 3.0 Titan) and on to native-multimodal mixture-of-experts systems reaching 424 billion total parameters in the open-source ERNIE 4.5 family and a claimed 2.4 trillion parameters in ERNIE 5.0.[3][4][19] ERNIE is both the brand for Baidu's foundation models and the engine of the Qianfan (千帆) cloud platform that distributes them to enterprises.[5] It is Baidu's entry in a Chinese foundation-model field that also includes Alibaba's Tongyi Qianwen, Tencent's Hunyuan, ByteDance's Doubao, and DeepSeek.
| Field | Value |
|---|---|
| Developer | Baidu, Inc. (Baidu AI / Baidu Research) |
| Origin paper | Sun et al., "ERNIE: Enhanced Representation through Knowledge Integration", arXiv:1904.09223, April 19, 2019[1] |
| First product release | ERNIE Bot / Wenxin Yiyan, invited beta March 16, 2023[2] |
| Public release | August 31, 2023 (after Chinese regulatory approval)[6] |
| Latest flagship cited | ERNIE 5.0 (preview November 13, 2025; formal release January 22, 2026)[19][21] |
| First reasoning model | ERNIE X1 (announced March 16, 2025), iterated to ERNIE X1.1 (September 9, 2025)[7][20] |
| Open-source release | ERNIE 4.5 family under Apache 2.0, June 30, 2025[4] |
| Cloud platform | Baidu Qianfan Foundation Model Platform[5] |
| Underlying framework | Transformer architectures on Baidu PaddlePaddle[8] |
Baidu ERNIE is a multi-generational family of foundation models whose defining technical idea is knowledge integration: rather than learning purely from surface text co-occurrence, ERNIE models are trained with objectives that inject structured world knowledge (named entities, phrases, and knowledge-graph triples) into pretraining. The name is a backronym for "Enhanced Representation through Knowledge Integration", and the broader Chinese family brand is Wenxin (文心, "literary heart").[1][5] The family has three faces: the research line (the ERNIE 1.0 through 3.0 papers), the consumer product (ERNIE Bot / Wenxin Yiyan), and the enterprise distribution channel (the Qianfan cloud platform).[1][2][5]
The first ERNIE paper was posted to arXiv on April 19, 2019 by Yu Sun, Shuohuan Wang and colleagues at Baidu under the title "ERNIE: Enhanced Representation through Knowledge Integration".[1] Its central idea was that the original BERT-style masked language modelling objective, which randomly masks individual subword tokens, fails to capture the multi-word units that carry knowledge in natural language.[1] ERNIE 1.0 therefore introduced two additional masking strategies on top of word-level masking: phrase-level masking, in which whole conceptual phrases (for example, idioms or noun phrases) are masked together, and entity-level masking, in which named entities such as people, locations, and organizations identified by a named entity recognition tagger are masked as units.[1] The model was pretrained on a Chinese corpus drawn from Baidu Baike, Baidu News, and Baidu Tieba, and reported new state-of-the-art results on five Chinese NLP tasks: natural language inference, semantic similarity, NER, sentiment analysis, and question answering.[1] ERNIE 1.0 was released open-source through the company's deep-learning framework PaddlePaddle.[8]
ERNIE 2.0, posted to arXiv on July 29, 2019 and accepted to AAAI 2020, generalized the knowledge-masking idea into a continual pretraining framework.[9] Rather than a fixed set of objectives, ERNIE 2.0 incrementally adds new self-supervised tasks, training them jointly with previously learned tasks through what the authors call continual multi-task learning.[9] The framework defines tasks at three levels of linguistic information: word-aware tasks (such as knowledge masking and capitalization prediction), structure-aware tasks (sentence reordering, sentence distance), and semantic-aware tasks (discourse relation, IR relevance).[9] The published model reported gains over BERT and XLNet on 16 English and Chinese benchmarks, including GLUE.[9] Baidu open-sourced both the framework and pretrained weights through the PaddlePaddle ERNIE repository.[8]
On July 5, 2021 Baidu posted arXiv:2107.02137, "ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation".[3] ERNIE 3.0 was a 10-billion-parameter knowledge-enhanced model trained on a 4-terabyte corpus combining web text, Baidu Search sources, domain-specific data (medical, legal, financial), and Baidu's knowledge graph containing more than 50 million facts.[3] Architecturally it unified an autoregressive and an autoencoding component, allowing the same backbone to be used for both natural-language understanding and natural-language generation tasks.[3] An English variant of the model topped the SuperGLUE benchmark on July 3, 2021, with a reported score of 90.6 versus 89.8 for the human baseline.[3]
In December 2021 Baidu Research, jointly with the Peng Cheng Laboratory (PCL) in Shenzhen, released ERNIE 3.0 Titan (arXiv:2112.12731), scaling the architecture up to 260 billion parameters.[10] Baidu described it as the largest Chinese dense pretrained language model at the time, claiming new state-of-the-art results on 68 NLP datasets.[10] The Titan paper added two distinctive training innovations: a self-supervised adversarial loss intended to make the model distinguish generated from real text, and a controllable language modelling loss that conditioned generation on attributes such as genre or sentiment.[10] To control the compute footprint of distillation, the paper also proposed an online distillation framework in which the teacher continued training while teaching students.[10] The 260-billion-parameter checkpoint was contemporary with rivals such as the Yuan 1.0 model and Huawei's Pangu-alpha and Pangu-Sigma series.
On March 16, 2023 Baidu cofounder and chief executive Robin Li unveiled Wenxin Yiyan (文心一言), branded in English as ERNIE Bot, at a press event in Beijing.[2] The product, positioned as the company's ChatGPT competitor, launched as an invited beta and was demonstrated through prerecorded video clips covering mathematical reasoning, marketing copy, Chinese literary trivia, and multimodal output, including text-to-image illustrations and audio in regional Chinese dialects.[2] Addressing concerns about the timing of the launch, Li told the audience: "From what I personally saw when conducting internal tests on Ernie Bot, it's not perfect. But why do we want to release it today? Because the market demands it."[2] MIT Technology Review reported that Baidu's Hong Kong-listed shares fell 6.4 percent on the launch day, that 650 enterprise partners had pre-registered, and that over 30,000 companies had applied for API access by the close of the event.[2] The Wenxin Yiyan brand sat above earlier ERNIE foundation-model versions; Baidu's public communications described ERNIE Bot as powered initially by the ERNIE 3.0 family and quickly upgraded to ERNIE 3.5 in June 2023.[11]
ERNIE Bot was opened to the general public in mainland China on August 31, 2023, after Baidu became one of the first companies to receive approval under China's interim measures for generative AI services that took effect on August 15, 2023.[6] Reuters and TechNode reported that this regulatory approval was a precondition for mass-market rollout in China, where, unlike in most Western jurisdictions, consumer AI services require security assessments.[6] By December 2023 Baidu's published figures put ERNIE Bot users above 100 million; the company subsequently reported approximately 200 million users in April 2024 and 300 million by mid-2024.[11]
On October 17, 2023, at the in-person Baidu World 2023 conference in Beijing, Baidu launched the ERNIE 4.0 foundation model.[12] Baidu chief technology officer Haifeng Wang framed ERNIE 4.0 as substantially better than ERNIE 3.5 across four axes the company calls understanding, generation, reasoning, and memory, claiming a roughly 30 percent overall improvement during beta testing.[12] The launch was tied to a re-platforming of Baidu's flagship consumer products, with Baidu Search, Baidu Wenku, Baidu Drive, Baidu Maps, the Infoflow workplace app, and the Baidu GBI business-intelligence product all being rebuilt around generative AI, and the Qianfan Foundation Model Platform offering 42 foundation models to enterprise customers.[12] No parameter count was publicly disclosed.[12]
On June 28, 2024 Baidu announced ERNIE 4.0 Turbo, presented as a faster, cheaper, higher-quality upgrade.[13] Baidu disclosed that ERNIE 4.0 Turbo was priced at 30 RMB (about USD 4.13) per million input tokens and 60 RMB per million output tokens, and that price cuts of up to 83 percent had been applied to earlier ERNIE tiers to remain competitive.[13] The same announcement was accompanied by PaddlePaddle 3.0, Baidu's deep-learning framework upgrade.[13]
In May 2024 Baidu announced that the lightweight tier models ERNIE Speed and ERNIE Lite were free for API users through Qianfan, marking the company's most aggressive move in the Chinese model-pricing war.[5] Baidu followed in early 2025 with an announcement that consumer ERNIE Bot would become free for individual users by April 1, 2025, a date later brought forward to coincide with the ERNIE 4.5 launch.[14]
On March 16, 2025, exactly two years after the original Wenxin Yiyan press event, Baidu announced two models simultaneously: ERNIE 4.5, the new flagship multimodal foundation model, and ERNIE X1, the company's first deep-thinking reasoning model.[7] ERNIE 4.5 was positioned as a native multimodal model trained jointly on text, images, audio, and video; ERNIE X1 was framed as Baidu's response to DeepSeek-R1, with Baidu claiming X1 "delivers performance on par with DeepSeek-R1 at half the price".[7][15] At launch ERNIE X1 was priced at USD 0.28 per million input tokens and USD 1.10 per million output tokens, while ERNIE 4.5 was priced at USD 0.55 per million input tokens and USD 2.20 per million output tokens.[15] At the same event Baidu announced that ERNIE Bot would become free for individual users ahead of its previously scheduled April 1 date.[7]
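As a rough illustration of the launch prices quoted above, the following Python sketch computes the cost of a single request. The token counts in the example are hypothetical, and the names `PRICES_USD_PER_MILLION` and `request_cost` are introduced here for illustration only; they are not part of any Baidu API.

```python
# Launch list prices quoted above, in USD per million tokens: (input, output).
PRICES_USD_PER_MILLION = {
    "ERNIE X1": (0.28, 1.10),
    "ERNIE 4.5": (0.55, 2.20),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the quoted per-million-token prices."""
    input_price, output_price = PRICES_USD_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2,000 prompt tokens and 500 generated tokens.
for name in PRICES_USD_PER_MILLION:
    print(f"{name}: ${request_cost(name, 2_000, 500):.6f}")
```

At these list prices, a request with the same token counts costs roughly half as much on ERNIE X1 as on ERNIE 4.5.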
At its Create 2025 developer conference in April 2025, Baidu announced ERNIE 4.5 Turbo and ERNIE X1 Turbo.[16] Baidu disclosed that ERNIE 4.5 Turbo was priced at 20 percent of the standard ERNIE 4.5 tier, and ERNIE X1 Turbo at half the price of ERNIE X1, while adding enhanced reasoning and multimodal capabilities.[16] On June 30, 2025 Baidu released the ERNIE 4.5 model family as open-source under the Apache 2.0 licence through Hugging Face, GitHub and the PaddlePaddle ecosystem.[4] The release comprised 10 models, including mixture-of-experts (MoE) variants with up to 424 billion total parameters and roughly 47 billion active parameters per token, alongside a 3-billion-active MoE and a 0.3-billion dense model.[4] Baidu's release blog stated that "all models are publicly accessible under Apache 2.0 to support future research and development in the field" and that the company also open-sourced "the development toolkits for ERNIE 4.5, featuring industrial-grade capabilities, resource-efficient training and inference workflows, and multi-hardware compatibility".[22] Baidu reported that the flagship ERNIE-4.5-300B-A47B-Base surpassed DeepSeek V3 (DeepSeek-V3-671B-A37B-Base) on 22 of 28 benchmarks.[22]
The decision to open-source ERNIE 4.5 was a notable strategic reversal for Robin Li, who had historically favoured keeping Baidu's flagship models closed. Announcing the plan in mid-February 2025, Li said that open source would "spread the technology much faster".[15]
At the WAVE SUMMIT 2025 developer conference on September 9, 2025, Baidu unveiled ERNIE X1.1, an upgraded reasoning model built on the ERNIE 4.5 multimodal foundation model and trained with what Baidu describes as "an iterative hybrid reinforcement learning framework that blends mixed reinforcement learning and iterative self-distillation".[20] Compared with its predecessor, Baidu reported that ERNIE X1.1's factuality was up 34.8 percent, instruction following up 12.5 percent, and agentic capabilities up 9.6 percent.[20] The company claimed that ERNIE X1.1's overall performance surpassed DeepSeek R1-0528 and was "on par with other top-tier models such as GPT-5 and Gemini 2.5 Pro".[20] The model was made available through the ERNIE Bot website, the Wenxiaoyan app, and the Qianfan platform for enterprise clients and developers.[20]
At Baidu World 2025 on November 13, 2025 Baidu previewed ERNIE 5.0, described as a natively unified omni-modal model with up to 2.4 trillion parameters capable of jointly understanding and generating text, images, audio and video.[19] Baidu stated that ERNIE 5.0 "jointly models text, images, audio, and video" from the ground up rather than bolting separate modality encoders onto a text model, and emphasised gains in multimodal understanding, instruction following, creative writing, factual reasoning, agentic planning, and tool use.[19] The preview was released through ERNIE Bot for consumers and the Qianfan MaaS platform for enterprise users.[19] The formal version of ERNIE 5.0 followed on January 22, 2026, with Baidu reporting that its language and multimodal understanding capabilities were comparable to top international models including Gemini 2.5 Pro and GPT-5-High.[21] Consistent with the company's recent shift away from publishing architecture papers for its flagship systems, public documentation for ERNIE 5.0 has focused on capability demonstrations rather than a detailed technical report.
The original 2019 ERNIE paper kept the Transformer encoder architecture of BERT but replaced the random subword masking objective with a three-stage knowledge-integration masking schedule.[1] In the first stage, basic word-level masking trains the model the way BERT does. In the second stage, phrase-level masking masks contiguous spans identified as phrases by chunking tools and lexicons. In the third stage, entity-level masking masks spans identified by a named entity recognizer.[1] The intent is that to reconstruct a masked entity such as a country name or scientist's name, the model must learn about that entity's relationships from context rather than from surface co-occurrence.[1]
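The difference between the three stages comes down to which spans are hidden as a unit. The Python sketch below is a minimal illustration under stated assumptions, not Baidu's implementation: the span lists are hand-written stand-ins for the output of a chunker and a named entity recognizer, the 15 percent masking rate is borrowed from BERT convention, and the function name `mask_spans` is hypothetical.

```python
import random

MASK = "[MASK]"


def mask_spans(tokens, spans, mask_rate=0.15, rng=None):
    """Mask whole spans until roughly mask_rate of the tokens are hidden.

    `spans` holds half-open (start, end) index pairs. Each span is masked
    as a unit, so the budget may be overshot by at most one span.
    Returns (masked_tokens, labels); labels[i] is the original token at
    masked positions and None elsewhere.
    """
    rng = rng or random.Random(0)
    budget = max(1, round(mask_rate * len(tokens)))
    masked = list(tokens)
    labels = [None] * len(tokens)
    order = list(spans)
    rng.shuffle(order)
    for start, end in order:
        if budget <= 0:
            break
        if any(labels[i] is not None for i in range(start, end)):
            continue  # skip spans that overlap an already masked span
        for i in range(start, end):
            labels[i] = tokens[i]
            masked[i] = MASK
        budget -= end - start
    return masked, labels


tokens = ["Harry", "Potter", "is", "a", "series", "of", "fantasy",
          "novels", "written", "by", "J.", "K.", "Rowling"]

# Stage 1: every token is its own span (BERT-style masking).
word_spans = [(i, i + 1) for i in range(len(tokens))]
# Stage 2: hand-written phrase span standing in for chunker output.
phrase_spans = [(3, 6)]
# Stage 3: hand-written entity spans standing in for NER output.
entity_spans = [(0, 2), (10, 13)]

for stage, spans in [("word", word_spans), ("phrase", phrase_spans),
                     ("entity", entity_spans)]:
    masked, _ = mask_spans(tokens, spans)
    print(stage, " ".join(masked))
```

With entity spans, either "Harry Potter" or "J. K. Rowling" disappears as a whole, so the remaining tokens of the name can no longer give the answer away.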
ERNIE 2.0 retained the encoder architecture but introduced a sequence-of-tasks design.[9] Each new pretraining task is added to a continually growing pool, and at every step the model is trained on a mix of all current tasks with their losses summed.[9] Tasks are deliberately heterogeneous: knowledge masking, capitalization prediction, and token-document relation prediction (word-aware); sentence reordering and sentence distance (structure-aware); and discourse relation and information-retrieval relevance (semantic-aware).[9] Training on the whole pool at every stage is intended to avoid the catastrophic forgetting that arises when tasks are learned strictly one after another.
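The growing-pool idea can be sketched in a few lines of Python. This is a toy illustration only: the constant loss functions stand in for real pretraining objectives, one task is introduced per stage as a simplifying assumption, and the name `continual_multitask` is hypothetical. A real trainer would backpropagate the summed loss rather than merely record it.

```python
def continual_multitask(task_losses, steps_per_stage):
    """Introduce tasks one at a time and sum the losses of all tasks seen so far.

    `task_losses` maps task name -> callable(step) -> float, in order of
    introduction. Returns one (active_task_names, summed_loss) pair per step.
    """
    pool, history = [], []
    for name in task_losses:  # dicts preserve insertion order
        pool.append(name)     # a new stage starts when a task joins the pool
        for step in range(steps_per_stage):
            total = sum(task_losses[t](step) for t in pool)
            history.append((tuple(pool), total))
    return history


# Toy constant losses, chosen only to make the bookkeeping easy to follow.
toy_losses = {
    "knowledge_masking": lambda step: 1.0,
    "sentence_reordering": lambda step: 0.5,
    "ir_relevance": lambda step: 0.25,
}

history = continual_multitask(toy_losses, steps_per_stage=2)
for active, loss in history:
    print(len(active), "task(s) active, summed loss", loss)
```

Because earlier tasks stay in the pool, their losses keep contributing to every later update, which is the property that distinguishes this setup from training on each task in isolation.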
ERNIE 3.0 added explicit support for natural language generation by stacking a top "task-specific" set of Transformer layers on a shared "universal representation" backbone.[3] The universal backbone is bidirectional; the task-specific layers are unidirectional for generation and bidirectional for understanding.[3] During pretraining the model is trained jointly on autoregressive next-token prediction and autoencoding masked-token prediction, and structured knowledge from a knowledge graph is integrated through "universal knowledge-text prediction", a task that masks tokens in triples drawn from Baidu's knowledge graph and asks the model to predict them given the surrounding free text, or vice versa.[3] This is the core mechanism through which ERNIE distinguishes itself from purely text-pretrained models.
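In Transformer models generally, the bidirectional versus unidirectional distinction is expressed through the attention mask. The Python sketch below shows the two mask shapes under that standard convention; it is an assumption that ERNIE 3.0 realises the distinction this way, and the function name `attention_mask` is hypothetical.

```python
def attention_mask(seq_len, causal):
    """Return a seq_len x seq_len matrix of booleans.

    mask[i][j] is True when position i may attend to position j.
    causal=False gives a bidirectional (understanding-style) mask;
    causal=True gives a unidirectional (generation-style) mask in which
    each position sees only itself and earlier positions.
    """
    return [[(j <= i) if causal else True for j in range(seq_len)]
            for i in range(seq_len)]


def show(mask):
    """Render a mask with 1 for allowed attention and 0 for blocked."""
    return "\n".join(" ".join("1" if cell else "0" for cell in row) for row in mask)


print("bidirectional:")
print(show(attention_mask(4, causal=False)))
print("unidirectional:")
print(show(attention_mask(4, causal=True)))
```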
ERNIE 3.0 Titan grew the network to 260 billion parameters, trained on roughly 4 terabytes of Chinese text covering 11 source types.[10] Two additional losses were used: a self-supervised adversarial loss in which the model is trained to discriminate between text it produced and text drawn from the training distribution, intended to reduce repetition and hallucination; and a controllable language modelling loss that conditions generation on attribute tokens such as genre, sentiment, length, or topic.[10] The paper also proposed an online distillation pipeline that trains many smaller student models from the Titan teacher in parallel, sharing forward passes to reduce compute cost.[10]
Baidu extended the family to vision and language with ERNIE-ViL, a vision-language pretraining model, and ERNIE-ViLG, an early text-to-image model.[17] ERNIE-ViLG 2.0, posted to arXiv on October 27, 2022 and accepted to CVPR 2023, is a 24-billion-parameter diffusion model for Chinese text-to-image generation.[17] It departs from a single Stable Diffusion-style UNet by using a mixture-of-denoising-experts architecture in which different expert networks are applied at different stages of the denoising schedule, and by injecting fine-grained scene knowledge into the cross-attention pathway.[17] On MS-COCO it reported a state-of-the-art zero-shot FID-30k of 6.75 at the time of publication.[17]
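The idea of applying different experts at different stages of the denoising schedule can be illustrated with a simple routing rule. The Python sketch below assumes equal-sized contiguous blocks of timesteps per expert; the block layout, the step and expert counts in the example, and the name `expert_for_step` are illustrative assumptions rather than details taken from the paper.

```python
def expert_for_step(t, num_steps, num_experts):
    """Map denoising step t in [0, num_steps) to an expert index.

    The schedule is split into num_experts contiguous blocks of (nearly)
    equal size, so neighbouring timesteps share an expert.
    """
    if not 0 <= t < num_steps:
        raise ValueError("t must satisfy 0 <= t < num_steps")
    return t * num_experts // num_steps


# Illustrative configuration: 1,000 denoising steps shared by 10 experts.
for t in (0, 99, 100, 550, 999):
    print(f"step {t:4d} -> expert {expert_for_step(t, 1000, 10)}")
```

Only one expert runs at any given step, so the total parameter count can grow with the number of experts while the per-step compute stays close to that of a single denoiser.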
The ERNIE 4.5 family released on June 30, 2025 mixes dense and MoE models.[4] The largest configuration uses 424 billion total parameters with 47 billion active per token; a second MoE size uses 3 billion active parameters; and a 0.3-billion dense model targets edge inference.[4] The MoE design uses a heterogeneous modality structure that shares some parameters across modalities while reserving dedicated parameters for each individual modality, which Baidu says improves multimodal understanding without degrading text performance.[4][22] Baidu described the line as natively multimodal, accepting interleaved text, image, audio, and video inputs, and shipped them with ERNIEKit, a PaddlePaddle-based training toolkit covering pre-training, supervised fine-tuning, LoRA, DPO, and quantization-aware training.[4]
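The gap between total and active parameters follows from sparse routing: each token is sent to only a few experts. The Python sketch below shows generic top-k gating plus the active-parameter arithmetic for the figures quoted above. Top-k softmax gating is a common MoE convention and is assumed here; the source does not describe ERNIE 4.5's router, and the names `top_k_route` and `active_fraction` are hypothetical.

```python
import math


def top_k_route(gate_logits, k):
    """Pick the k highest-scoring experts and softmax-normalise their weights.

    Returns a dict mapping expert index -> mixing weight; the weights sum to 1.
    """
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i],
                 reverse=True)[:k]
    peak = max(gate_logits[i] for i in top)  # subtract for numerical stability
    exps = {i: math.exp(gate_logits[i] - peak) for i in top}
    norm = sum(exps.values())
    return {i: exps[i] / norm for i in top}


def active_fraction(active_params, total_params):
    """Share of the model's parameters used for any single token."""
    return active_params / total_params


# Toy gate scores for four experts; only two are activated for this token.
routes = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)
print("selected experts:", sorted(routes))

# Figures quoted above: 47 billion active out of 424 billion total.
print(f"active share: {active_fraction(47, 424):.1%}")
```

For the largest configuration this works out to about 11 percent of the parameters being used per token.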
| Variant | First public reference | Notes |
|---|---|---|
| ERNIE 1.0 | arXiv 1904.09223, Apr 2019[1] | Knowledge-masking BERT for Chinese |
| ERNIE 2.0 | arXiv 1907.12412, Jul 2019[9] | Continual multi-task pretraining (AAAI 2020) |
| ERNIE 3.0 | arXiv 2107.02137, Jul 2021[3] | 10B parameter unified LM, SuperGLUE first place |
| ERNIE 3.0 Titan | arXiv 2112.12731, Dec 2021[10] | 260B parameters, Baidu and PCL |
| ERNIE-ViLG 2.0 | arXiv 2210.15257, Oct 2022[17] | 24B-parameter text-to-image diffusion |
| ERNIE Bot / Wenxin Yiyan | March 16, 2023[2] | Consumer chatbot launch |
| ERNIE 3.5 | June 2023[11] | Backend upgrade behind ERNIE Bot |
| ERNIE 4.0 | October 17, 2023[12] | Baidu World 2023, AI-native products |
| ERNIE Speed / ERNIE Lite | May 2024[5] | Free lightweight API tiers on Qianfan |
| ERNIE 4.0 Turbo | June 28, 2024[13] | Faster, cheaper variant |
| ERNIE 4.5 | March 16, 2025[7] | Native multimodal flagship |
| ERNIE X1 | March 16, 2025[7] | Deep-thinking reasoning model |
| ERNIE 4.5 Turbo / X1 Turbo | April 2025[16] | Turbo tier, 4.5 Turbo at 20% of 4.5 price |
| ERNIE 4.5 open-source | June 30, 2025[4] | Apache 2.0, 10 model variants, up to 424B params |
| ERNIE X1.1 | September 9, 2025[20] | Upgraded reasoning model on ERNIE 4.5 |
| ERNIE 5.0 | Nov 13, 2025 (preview); Jan 22, 2026 (release)[19][21] | 2.4-trillion-parameter omni-modal model |
The Qianfan (千帆) Foundation Model Platform is the cloud-side distribution point for these models, offering them through Baidu AI Cloud with API access, fine-tuning, dataset management, prompt engineering, and an "AppBuilder" layer for assembling enterprise applications.[5] The platform also hosts third-party open-source models alongside ERNIE.[5]
ERNIE Bot (Wenxin Yiyan, 文心一言) is Baidu's consumer-facing generative AI chatbot, launched in invited beta on March 16, 2023 and opened to the public in mainland China on August 31, 2023.[2][6] It is the application surface for the ERNIE foundation models: a user chatting with ERNIE Bot is interacting with whichever ERNIE generation is currently deployed behind it (ERNIE 3.5 in 2023, ERNIE 4.0 from late 2023, ERNIE 4.5 and the X-series reasoning models from 2025). Baidu reports that ERNIE Bot reached over 100 million users by December 2023, 200 million by April 2024, and 300 million by mid-2024.[11] In 2025 the consumer app was rebranded in Chinese as Wenxiaoyan (文小言).[20] For the specific model generations behind it, see ERNIE 4.5, ERNIE X1, and ERNIE 5.0.
Beyond the chatbot, ERNIE is woven into Baidu's existing consumer franchise: it powers generative summaries in Baidu Search, AI features inside Baidu Maps, AI-assisted document creation in Baidu Wenku, and Wenku-style retrieval over the Baidu Drive cloud-storage service.[12] Enterprise deployments run through Qianfan, which Baidu describes as the first one-stop enterprise foundation-model platform in China and which is closely integrated with the PaddlePaddle deep-learning framework.[5][8] On the device side, Baidu announced an integration in which ERNIE provided the AI features available to mainland China users of Samsung Galaxy S24 smartphones in early 2024.[11]
The Qianfan distribution mode is significant because it inverts the usual arrangement, in which the chatbot is the visible product and the model is hidden: on Qianfan the model itself is what is sold. Baidu monetises the underlying ERNIE family there in three concentric layers: a free tier (ERNIE Speed, ERNIE Lite) intended to acquire developers and small companies; a paid Turbo tier (ERNIE 4.0 Turbo, 4.5 Turbo, X1 Turbo) priced aggressively against DeepSeek and OpenAI to capture cost-sensitive enterprise traffic; and a managed-services AppBuilder layer for system integrators.[5][16] Public Baidu disclosures position ERNIE as the dominant traffic source on the Qianfan platform, with third-party open-weight models (such as Llama derivatives and locally fine-tuned Chinese checkpoints) available as alternatives but not as the default.[5]
In academic and developer benchmark coverage, ERNIE has been compared to BERT, XLNet, GPT-4, GPT-4.5, DeepSeek V3, and DeepSeek-R1 depending on the version. ERNIE 1.0 and 2.0 were the first major Chinese-origin pretrained transformers to outperform BERT on Chinese benchmarks; ERNIE 3.0 was the first non-English-origin model to top SuperGLUE; and ERNIE 4.5 in 2025 has been benchmarked against GPT-4.5 in Baidu's own marketing literature, with the open-source ERNIE-4.5-300B-A47B-Base reported to beat DeepSeek-V3 on 22 of 28 benchmarks.[1][9][3][7][22]
The ERNIE lineage sits within a small group of Chinese-domestic foundation-model families, each with a distinct strategy:
| Company | Model family | Approach |
|---|---|---|
| Baidu | ERNIE / Wenxin | Long-running knowledge-integration line, vertically integrated with search and PaddlePaddle, monetized via Qianfan API and consumer ERNIE Bot.[1][5] |
| Alibaba | Tongyi Qianwen / Qwen | Aggressive open-weights strategy from 2023 onward, with Qwen3 series widely deployed internationally.[18] |
| Tencent | Hunyuan | Closed-source models initially, expanding to open multimodal and video variants. |
| ByteDance | Doubao | Consumer-first chatbot strategy, leveraging ByteDance's content distribution. |
| DeepSeek (Hangzhou) | DeepSeek-R1, DeepSeek V3 | Specialist research startup with aggressive open-weights releases. |
Baidu's defining strategy across this period has been to position ERNIE as the company's flagship under the unified "Wenxin" (文心, "literary heart") family brand and to bind it tightly to its Qianfan cloud and PaddlePaddle framework.[5][8] Reuters, Bloomberg and Financial Times reporting through 2023 and 2024 characterised the company as the early Chinese mover (releasing a ChatGPT competitor before regulators authorised public access) but as one losing ground to newer entrants such as DeepSeek until the March 2025 ERNIE 4.5 and X1 announcements.[7][15] Baidu's response to DeepSeek-R1 in March 2025, including a reasoning model and aggressive price cuts, was widely framed by analysts as a strategic shift toward open-source release and commodity pricing.[4][7][15]
ERNIE is the model layer of a stack that Baidu commercialises in several ways. The lowest layer is the PaddlePaddle deep-learning framework, which Baidu has positioned as a domestic competitor to PyTorch and TensorFlow since 2016, and which provides the training and serving foundation for every ERNIE model.[8] On top sits Baidu AI Cloud, the public-cloud business through which ERNIE compute is sold. The Qianfan Foundation Model Platform layers model-management, fine-tuning, and AppBuilder workflows on top of the cloud.[5] At the application surface, Wenxin Yiyan / ERNIE Bot is the consumer-facing chatbot, while embedded ERNIE features appear in Baidu Search, Maps, Wenku, Drive, GBI, Infoflow, and the Apollo autonomous-driving stack.[12]
This vertical integration contrasts with the strategies of several peers. Alibaba has emphasised open weights for its Tongyi Qianwen (Qwen) models, distributing checkpoints internationally through Hugging Face. ByteDance's Doubao has been positioned mainly as a consumer chat assistant inside the ByteDance app ecosystem rather than as a developer cloud product. Tencent's Hunyuan family has emphasised multimodal video and 3D variants (such as HunyuanVideo and Hunyuan 3D) and ties to the WeChat ecosystem. DeepSeek is a research-led startup that releases weights aggressively to attract developer attention. Baidu, by contrast, runs the consumer chatbot, the cloud, the framework, and the model line in-house.[5][8][11]
This integration was reportedly a constraint on Baidu's early agility. Reuters and Bloomberg coverage in 2024 noted that consumer adoption of ERNIE Bot grew steadily but was outpaced in absolute downloads by Doubao, and that the January 2025 release of DeepSeek-R1 forced Baidu to accelerate its own pricing and open-source plans.[14][15] The March 2025 ERNIE 4.5 and ERNIE X1 dual announcement, with simultaneous price cuts and free consumer access, and the June 2025 open-sourcing of ERNIE 4.5, mark the strategic pivot toward a more aggressive distribution posture.[4][7]
ERNIE has been partly open source from the beginning and became substantially more so in 2025. The early research models (ERNIE 1.0, ERNIE 2.0, and ERNIE 3.0 toolkits) were released through the PaddlePaddle ERNIE repository, but Baidu's flagship consumer models (ERNIE 3.5, ERNIE 4.0) were kept closed and served only through ERNIE Bot and Qianfan.[8][12] The decisive change came on June 30, 2025, when Baidu open-sourced the entire ERNIE 4.5 family (10 models, from a 0.3-billion-parameter dense model up to a 424-billion-parameter MoE) under the permissive Apache 2.0 licence on Hugging Face, GitHub, and PaddlePaddle, together with the ERNIEKit and FastDeploy toolkits.[4][22] This reversed Robin Li's long-standing preference for closed models; Li had argued in mid-February 2025 that open source would "spread the technology much faster".[15] The very latest flagship systems (ERNIE X1.1, ERNIE 5.0) are again served primarily as hosted models through ERNIE Bot and Qianfan rather than released as open weights, so ERNIE's openness varies by tier: open-weight for the 4.5 generation, closed-but-cheap for the frontier 5.0 generation.[19][20][21]
Several persistent criticisms apply across the ERNIE history:
Imagine teaching a student to fill in blanks in sentences. Plain BERT hides single characters or word-pieces, so the student can often guess the blank from spelling alone. ERNIE instead hides whole names and phrases ("the capital of France" or "Albert Einstein"), so to fill the blank the student has to actually know facts, not just spelling patterns. That is what "knowledge integration" means. Baidu kept making this student bigger and smarter: first a small Chinese expert (2019), then a 260-billion-parameter giant (2021), then a chatbot you can talk to called ERNIE Bot (2023), and by 2025-2026 a model that can also see pictures, hear audio, and watch video, with the mid-tier 4.5 version given away for free so anyone can download and run it.