Tencent Hunyuan Hy3
Last reviewed
Jun 3, 2026
Sources
9 citations
Review status
Source-backed
Revision
v1 · 1,664 words
Tencent Hunyuan Hy3 (released as Hy3-preview) is an open-weights large language model published by Tencent on April 23, 2026. It is a Mixture-of-Experts (MoE) reasoning and agent model with roughly 295 billion total parameters, of which about 21 billion are active per token. It was the first major model to come out of a rebuild of Tencent's pre-training and reinforcement-learning systems that began in February 2026.[1][2][3] Tencent positions it as a cost-efficient open model aimed at practical work (complex reasoning, instruction following, coding, and multi-step agent tasks) rather than at topping leaderboards.[2][4]
The "Hy" in the name is Tencent's shortened overseas brand for Hunyuan. In 2026 the company simplified its international model brand from "Hunyuan" to "Tencent HY," so "Hy3" reads as the third major generation of the Hunyuan family.[5] The repositories and model cards still sit under the Tencent-Hunyuan organization, and Chinese-language coverage refers to the same release as the Hunyuan Hy3 preview.[1][3]
Hy3-preview is a text-in, text-out language model trained to reason at length before answering. Tencent describes it as a fused fast-and-slow-thinking model: a single model that answers quickly on easy prompts and spends more compute reasoning step by step on hard ones, instead of shipping as a separate "thinking" variant.[1][3] It ships in two public checkpoints: an instruction-tuned model (tencent/Hy3-preview) and a pre-trained base model (tencent/Hy3-preview-Base).[6]
The word "preview" is deliberate. Tencent released the model as an early, still-improving snapshot of the rebuilt Hunyuan line rather than a finished flagship. Yao Shunyu, the company's chief AI scientist, described it as the first step in rebuilding the Hunyuan model line, and said Tencent is "continuously expanding the scale of our pre-training and reinforcement learning efforts to push the boundaries of model intelligence."[2][4] Yao previously worked at OpenAI and is known in the research community for the ReAct framework that interleaves reasoning and actions for agents, which fits the model's agent focus.[4]
Hunyuan is Tencent's in-house family of foundation models, and the lineup grew quickly across 2024 and 2025 before the Hy3 rebuild.
| Model | Released | Type | Notes |
|---|---|---|---|
| Hunyuan Large | Nov 2024 | MoE text | Early open MoE release, around 389B total / 52B active |
| Hunyuan 3D | Nov 2024 | 3D generation | Open-sourced 3D asset model |
| HunyuanVideo | Dec 2024 | Text to video | Open video generation model |
| Hunyuan-T1 | Mar 2025 | Reasoning | Tencent's first deep-thinking model |
| Hunyuan-A13B | Jun 2025 | MoE reasoning | 80B total / 13B active, fine-grained MoE |
| Hunyuan 2.0 (Hy2) | Dec 2025 | MoE | Prior generation before the rebuild |
| Hunyuan Hy3 preview | Apr 2026 | MoE reasoning/agent | 295B total / 21B active, first post-rebuild model |
Hy3 is the successor to the Hunyuan 2.0 generation (referred to as Hy2 in some coverage). Tencent frames it as a clean break: starting in February 2026 the team rebuilt its pre-training and reinforcement-learning infrastructure with a stated focus on systematic capability, honest evaluation, and cost-effectiveness, then trained this first post-rebuild model in under three months.[2][3][4] Earlier Hunyuan models such as Hunyuan-A13B remain separate, smaller releases and are not the same model as Hy3.
Hy3-preview uses a sparse MoE transformer. Of the roughly 295 billion total parameters, only about 21 billion are active for any given token, which is what keeps inference cheap relative to a dense model of similar quality.[1][6] The published configuration lists the following.
| Property | Value |
|---|---|
| Total parameters | ~295B |
| Active parameters per token | ~21B |
| Multi-token prediction (MTP) layer | 3.8B parameters, 1 layer |
| Transformer layers | 80 (excluding the MTP layer) |
| Experts | 192 routed experts, top-8 activated, plus shared experts |
| Attention | Grouped-query attention, 64 query heads over 8 key/value heads, head dim 128 |
| Hidden size | 4,096 |
| Intermediate size | 13,312 |
| Vocabulary | 120,832 tokens |
| Context length | 256K tokens |
| Precision | BF16 |
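To put the configuration table in more concrete terms, the sketch below derives two quantities from it: the fraction of parameters active per token, and the key/value cache that one full-length sequence would need. It is a back-of-the-envelope estimate, not a measurement. It assumes that "256K" means 262,144 tokens, that the cache is stored in BF16 (2 bytes per value), and that each of the 80 layers keeps a standard grouped-query KV cache. The class and function names are illustrative and are not taken from Tencent's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hy3Config:
    """Values copied from the published table; field names are illustrative."""
    total_params: float = 295e9        # approximate
    active_params: float = 21e9        # approximate, per token
    num_layers: int = 80               # excluding the MTP layer
    num_kv_heads: int = 8
    head_dim: int = 128
    context_tokens: int = 256 * 1024   # assumption: "256K" = 262,144 tokens
    bytes_per_value: int = 2           # assumption: KV cache kept in BF16


def active_fraction(cfg: Hy3Config) -> float:
    """Share of the total parameters used for any single token."""
    return cfg.active_params / cfg.total_params


def kv_cache_bytes_per_token(cfg: Hy3Config) -> int:
    """Bytes of KV cache one token occupies across all layers."""
    # The leading 2 counts one key vector and one value vector per KV head.
    return 2 * cfg.num_layers * cfg.num_kv_heads * cfg.head_dim * cfg.bytes_per_value


def kv_cache_gib(cfg: Hy3Config, tokens: int) -> float:
    """KV cache size in GiB for a single sequence of `tokens` tokens."""
    return kv_cache_bytes_per_token(cfg) * tokens / 2**30


cfg = Hy3Config()
print(f"active fraction: {active_fraction(cfg):.1%}")
print(f"KV cache per token: {kv_cache_bytes_per_token(cfg) / 1024:.0f} KiB")
print(f"KV cache at full context: {kv_cache_gib(cfg, cfg.context_tokens):.0f} GiB")
```

Under these assumptions the model touches about 7% of its weights per token, and a single full-length sequence needs about 80 GiB of KV cache. Caching all 64 query heads instead of 8 key/value heads would multiply that figure by eight, which is the practical reason grouped-query attention matters at this context length.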
Two design choices stand out. First, the model carries a dedicated multi-token prediction (MTP) layer of about 3.8 billion parameters that predicts more than one token at a time. This enables speculative decoding for faster generation, and it is wired into serving stacks such as vLLM.[6] Second, reporting on the architecture describes a differentiated expert-size design, where experts are not all the same width and tokens of varying difficulty can be routed to experts with different capacities, in contrast to uniform-expert MoE layers.[7] The 256K context window puts it in the same range as other 2026 frontier open models for long-document and long-agent-trajectory work.[1][6]
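How much an MTP layer speeds up generation depends on how often its drafted token is accepted by the main model. As a rough illustration, and not a description of Tencent's or vLLM's implementation, the sketch below computes the expected number of tokens emitted per verification pass under the common simplifying assumption that each drafted token is accepted independently with probability p and drafting stops at the first rejection. The acceptance rates in the loop are hypothetical, since the sources do not report one.

```python
def expected_tokens_per_step(accept_prob: float, draft_tokens: int = 1) -> float:
    """Expected tokens emitted per verification pass of speculative decoding.

    Simplified model: each drafted token is accepted independently with
    probability `accept_prob`, and the first rejection discards the rest.
    The verification pass always contributes one token of its own, so the
    expectation is 1 + p + p^2 + ... + p^k for k drafted tokens.
    """
    if not 0.0 <= accept_prob <= 1.0:
        raise ValueError("accept_prob must be between 0 and 1")
    if draft_tokens < 0:
        raise ValueError("draft_tokens must be non-negative")
    return sum(accept_prob ** i for i in range(draft_tokens + 1))


# Hypothetical acceptance rates with a single drafted token per step.
for p in (0.5, 0.7, 0.9):
    print(f"p={p:.1f}: {expected_tokens_per_step(p):.2f} tokens per pass")
```

With one drafted token the expectation reduces to 1 + p, so the upper bound is two tokens per pass. The estimate ignores the extra compute of the MTP layer itself, so real-world speed-ups would be somewhat lower.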
The pitch for Hy3 is less about a single benchmark and more about being useful inside real agent loops. Tencent reports the model was deployed across its own products before the public release, including the Yuanbao assistant, the CodeBuddy and WorkBuddy coding agents, the ima note tool, Tencent Docs, and the game Peacekeeper Elite.[1][3]
The product numbers Tencent cites come from those deployments. The company says the rebuilt model cut overall reasoning efficiency cost by about 40 percent versus the previous generation, reduced time-to-first-token on CodeBuddy and WorkBuddy by 54 percent, and cut end-to-end response time by 47 percent, while reporting a 99.99 percent task success rate and support for agent workflows running up to 495 steps.[1][2] For Tencent Docs' AI slide-generation feature, it reports a 20 percent increase in generation success rate.[1] These are vendor-reported figures from internal products, so they are best read as Tencent's own measurements rather than independent results.
Tencent has been openly skeptical of public leaderboards, saying it moved toward self-built tests, human review, and product beta testing because public benchmarks can be gamed.[4] That said, the model cards and coverage do report standard numbers. The instruction-tuned model's published scores include the following.
| Benchmark | Hy3-preview (Instruct) |
|---|---|
| SWE-bench Verified (coding agent) | 74.4% |
| Terminal-Bench 2.0 (terminal agent) | 54.4% |
| BrowseComp (search agent) | 67.1% |
| GPQA Diamond (graduate science QA) | 87.2% |
| Humanity's Last Exam (HLE) | ~30% |
Sources: model card and reporting.[6][8][9]
On STEM and reasoning, Tencent highlights strong results on hard science and math tasks such as a FrontierScience-Olympiad set and IMOAnswerBench, plus what it calls excellent results on the Tsinghua Qiuzhen College math PhD qualifying exam (spring 2026) and the China High School Biology Olympiad (CHSBO 2025), though it does not publish single headline numbers for all of these.[1][6] The pre-trained base model's reported results include 95.4% on GSM8K (4-shot), 76.3% on MATH (4-shot), 87.4% on MMLU (5-shot), 65.8% on MMLU-Pro, and 96.0% on ARC-Challenge, benchmarked against the base models of Kimi-K2, DeepSeek-V3, and GLM-4.5.[6]
The headline coding result, 74.4% on SWE-bench Verified, is the figure Tencent leans on, and it is competitive for an open model of this size. It still trails the strongest proprietary frontier systems of the period: coverage placing Hy3 next to closed models reports roughly 80.8% for Claude Opus 4.6 and 78.6% for GPT-5.4 on the same test, so Hy3 is closing the gap rather than leading it.[8] Among open competitors, it is described as roughly on par with GLM-5.[8]
Tencent open-sourced Hy3-preview's weights on the same day it announced the model. The checkpoints are available on Hugging Face at tencent/Hy3-preview and tencent/Hy3-preview-Base, on ModelScope, and on GitCode, with code and documentation in the Tencent-Hunyuan/Hy3-preview GitHub repository.[1][6] The model is also offered through Tencent Cloud's TokenHub API and was listed on OpenRouter with a two-week free-access window at launch.[2]
The weights are released under the Tencent Hy Community License Agreement, a custom license rather than a standard permissive one such as Apache 2.0 or MIT.[6] As with other custom model licenses, the weights are downloadable and usable but subject to Tencent's own terms, so commercial users should read the agreement rather than assume an off-the-shelf open-source grant.
Cost is central to the Hy3 story. With only about 21 billion active parameters and the MTP-based speculative decoding, the model is cheap to serve for its quality tier, which is also reflected in Tencent Cloud's launch pricing: TokenHub listed Hy3-preview at 1.2 yuan per million input tokens, 0.4 yuan per million for cached input, and 4 yuan per million output tokens, in the range of roughly $0.18, $0.06, and $0.59 per million tokens at the time.[2][4] That places it among the aggressively priced open frontier models of 2026 alongside the DeepSeek and Qwen families.
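One way to read these prices is to cost out a single request. The sketch below uses the launch prices quoted above. The function name, the token counts in the example, and the assumption that cached and uncached input tokens are each billed at their own rate are illustrative and are not taken from Tencent Cloud's billing documentation.

```python
# Launch prices quoted above, in yuan per million tokens.
PRICE_INPUT = 1.2
PRICE_CACHED_INPUT = 0.4
PRICE_OUTPUT = 4.0


def request_cost_yuan(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request in yuan.

    Assumption: cached input tokens are billed at the cached rate and the
    remaining input tokens at the standard input rate.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_input_tokens
    total = (uncached * PRICE_INPUT
             + cached_input_tokens * PRICE_CACHED_INPUT
             + output_tokens * PRICE_OUTPUT)
    return total / 1_000_000


# Hypothetical agent step: 100K-token prompt, 80K of it cached, 2K tokens generated.
cost = request_cost_yuan(100_000, 2_000, cached_input_tokens=80_000)
print(f"{cost:.3f} yuan")
```

In this made-up example the step costs 0.064 yuan. The same request with no cache hits would cost 0.128 yuan, so prompt caching halves the bill when most of a long agent context is reused between steps.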
The broader context is Tencent's open-weights strategy. By shipping a capable agent-oriented model under a community license, integrating it across its own product surface, and competing largely on price and practical reliability, Tencent is following the same playbook that made other Chinese labs influential in the open-model ecosystem.[4][8] Hy3-preview's "preview" label and Yao Shunyu's framing of it as a first step suggest Tencent intends further releases on the rebuilt infrastructure, so the model is best understood as the opening move of a new Hunyuan generation rather than its final form.[2][4]