Hunyuan is Tencent's flagship brand for its foundation models and AI systems, spanning large language models, video generation, image synthesis, 3D asset creation, and multimodal reasoning. The name, drawn from Chinese cosmological tradition ("primordial chaos" or "first cause"), has become Tencent's umbrella for a broad portfolio of research and production models released between 2023 and the present.
The Hunyuan family ranges from dense billion-parameter instruct models to some of the largest open-source Mixture of Experts architectures ever released, alongside specialized systems for video, 3D, and machine translation. Several Hunyuan models have been released as open weights on Hugging Face and GitHub, making Tencent one of the more prolific open-source contributors among major Chinese technology companies.
Tencent began serious investment in large-scale neural language models years before the generative AI wave of 2023. The company launched a 100-billion-parameter NLP model in 2021 and introduced a trillion-parameter sparse model in 2022, at a time when most public attention was focused on OpenAI and Google. Tencent also established its AI Lab in 2016, which conducted both fundamental and applied research across vision, speech, and language.
The Hunyuan brand was officially introduced on September 7, 2023, when Tencent unveiled the model at its Global Digital Ecosystem Summit in Shenzhen. The initial version had over 100 billion parameters and was pre-trained on more than two trillion tokens, with explicit emphasis on Chinese language capability, logical reasoning, and enterprise integrations. At launch, it was immediately available via API on Tencent Cloud and had already been deployed inside more than 50 Tencent products including Tencent Meeting, Tencent Docs, and advertising platforms.
In early 2026, Tencent announced the dissolution of its standalone AI Lab (founded in 2016), folding its researchers into the Hunyuan large-model team under Chief AI Scientist Yao Shunyu. The reorganization reflected a broader shift toward centralizing AI development around the Hunyuan brand rather than maintaining a separate research organization alongside the product-facing team.
Over the two years following the initial Hunyuan announcement, Tencent released or announced a substantial number of distinct models under the Hunyuan brand. The following sections cover the main language-model lines; separate sections address the video, 3D, and world-generation systems.
Released on November 5, 2024, Hunyuan-Large (also referred to by its technical designation Hunyuan-MoE-A52B) was at the time the largest publicly released Transformer-based Mixture of Experts model, with 389 billion total parameters and 52 billion active parameters during inference.
The model supports context windows up to 256K tokens for the pre-trained checkpoint, with 128K supported in the instruct-tuned version. Pre-training used 7 trillion tokens, of which approximately 1.5 trillion were high-quality synthetic tokens generated to improve performance on mathematics, coding, and multilingual tasks.
The MoE architecture in Hunyuan-Large uses a two-tier expert structure: one shared expert that processes every token, plus 16 specialized experts that are dynamically routed per token. For each token, one specialized expert is activated alongside the shared expert, giving an effective two-expert activation scheme. This design aims to capture broad domain-general knowledge in the shared component while using the routing mechanism to inject domain-specific knowledge.
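The routing scheme can be sketched in a few lines. This is an illustrative toy (tiny dimensions, random linear "experts", a simple argmax router), not Tencent's implementation:

```python
# Toy sketch of a shared-expert + top-1-routed MoE layer, in the spirit of
# Hunyuan-Large's 1-shared + 1-of-16-specialized scheme. Dimensions, the
# linear "experts", and the router are all illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(0)
D, N_SPECIALIZED = 8, 16            # hidden size (toy), routed experts (per the paper)

shared_expert = rng.standard_normal((D, D)) * 0.1
specialized = [rng.standard_normal((D, D)) * 0.1 for _ in range(N_SPECIALIZED)]
router = rng.standard_normal((D, N_SPECIALIZED))   # maps a token to expert logits

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Send each token through the shared expert plus its top-1 routed expert."""
    out = np.empty_like(x)
    for i, tok in enumerate(x):                    # x: (tokens, D)
        idx = int(np.argmax(tok @ router))         # top-1 routing decision per token
        out[i] = tok @ shared_expert + tok @ specialized[idx]
    return out

tokens = rng.standard_normal((4, D))
y = moe_layer(tokens)
print(y.shape)
```

Every token thus touches exactly two of the seventeen expert networks, which is what keeps active parameters far below the total parameter count.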
Three technical innovations were highlighted in the accompanying paper.
The paper also reported benchmark results. On MMLU, the pre-trained model scored 88.4, outperforming Llama 3.1-405B (85.2) while using far fewer active parameters. On mathematical benchmarks, Hunyuan-Large achieved 69.8 on MATH and 92.8 on GSM8K. Chinese language evaluation on CMMLU also showed strong results, which Tencent attributed to the emphasis on Chinese-language synthetic training data. The instruct model scored 89.9 on MMLU.
Both pre-trained and instruction-tuned versions were released on Hugging Face at tencent/Tencent-Hunyuan-Large, with code on GitHub under Tencent-Hunyuan/Tencent-Hunyuan-Large. The release included the model weights, evaluation scripts, and the technical paper.
Released on February 27, 2025, Hunyuan TurboS is built on what Tencent describes as the world's first ultra-large-scale Hybrid-Transformer-Mamba Mixture of Experts architecture. The model combines 57 Mamba2 layers with 7 Attention layers and 64 Feed-Forward Network layers in an alternating "AMF" and "MF" block pattern. This hybrid design was motivated by Mamba's advantage in sequential decoding speed: linear-time state-space inference rather than the quadratic attention of standard Transformers, which translates to lower first-token latency and higher throughput.
Tencent reported that TurboS doubled output speed relative to comparable models and reduced first-token delay by 44%. The model is designed around an adaptive chain-of-thought mechanism, switching between brief thinking chains for simple queries and extended reasoning traces for more demanding problems, avoiding the latency penalty that comes with full chain-of-thought on every request.
On benchmarks, TurboS scored 89.5 on MMLU and ranked in the top 8 globally on the LMSYS Chatbot Arena with a score of 1356, comparable to Gemini 2.0 Flash and o4-mini. Across 23 automated benchmarks, the model averaged 77.9%. The input API price at launch was 0.8 yuan per million tokens (roughly $0.11), with output at 2 yuan per million tokens ($0.28).
TurboS serves as the pre-training foundation for Hunyuan-T1 and informs the broader TurboS model line.
Released in March 2025, Hunyuan-T1 is Tencent's explicitly reasoning-focused model, positioned to compete with DeepSeek R1 and OpenAI o1/o3. The model is built on the TurboS foundation, inheriting the Hybrid-Transformer-Mamba MoE architecture, and extends it with a post-training process that invested 96.7% of computing power in reinforcement learning rather than supervised fine-tuning.
The T1 team used a curriculum learning strategy during RL training, gradually increasing the difficulty and context length of training problems. They also applied data replay (periodically reintroducing earlier training examples) and periodic policy resetting to stabilize the RL process, reporting that these techniques improved training stability by over 50% compared to standard RL pipelines.
Benchmark performance positioned T1 competitively among reasoning models; reported scores appear in the benchmark summary table later in the article.
In terms of inference speed, T1 delivered 60-80 tokens per second, roughly twice the throughput of comparable reasoning models under the same deployment conditions. Pricing on Tencent Cloud at launch was 1 yuan per million input tokens and 4 yuan per million output tokens.
Tencent integrated T1 into its Yuanbao AI assistant shortly after release, which contributed to a spike in Yuanbao's daily active users. The model also received updates through mid-2025, adding improvements to multi-turn text comprehension, code generation at the project level, and text quality.
Released on June 27, 2025, Hunyuan-A13B is a more accessible open-source model designed for deployment on hardware that cannot run the 52B-active-parameter Hunyuan-Large. The model has 80 billion total parameters with 13 billion active during inference, making it runnable on systems with 4-5 consumer GPUs or a single high-end server GPU with quantization.
The MoE design uses a fine-grained structure with 1 shared expert and 64 non-shared experts, activating 8 non-shared experts per forward pass (compared to just 1 in Hunyuan-Large). This finer granularity is intended to allow more flexible specialization across diverse task types. The architecture also incorporates Grouped Query Attention for faster decoding.
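The published totals let one roughly back out the implied per-expert size. The figures below are estimates derived from the 80B-total / 13B-active numbers under simplifying assumptions (uniform expert size, all non-expert parameters lumped as "dense"), not Tencent's disclosed breakdown:

```python
# Rough active-parameter accounting for a fine-grained MoE like Hunyuan-A13B
# (1 shared + 64 routed experts, 8 routed experts active per token).
# Assumptions: all experts are the same size; "dense" covers attention,
# embeddings, and anything else every token uses.

TOTAL_B, ACTIVE_B = 80.0, 13.0     # published totals, in billions
N_EXPERTS_TOTAL = 1 + 64           # shared + routed
N_EXPERTS_ACTIVE = 1 + 8           # shared + top-8 routed

# Two equations: dense + 65*e = 80 and dense + 9*e = 13; solve for e, dense.
expert_b = (TOTAL_B - ACTIVE_B) / (N_EXPERTS_TOTAL - N_EXPERTS_ACTIVE)
dense_b = ACTIVE_B - N_EXPERTS_ACTIVE * expert_b

print(f"implied per-expert ~ {expert_b:.2f}B, dense ~ {dense_b:.2f}B")
```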
Like TurboS, Hunyuan-A13B supports dual-mode reasoning, choosing between a fast-thinking mode for routine queries and an extended thinking mode for problems that benefit from multi-step reasoning. The model supports a 256K-token context window natively.
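A minimal sketch of what per-request mode switching can look like. The heuristic classifier and mode labels here are invented for illustration; Tencent has not published the actual switching mechanism:

```python
# Hypothetical dual-mode dispatcher: route easy queries to a fast path and
# hard ones to an extended-reasoning path. Real deployments expose this as
# a per-request toggle or a learned difficulty classifier; the keyword
# heuristic below is a toy stand-in.

def needs_extended_reasoning(prompt: str) -> bool:
    """Toy heuristic standing in for a learned difficulty classifier."""
    hard_markers = ("prove", "step by step", "derive", "debug")
    return any(m in prompt.lower() for m in hard_markers)

def answer(prompt: str) -> str:
    mode = "slow-thinking" if needs_extended_reasoning(prompt) else "fast-thinking"
    return f"[{mode}] response to: {prompt}"

print(answer("What is the capital of France?"))
print(answer("Prove that sqrt(2) is irrational."))
```

The design point is that the latency cost of a full reasoning trace is paid only when the request plausibly benefits from it.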
Benchmark results for the Instruct version are included in the benchmark summary table later in the article.
Released variants included Hunyuan-A13B-Pretrain, Hunyuan-A13B-Instruct, Hunyuan-A13B-Instruct-FP8, and Hunyuan-A13B-Instruct-GPTQ-Int4. Docker images for TensorRT-LLM, vLLM, and SGLang deployment were provided. The model is available on Hugging Face at tencent/Hunyuan-A13B-Instruct and on ModelScope.
Released in December 2025, Hunyuan 2.0 (also referred to in some contexts as Tencent HY 2.0) is the next-generation foundation model in the series, available in both Think and Instruct variants. The model uses a Mixture of Experts architecture with 406 billion total parameters and 32 billion active parameters, somewhat smaller in active compute than the 52B of Hunyuan-Large.
Key advances in Hunyuan 2.0 included improvements to pretraining data quality, reinforcement learning strategies, and the context window, which extended to 256K tokens. On the SWE-bench Verified benchmark for real-world software engineering tasks, Hunyuan 2.0 improved from 6.0 (its predecessor) to 53.0, a notable gain that the team attributed to improvements in agentic reasoning. On the IMO-AnswerBench mathematics benchmark, Hunyuan 2.0 Think scored 73.4.
The model became available via Tencent Cloud API at launch. Pricing adjustments followed in early 2026 when the input price was raised from lower promotional rates to 4.505 yuan per thousand tokens for the HY 2.0 Instruct variant.
Released on December 3, 2024, HunyuanVideo is a text-to-video generation model with over 13 billion parameters, released as open source and described at release as the largest open-source video generation model available. The model was evaluated by human raters as performing comparably or better than leading closed-source video systems including Runway Gen-3 and Luma 1.6.
HunyuanVideo uses a Causal 3D Variational Autoencoder to jointly compress spatial and temporal information, encoding video into a compact latent space. Text prompts are encoded by a large language model encoder, and denoising is performed in the latent space using a Diffusion Transformer. Full details and a separate discussion of architecture, training, and downstream variants (including HunyuanVideo-1.5 and HunyuanVideo-I2V) appear in the HunyuanVideo article.
Hunyuan3D-1.0 was released alongside Hunyuan-Large in November 2024 as part of Tencent's November open-source announcement. It is a unified framework for text-to-3D and image-to-3D generation, with a lite version producing a 3D mesh from a single image in approximately 10 seconds and a standard version in 25 seconds.
Hunyuan3D-2.0, released in 2025, split the pipeline into two explicit components: Hunyuan3D-DiT, a diffusion-based shape generation model, and Hunyuan3D-Paint, a texture synthesis model. This decoupling allowed the shape and texture stages to be improved independently and combined flexibly. Hunyuan3D-2.1 added physically-based rendering (PBR) texture synthesis, which models how materials interact with light, producing textures that behave correctly under different lighting conditions for use in games and film production.
Hunyuan 3D 3.0 extended the system toward high-quality production-grade asset generation for objects, with capabilities for more detailed geometry and higher-resolution textures. Tencent also launched the Hunyuan 3D Engine as a globally available API service, offering enterprises integration for game development, e-commerce visualization, advertising, and 3D printing workflows.
HunyuanWorld-1.0, released in 2025, was described by Tencent as the first open-source 3D world generation model, as distinguished from single-object 3D generation. The system generates immersive, explorable, and interactive 3D environments from text prompts or images, integrating panoramic proxy generation, semantic layering, and hierarchical 3D reconstruction to produce 360-degree scene-scale output.
Subsequent releases extended the system: HunyuanWorld-1.1 (WorldMirror) added support for constructing 3D worlds from video or multi-view images, and FlashWorld reduced single-GPU 3DGS-based world generation to 5-10 seconds. HunyuanWorld-1.5 (WorldPlay) focused on real-time interactive world creation. HY-World-2.0 followed in April 2026. The models are hosted on GitHub under Tencent-Hunyuan/HunyuanWorld-1.0 and related repositories.
Across the text-focused Hunyuan models, several architectural patterns recur:
Transformer-Mamba hybrids: Starting with TurboS and continuing through T1 and A13B, Tencent has adopted Hybrid-Transformer-Mamba architectures as a standard approach. Mamba's state-space mechanism processes sequences in linear rather than quadratic time, which matters most during the autoregressive decoding phase where each token is generated sequentially. The hybrid approach retains full attention layers at intervals to preserve the global context modeling that pure Mamba architectures can miss.
Mixture of Experts: All major Hunyuan LLMs from Hunyuan-Large onward use MoE. The specific design has evolved across generations: Hunyuan-Large uses 1 shared + 1-of-16 specialized experts per token (2 active); Hunyuan-A13B uses 1 shared + 8-of-64 specialized experts per token (9 active); Hunyuan 2.0 moves to a configuration with 406 billion total and 32 billion active parameters. The shared-expert pattern (always active, never routed) is a distinctive choice Tencent has maintained across generations.
Grouped Query Attention: Used in A13B and later models to reduce the memory bandwidth required by the key-value cache during inference, enabling faster serving at given hardware budgets.
Long context: All recent Hunyuan models support 256K-token context windows, a capability common among the major Chinese frontier labs by mid-2025. Earlier instruct variants sometimes supported only 128K.
Dual-mode reasoning: A13B and later models explicitly offer a switch between fast-thinking (short chain-of-thought or none) and slow-thinking (extended reasoning trace) modes, allowing users to trade latency against quality on a per-request basis.
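As an example of the efficiency reasoning behind Grouped Query Attention above: KV-cache size scales with the number of key-value heads, so sharing each KV head across a group of query heads shrinks the cache proportionally. The head counts and dimensions below are illustrative, not any Hunyuan model's actual configuration:

```python
# Sketch of the KV-cache saving from Grouped Query Attention. Keys/values
# are cached per KV head, so grouping query heads onto fewer KV heads
# divides cache size by the group factor. All figures are illustrative.

def kv_cache_bytes(seq_len: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_val: int = 2) -> int:
    """KV-cache size for one sequence; 2x for keys and values, fp16 default."""
    return 2 * seq_len * n_layers * n_kv_heads * head_dim * bytes_per_val

# Full multi-head attention (32 KV heads) vs. 4-way GQA (8 KV heads),
# at a 256K context length.
mha = kv_cache_bytes(256_000, 32, n_kv_heads=32, head_dim=128)
gqa = kv_cache_bytes(256_000, 32, n_kv_heads=8, head_dim=128)
print(f"MHA: {mha / 2**30:.1f} GiB, GQA: {gqa / 2**30:.1f} GiB")
```

At long context lengths the KV cache, not the weights, often dominates serving memory, which is why GQA matters most for 256K-window models.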
Tencent has not published full details about the composition of Hunyuan training data, but several points have been disclosed across technical reports.
Hunyuan-Large was pre-trained on 7 trillion tokens total, with roughly 1.5 trillion of those being synthetic tokens generated specifically to improve mathematical reasoning, coding ability, and multilingual quality. The use of synthetic data at this scale (over 20% of the pre-training corpus) was described in the paper as significantly larger than in earlier published MoE models at the time.
All Hunyuan models emphasize Chinese-language data, and benchmark results on CMMLU and C-Eval consistently show strong performance in Chinese compared to multilingual English-centric models. Tencent claimed a record score of 86.918 on C-Eval, outperforming GPT-4 on that benchmark's Chinese university entrance exam simulation.
For post-training, the Hunyuan-T1 paper disclosed that 96.7% of compute in the post-training phase went to reinforcement learning. The RL curriculum used mathematics, coding, logical reasoning, and science problems with verifiable answers, allowing scalar reward signals from correctness checking rather than reward models.
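The verifiable-reward idea can be sketched simply: for problems with checkable answers, the reward is a scalar produced by a correctness check rather than a learned reward model. The answer-extraction helper below is a simplifying assumption, not Tencent's pipeline:

```python
# Sketch of RL with verifiable rewards, as described above: a binary scalar
# reward from an exact-match check against a reference answer. Real systems
# use more robust answer extraction and normalization; the last-line helper
# here is a deliberate simplification.

def extract_final_answer(completion: str) -> str:
    """Take the last line of the trace as the final answer (simplifying assumption)."""
    return completion.strip().splitlines()[-1].strip()

def verifiable_reward(completion: str, gold_answer: str) -> float:
    """1.0 if the extracted answer matches the reference exactly, else 0.0."""
    return 1.0 if extract_final_answer(completion) == gold_answer.strip() else 0.0

trace = "Let x = 3.\nThen 2x + 1 = 7.\n7"
print(verifiable_reward(trace, "7"))
```

Because the reward comes from a deterministic check, it avoids the reward-hacking failure modes of learned reward models on math and code domains.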
Tencent's approach to open-source release with Hunyuan is selective but consistent: the major text-language models from Hunyuan-Large onward have been released with weights and code, while the internal production models (including the cloud-only versions of T1 and earlier versions before open release) remained closed.
Hunyuan-Large was released on Hugging Face in November 2024, and Hunyuan-A13B was released in June 2025, with multiple quantized variants (FP8, GPTQ-Int4) to enable deployment at different hardware tiers. HunyuanVideo, Hunyuan3D-1.0, Hunyuan3D-2.0, Hunyuan3D-2.1, and HunyuanWorld-1.0 were all released with code and weights on GitHub and Hugging Face.
The tencent organization on Hugging Face hosts over 20 model repositories as of mid-2025, spanning text, video, image, 3D, OCR, and translation models. GitHub repositories are organized under the Tencent-Hunyuan organization and include Docker images, evaluation scripts, and training manuals.
The licensing for model weights varies by release. Hunyuan-Large weights carry a custom Hunyuan license. HunyuanVideo and the 3D models use licenses that generally permit research and commercial use with attribution but restrict redistribution of derivatives under certain conditions. Specific licensing terms are documented in the LICENSE files in each repository.
In the broader Chinese AI landscape, the open-source activity from Tencent is second in volume to Alibaba (whose Qwen series has an extensive open-source catalog) but ahead of Baidu, ByteDance, and most other Chinese labs in terms of the scale and diversity of models released publicly.
The following table summarizes benchmark scores across Hunyuan text models where scores have been publicly reported. Not all models have been evaluated on all benchmarks.
| Model | MMLU | MMLU-Pro | MATH / MATH-500 | GPQA-Diamond | LiveCodeBench | GSM8K |
|---|---|---|---|---|---|---|
| Hunyuan-Large (pretrain) | 88.4 | - | 69.8 | - | - | 92.8 |
| Hunyuan-Large (instruct) | 89.9 | - | - | - | - | - |
| Hunyuan TurboS | 89.5 | 87.11 | - | - | - | 94.39 |
| Hunyuan-T1 | - | 87.2 | 96.2 | 69.3 | 64.9 | - |
| Hunyuan-A13B-Instruct | 88.17 | - | 94.3 | - | - | 94.39 |
The three most-compared models to Hunyuan in the Chinese AI space are DeepSeek, Qwen, and MiniMax. Each has taken a somewhat different strategic position.
| Dimension | Hunyuan | DeepSeek | Qwen | MiniMax |
|---|---|---|---|---|
| Parent company | Tencent | DeepSeek (founded by Liang Wenfeng) | Alibaba | MiniMax |
| Architecture | MoE, Hybrid Transformer-Mamba | MoE (DeepSeekMoE) | Dense and MoE | Dense and MoE |
| Open weights | Yes (selected models) | Yes (most models) | Yes (most Qwen models) | Selected models |
| Reasoning model | Hunyuan-T1 | DeepSeek-R1 | QwQ series | MiniMax-M1 |
| Video model | HunyuanVideo | - | - | - |
| 3D model | Hunyuan3D, HunyuanWorld | - | - | - |
| Flagship context | 256K | 128K (V3) | 1M (Qwen-Long) | 1M |
| Chinese specialization | Strong | Strong | Strong | Strong |
DeepSeek generated more international attention in early 2025 with DeepSeek-R1 and V3, in part because of aggressive cost reduction claims and because the DeepSeek-R1 paper was published with detailed methodology. Hunyuan-T1 was positioned explicitly as a response, with Tencent marketing it as competitive with R1 on reasoning benchmarks while offering higher throughput.
Qwen from Alibaba has the largest open-source catalog among Chinese labs, with models at every scale from 0.5B to 235B. Hunyuan-A13B was benchmarked directly against Qwen2.5-72B and Qwen3-A22B at release; in Tencent's published comparisons, Hunyuan-A13B scored higher than Qwen2.5-72B on MMLU (88.17) while using substantially fewer active parameters.
MiniMax has a smaller public presence but has offered very long-context models (up to 1M tokens) and has focused on multimodal API products. Hunyuan's 256K context is competitive with most models outside of MiniMax's special long-context offering.
Hunyuan models are available through Tencent Cloud's API under the TokenHub pricing system, where input and output tokens are billed separately. Prices have shifted over time: competition among Chinese AI providers drove aggressive cuts in 2024, followed by selective increases as usage grew.
| Model | Input (yuan/M tokens) | Output (yuan/M tokens) | Approx. USD (input) |
|---|---|---|---|
| Hunyuan-T1 | 1.0 | 4.0 | ~$0.14 |
| Hunyuan TurboS | 0.8 | 2.0 | ~$0.11 |
| Hunyuan-A13B | 0.5 | 2.0 | ~$0.07 |
| Hy3 Preview | 1.2 | 4.0 | ~$0.17 |
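Since input and output tokens are billed separately, estimating a bill from the table above is simple arithmetic. A sketch using the table's prices; the yuan-to-USD rate is an assumption for the conversion:

```python
# Sketch of estimating a Tencent Cloud bill from per-million-token prices,
# with input and output billed separately. Prices taken from the launch
# pricing above; the exchange rate is an assumed constant.

PRICES = {  # model: (input yuan per M tokens, output yuan per M tokens)
    "hunyuan-t1":     (1.0, 4.0),
    "hunyuan-turbos": (0.8, 2.0),
    "hunyuan-a13b":   (0.5, 2.0),
}
YUAN_PER_USD = 7.2  # assumed rate for illustration

def cost_yuan(model: str, input_tokens: int, output_tokens: int) -> float:
    """Total cost in yuan for one request or batch."""
    p_in, p_out = PRICES[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1_000_000

c = cost_yuan("hunyuan-t1", 2_000_000, 500_000)
print(f"{c:.2f} yuan (~${c / YUAN_PER_USD:.2f})")
```

Note how reasoning models' long output traces make the output price (4x the input price for T1) the dominant cost term in practice.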
In March 2026, Tencent Cloud raised prices on Hunyuan 2.0 Instruct substantially, attributing the increase to surging token consumption driven by AI agents. This followed a period of aggressive price cuts across the Chinese AI cloud industry in 2024, when Tencent had cut prices on the Hunyuan-lite model to zero.
Tencent Cloud also offers Hunyuan models via its MaaS (Model-as-a-Service) platform with over 50 industry-specific solution templates across 20 sectors, including finance, e-commerce, gaming, healthcare, and transportation.
Yuanbao AI assistant: The most prominent consumer-facing deployment of Hunyuan models is Tencent's Yuanbao assistant, launched in May 2024. Yuanbao integrated deeply with WeChat, offering functions like summarizing articles from official accounts, AI search, book Q&A within WeChat Reading, and conversational document interaction. By Q2 2025, Yuanbao had reached 41.64 million monthly active users, making it the third-largest AI chatbot in China. By February 2026, it reached 114 million monthly active users.
WeChat ecosystem: Hunyuan powers AI features across WeChat's ecosystem including natural-language mini-program generation (users can create mini-programs by describing them in text), smart assistant features, and content summarization. The system handles over 10 billion agent tool calls per day across the WeChat platform.
Tencent Meeting and Docs: Tencent Meeting uses Hunyuan to generate meeting minutes automatically. Tencent Docs uses it for text creation and document assistance. These integrations were announced at the September 2023 launch and have been updated with newer model versions.
Enterprise cloud services: Through Tencent Cloud, businesses can call Hunyuan APIs for customer service, content generation, coding assistance, and data analysis. Fine-tuning is available for enterprises that want to adapt the model to proprietary domains.
Creative content: HunyuanVideo, Hunyuan3D, and HunyuanWorld serve game studios, film production companies, e-commerce platforms, and advertising agencies that need generated video or 3D assets. Tencent's own game development operations have used Hunyuan3D for asset creation.
Healthcare: Tencent developed an "AI Family Doctor" initiative using Hunyuan as a 24/7 health companion, offering personalized guidance based on user health records and symptom descriptions.
Learning assistants: QQ Browser's QBot uses Hunyuan for educational tutoring, supporting text, image, and voice input to help students with problem-solving and explanation across subjects.
Hunyuan-Large received positive attention from the ML research community when it was released in November 2024. The Deep Learning AI newsletter covered the release under the headline "Hunyuan-Large Outshines Open Competitors with High Benchmark Scores," noting its performance relative to Llama 3.1-405B on several evaluations. The size of the synthetic data component (1.5 trillion tokens) was noted as unusually large for the field at that time.
Hunyuan-T1 was covered in South China Morning Post and multiple international technology publications when it launched in March 2025, with the emphasis largely on its speed advantage over DeepSeek R1 and its aggressive pricing. Analytics Vidhya published an analysis under the headline "China's New Model Hunyuan-T1 Beats GPT 4.5."
On the LMSYS Chatbot Arena, Hunyuan-TurboS ranked as the second-highest Chinese model globally as of its evaluation period, behind DeepSeek and ahead of Qwen variants.
A SuperCLUE evaluation from August 2024 ranked Hunyuan first in 8 out of 11 core capability categories among Chinese models evaluated at that time.
Yuanbao's growth to 114 million monthly active users by early 2026 represented one of the fastest audience growth trajectories for any AI assistant in China, attributed largely to the combination of Hunyuan model quality improvements and the distribution leverage of WeChat's 1.4 billion active users.
Hunyuan models, like other large language models, are susceptible to hallucination: generating confident but incorrect statements, especially in domains not well represented in training data. Tencent has not published detailed analysis of hallucination rates across Hunyuan models.
The instruct versions of earlier Hunyuan models (before T1 and A13B) were criticized by some users for being more conservative in responses than competing models, particularly in Chinese-language contexts where content moderation requirements are more stringent. Models deployed inside China operate under Chinese regulations governing AI content, which affects how the models handle politically sensitive queries.
Hunyuan-T1 and TurboS are available only via API (not as open weights), which limits reproducibility for researchers who want to evaluate or fine-tune the reasoning models directly. Only the pretrain-and-instruct LLM lines (Hunyuan-Large, Hunyuan-A13B) and the multimodal generation models have been fully open-sourced with weights.
Inference costs, while competitive within China, are not necessarily accessible for high-volume international developers given that Tencent Cloud's API is primarily designed for Chinese-market users. Third-party hosting options (SiliconFlow offers Hunyuan-A13B-Instruct) expand access somewhat.
The model's benchmark-to-real-world gap is not independently established for most Hunyuan models. Most published evaluations come from Tencent's own technical reports, and third-party assessments on Arena-style evaluations or independent research benchmarks are limited compared to those available for Llama or Qwen models.