Qwen
Qwen (also called Tongyi Qianwen, Chinese: 通义千问; pinyin: Tōngyì Qiānwèn; literally "to comprehend the meaning, [and to answer] a thousand kinds of questions") is a family of large language models (LLMs) and multimodal models developed by Alibaba Cloud, the cloud computing division of the Chinese technology company Alibaba Group.[1] The name "Qwen" is derived from the Tongyi Qianwen brand and refers to the model family built by Alibaba Cloud's Qwen Team.[2] As of February 2025, Qwen had become one of the most widely adopted open-source model families globally: since its first open-source release in 2023, more than 78,000 derivative models had been built on the Qwen family on Hugging Face, with over 40 million downloads.[3] At that time, all of the top 10 open-source LLMs on Hugging Face's Open LLM Leaderboard were derived from open-source Qwen models.[4]
History
Early Development
Alibaba first launched a beta version of Qwen on April 11, 2023, during the Alibaba Cloud Summit, under the name Tongyi Qianwen.[5] The initial model architecture was based on LLaMA, developed by Meta AI.[1] Initially, the model was integrated into various Alibaba business applications, including the workplace collaboration tool DingTalk and the voice assistant Tmall Genie.[6] It received approval from the Chinese government and was publicly released in September 2023.[7]
Open Source Release
Alibaba Cloud began open-sourcing its models in August 2023 to foster a broader AI ecosystem. The first models released were Qwen-7B and its chat fine-tuned variant, Qwen-7B-Chat.[8] This was followed by Qwen-1.8B in November 2023, aimed at low-latency and resource-constrained environments.[9] In December 2023, Alibaba released the 72B-parameter Qwen-72B, which performed comparably to leading proprietary models such as GPT-3.5 on several benchmarks.[10]
Development Timeline
| Date (UTC) | Generation / model | Key details |
|---|---|---|
| 2023-04-11 | Tongyi Qianwen (beta) | Initial corporate announcement by Alibaba Cloud for a company-scale LLM initiative.[5] |
| 2023-08-03 | Qwen-7B / Qwen-7B-Chat | First broadly distributed open weights via ModelScope and Hugging Face. Quantized INT4 chat variant followed on 2023-08-21.[11] |
| 2023-09-13 | Qwen (public release) | Model approved by Chinese government for public release.[7] |
| 2023-11-30 | Qwen-72B | 72-billion parameter model released, competitive with GPT-3.5.[12] |
| 2024-02-05 | Qwen1.5 | Models ranging from 0.5B to 110B parameters with 32K context window.[13] |
| 2024-06-06 | Qwen2 | Enhanced capabilities with dense and sparse models, up to 72B parameters.[14] |
| 2024-07-15 | Qwen2 (tech report) | Technical report describes sizes from 0.5B to 72B and an MoE variant (57B-A14B) with efficiency optimizations and long-context support.[15] |
| 2024-09-18 | Qwen2.5-Math | Math-specialized instruction models (1.5B/7B/72B).[17] |
| 2024-09-19 | Qwen2.5 | Trained on 18 trillion tokens; multilingual support; 128K context.[16] |
| 2024-09 | Qwen2-VL | Vision-language series with dynamic resolution tokenization strategy.[18] |
| 2024-11-28 | QwQ-32B-Preview | Open-source reasoning model designed to compete with OpenAI's o1.[19] |
| 2025-01-28 | Qwen2.5-Max | Large-scale MoE model claiming superiority over DeepSeek V3.[20] |
| 2025-01-29 | Qwen2.5-VL | Enhanced video understanding with temporal reasoning.[21] |
| 2025-04-28 | Qwen3 | Thinking/non-thinking modes, enhanced reasoning, 36 trillion tokens.[22] |
| 2025-07-23 | Qwen3-Coder | Open-source coding-focused models emphasizing agentic coding workflows.[23] |
| 2025-07-24 | Qwen-MT | Translation model supporting 92 languages.[24] |
| 2025-08-04 | Qwen-Image | Image generation model with complex text rendering.[25] |
| 2025-09-05 | Qwen3-Max | Flagship hosted model with competitive performance.[22] |
| 2025-09-11 | Qwen3-Next | Optimized model with >10x throughput improvement.[22] |
| 2025-09-23 | Qwen3Guard | Safety guardrail model for real-time moderation.[26] |
Architecture and Technical Features
Core Architecture
Qwen models are based on the Transformer architecture, which is the standard for modern LLMs. Key architectural features include:
- Attention Mechanism: Standard self-attention; the Qwen2 series introduced Grouped Query Attention (GQA) in larger models to improve inference speed and reduce memory usage.[14]
- Tokenizer: Custom tokenizer with a vocabulary of over 150,000 tokens, efficiently representing text in multiple languages and reducing token counts for non-English text.[13]
- Position Embeddings: Rotary Position Embeddings (RoPE) for strong long-context performance, with multimodal variants using M-RoPE (Multimodal Rotary Position Embedding).[18]
- Architecture Types: Both dense and Mixture of Experts (MoE) variants, with MoE models activating only a subset of parameters per token for efficiency.[22]
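The GQA mechanism mentioned above lets several query heads share a single key/value head, shrinking the KV cache with little quality loss. A minimal NumPy sketch of the idea — all dimensions here are toy values chosen for illustration, not Qwen's actual configuration:

```python
import numpy as np

def gqa(x, wq, wk, wv, n_q_heads, n_kv_heads):
    """Single grouped-query attention pass over a (seq, d_model) input.
    Each group of query heads attends against one shared KV head."""
    seq, d_model = x.shape
    head_dim = d_model // n_q_heads
    group = n_q_heads // n_kv_heads        # query heads per shared KV head

    q = (x @ wq).reshape(seq, n_q_heads, head_dim)
    k = (x @ wk).reshape(seq, n_kv_heads, head_dim)
    v = (x @ wv).reshape(seq, n_kv_heads, head_dim)

    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group                    # map query head -> shared KV head
        scores = q[:, h] @ k[:, kv].T / np.sqrt(head_dim)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[:, h] = weights @ v[:, kv]
    return out.reshape(seq, d_model)

rng = np.random.default_rng(0)
d_model, n_q, n_kv = 64, 8, 2              # 8 query heads share 2 KV heads
x = rng.standard_normal((5, d_model))
wq = rng.standard_normal((d_model, d_model))
wk = rng.standard_normal((d_model, n_kv * (d_model // n_q)))
wv = rng.standard_normal((d_model, n_kv * (d_model // n_q)))
print(gqa(x, wq, wk, wv, n_q, n_kv).shape)  # (5, 64)
```

With 8 query heads but only 2 KV heads, the cached K and V tensors are a quarter of the multi-head-attention size, which is the memory saving GQA is used for.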
Training Data Scale
The evolution of training data demonstrates significant scaling:
- Qwen: Initial training on multilingual datasets
- Qwen2: 7 trillion tokens[15]
- Qwen2.5: 18 trillion tokens[27]
- Qwen2.5-Max: Over 20 trillion tokens[20]
- Qwen3: 36 trillion tokens in 119 languages and dialects[1]
The pre-training data includes high-quality Chinese-language text, multilingual text, code, mathematics, and multimodal data; core models support 29 languages, and translation variants extend coverage to 92.
Model Sizes and Variants
Qwen3 Dense Models
- Qwen3-0.6B, Qwen3-1.7B, Qwen3-4B, Qwen3-8B, Qwen3-14B, and Qwen3-32B[22]
Qwen3 MoE Models
- Qwen3-30B-A3B (30 billion total parameters, 3 billion activated)
- Qwen3-235B-A22B (235 billion total parameters, 22 billion activated)
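The "activated" counts above come from learned routing: a gating network scores all experts for each token and only the top-k experts actually run (e.g. 3B of 30B parameters). A toy NumPy sketch of top-k routing — the sizes and the softmax gate here are illustrative, not Qwen's actual router design:

```python
import numpy as np

def moe_forward(x, router_w, experts, k=2):
    """Route one token vector through only its top-k experts."""
    logits = x @ router_w                        # one score per expert
    topk = np.argsort(logits)[-k:]               # indices of selected experts
    gates = np.exp(logits[topk] - logits[topk].max())
    gates /= gates.sum()                         # softmax over selected experts
    # Only the chosen experts' weight matrices are touched, so the other
    # experts' parameters stay inactive for this token.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, topk))

rng = np.random.default_rng(1)
d, n_experts = 16, 8
router_w = rng.standard_normal((d, n_experts))
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
token = rng.standard_normal(d)
print(moe_forward(token, router_w, experts).shape)  # (16,)
```

Because only 2 of the 8 toy experts run per token, compute per token scales with the activated parameter count rather than the total, which is the efficiency argument behind models such as Qwen3-30B-A3B.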
Thinking and Non-Thinking Modes
Qwen3 introduces a hybrid approach to problem-solving with two distinct modes:[22]
- Thinking Mode: The model takes time to reason step by step before delivering the final answer, ideal for complex problems requiring deeper thought
- Non-Thinking Mode: Provides quick, near-instant responses suitable for simpler questions where speed is prioritized
- Thinking Budget: Users can control how much "thinking" the model performs based on the task at hand
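In the open-weight Qwen3 models, thinking-mode output wraps the reasoning trace in `<think>...</think>` tags ahead of the final answer, so client code typically separates the two. A small illustrative helper — the tag convention follows Qwen3's chat format, but the function itself is hypothetical:

```python
import re

def split_thinking(response: str):
    """Separate a leading Qwen3-style '<think>...</think>' block from the
    final answer. Returns (reasoning, answer); reasoning is '' when the
    model replied in non-thinking mode."""
    m = re.match(r"\s*<think>(.*?)</think>\s*(.*)", response, re.DOTALL)
    if m:
        return m.group(1).strip(), m.group(2).strip()
    return "", response.strip()

raw = "<think>2 + 2: add the units digits.</think>The answer is 4."
reasoning, answer = split_thinking(raw)
print(answer)  # The answer is 4.
```

A non-thinking-mode reply passes through unchanged, so the same handling code works for both modes.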
Multimodal Capabilities
Qwen-VL Series
The Qwen-VL (Vision-Language) series represents Qwen's multimodal models that can process both text and images:
Qwen2-VL
Released in August 2024, featuring:[28]
- Naive Dynamic Resolution mechanism for processing images of varying resolutions
- Multimodal Rotary Position Embedding (M-RoPE) for effective fusion of positional information
- Support for videos over 20 minutes
- Understanding of images at various resolutions and ratios
- Fine-grained resolution and document parsing
Qwen2.5-VL
Released in January 2025, featuring:[21]
- Enhanced OCR recognition capabilities
- Multi-scenario, multi-language, and multi-orientation text recognition
- Visual localization with bounding boxes
- Structured output generation for documents, forms, and tables
- Temporal reasoning for video understanding
- Dynamic resolution and multi-image analysis
Qwen3-VL
The latest vision-language model featuring:[29]
- Visual Agent capabilities for PC/mobile GUI operation
- Advanced spatial perception and 3D grounding
- DeepStack architecture for fine-grained visual detail capture
- Text-Timestamp Alignment for precise video event localization
Qwen2.5-Omni
End-to-end multimodal model with unique capabilities:[30]
- Thinker-Talker architecture for simultaneous text and speech generation
- Real-time voice and video chat support
- TMRoPE (Time-aligned Multimodal RoPE) for synchronizing video and audio timestamps
- Processing of text, images, videos, and audio inputs
- Bilingual support (English/Chinese) with low-latency interaction
QVQ (Visual Reasoning Model)
QVQ-72B-Preview is an experimental research model for enhanced visual reasoning:[3]
- 70.3% on the MMMU (Massive Multi-discipline Multimodal Understanding) benchmark
- Superior performance on MathVision and OlympiadBench
- Advanced multidisciplinary understanding and reasoning abilities
Qwen-Audio
Large language audio models supporting:[31]
- Cross-modal processing between text and audio
- Support for 30+ languages
- Speech recognition and sound interpretation
- Processing of audio up to 30 minutes
Specialized Models
| Variant | Release Date | Parameters | Focus | Key Features |
|---|---|---|---|---|
| Qwen2.5-Coder | September 2024 | 0.5B to 32B | Coding | Code generation, reasoning, fixing; 88.2 on HumanEval; state-of-the-art open-source code model[16] |
| Qwen2.5-Math | September 2024 | 1.5B to 72B | Mathematics | Mathematical reasoning with CoT, PoT, and TIR; 72.0 on MathVista; surpasses GPT-4o[17] |
| Qwen-MT | July 2025 | Based on Qwen3 | Translation | 92 languages covering 95% of global population; reinforcement learning for accuracy[24] |
| Qwen3Guard | September 2025 | Based on Qwen3 | Safety | Real-time moderation, risk classification; SOTA on multilingual safety benchmarks[26] |
| QwQ-32B-Preview | November 2024 | 32B | Reasoning | Enhanced multi-step reasoning; outperforms o1-preview on some benchmarks; Apache 2.0 license[32] |
| Qwen-Image | August 2025 | 20B | Image Generation | Complex text rendering, multi-line layouts, high visual quality[25] |
| Qwen-Image-Edit | August 2025 | Based on Qwen-Image | Image Editing | Precise text editing, semantic/appearance control[25] |
Performance and Benchmarks
Qwen3 Performance
Qwen3-235B-A22B achieves competitive results compared to other top-tier models:[22]
| Benchmark | Qwen3-235B-A22B | Qwen3-30B-A3B | QwQ-32B |
|---|---|---|---|
| Arena-Hard | 95.6 | 91.0 | 89.5 |
| AIME 2024 | 85.7 | 80.4 | Superior |
| LiveCodeBench | 70.7 | - | Strong |
| CodeForces Elo | - | 1974 | - |
Comparative Benchmarks
| Model | MMLU-Redux (EN) | HumanEval (EN) | MathVista (EN) | Notes |
|---|---|---|---|---|
| Qwen3-235B-A22B | 88.5 | N/A | N/A | Highest in series for general knowledge |
| Qwen2.5-72B-Instruct | 68.8 (MMLU-Pro) | N/A | N/A | Instruction-tuned |
| Qwen2.5-Coder-32B | N/A | 88.2 | N/A | Coding SOTA |
| Qwen2.5-Math-72B | N/A | N/A | 72.0 | Math-specialized |
| Qwen2.5-Max | Comparable to GPT-4o | Surpasses DeepSeek V3 | N/A | Overall competitive |
Additional benchmarks include CMMLU for Chinese language understanding, GSM8K for math word problems, and BFCL for tool and function-calling capabilities.[33]
Capabilities
Qwen models support a comprehensive array of tasks:
- Multilingual Processing: Core models handle 29 languages, with Qwen-MT extending to 92 languages covering 95% of the global population[24]
- Long-Context Understanding: Up to 128K tokens, enabling analysis of extensive documents, codebases, or conversations
- Coding and Mathematics: Specialized models achieve state-of-the-art results with Qwen2.5-Coder scoring 88.2 on HumanEval and Qwen2.5-Math achieving 72.0 on MathVista
- Multimodal Tasks: Image generation/editing, video grounding, audio transcription, and multi-modal reasoning
- Safety and Moderation: Qwen3Guard provides real-time detection with categorized risk levels
- Agentic and Reasoning: Models like QwQ-32B support advanced chain-of-thought reasoning, tool use, and multi-step tasks
- Structured Data Analysis: Enhanced capabilities for processing tables, forms, and structured documents
- Real-time Interaction: Support for low-latency voice and video chat through Omni models
Deployment and Accessibility
Open Source Availability
Most Qwen models are released under the Apache 2.0 license, making them available for both research and commercial use.[22] However, some larger models like the original Qwen-72B were released under a more restrictive "Tongyi Qianwen LICENSE AGREEMENT" that requires approval for commercial applications with over 100 million monthly active users.[34]
Models are distributed through:
- Hugging Face[2]
- ModelScope[2]
- GitHub repositories[35]
Commercial Services
Alibaba Cloud provides commercial APIs through:[36]
- Alibaba Cloud Model Studio
- DashScope API service
- Qwen Chat web interface (chat.qwen.ai)
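Alibaba Cloud's hosted service also offers an OpenAI-compatible chat-completions mode, so a request is ordinary chat JSON. A minimal sketch of building such a request offline — the endpoint URL and the `qwen-plus` model name are illustrative and should be checked against current Alibaba Cloud documentation:

```python
import json

# Illustrative endpoint for the OpenAI-compatible mode; verify against
# the current Alibaba Cloud Model Studio documentation before use.
BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"

def build_chat_request(model: str, user_message: str) -> str:
    """Serialize a chat-completions request body for a Qwen model."""
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
    }
    return json.dumps(body)

payload = build_chat_request("qwen-plus", "Summarize GQA in one sentence.")
print(json.loads(payload)["model"])  # qwen-plus
```

Because the wire format matches the OpenAI chat-completions schema, existing OpenAI client libraries can usually be pointed at the compatible endpoint with only a base-URL and API-key change.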
Deployment Frameworks
Qwen models support deployment through various frameworks:[35]
- vLLM for high-throughput inference
- SGLang for large-scale deployment
- TensorRT-LLM for NVIDIA GPU optimization
- Ollama for local deployment
- llama.cpp for CPU and GPU inference
- Integration with popular AI frameworks
Applications
Qwen powers diverse applications including:
- Enterprise AI Solutions: Document analysis, customer service automation, and business intelligence
- Software Development: Code generation, debugging, and agentic coding workflows through Qwen3-Coder
- Education: Personalized tutoring, especially in mathematics and programming
- Healthcare: Medical document analysis and research assistance
- E-commerce: Product descriptions, customer support, and recommendation systems
- Creative Content: Story writing, article generation, and image creation
- Research: Academic paper analysis and scientific computing
Fine-tuned versions, such as "Liberated Qwen" by Abacus AI, remove content restrictions for specialized use cases.[1] The models support real-time interactions and are used in agentic frameworks for autonomous tasks.
Languages Supported
Qwen models provide extensive multilingual support, with Qwen3 supporting 119 languages and dialects.[1] Core language support includes:
- Chinese (Simplified and Traditional)
- English
- French
- Spanish
- Portuguese
- German
- Italian
- Russian
- Japanese
- Korean
- Vietnamese
- Thai
- Arabic
- Turkish
- Indonesian
- Dutch
- Polish
- Swedish
- Hindi
- Hebrew
- Finnish
- Danish
- Norwegian
- Czech
- Hungarian
- Romanian
- Greek
- Bulgarian
- Ukrainian
Limitations and Considerations
While Qwen models demonstrate strong capabilities, they have known limitations:[37]
- Language mixing: Models may unexpectedly switch between languages during generation
- Circular reasoning: Can get stuck in repetitive reasoning loops, particularly in complex multi-step problems
- Safety concerns: Require stronger safety features and guardrails for production deployment
- Performance gaps: While strong in math and coding, improvements needed in common sense reasoning
- Context limitations: Although supporting long contexts, performance may degrade with extremely long inputs
- Computational requirements: Larger models require significant GPU resources for deployment
Impact and Adoption
Since its initial open-source release in 2023, the Qwen family has achieved significant adoption milestones:
- Over 40 million downloads across platforms[3]
- More than 78,000 derivative models developed on Hugging Face[3]
- Powers all top 10 open-source LLMs on Hugging Face's Open LLM Leaderboard as of February 2025[4]
- Integrated into numerous commercial applications across Alibaba's ecosystem
- Adopted by enterprises globally for various AI applications
The development reflects Alibaba's ambition to position Qwen as a foundational "operating system" for AI, akin to Android in mobile computing.[38]
See also
- Large language model
- Multimodal learning
- Transformer (machine learning model)
- Mixture of Experts
- Alibaba Cloud
- Alibaba Group
- DeepSeek
- LLaMA
- GPT-4
- Artificial intelligence in China
- Reinforcement learning
References
- ↑ 1.0 1.1 1.2 1.3 1.4 https://en.wikipedia.org/wiki/Qwen
- ↑ 2.0 2.1 2.2 https://huggingface.co/Qwen
- ↑ 3.0 3.1 3.2 3.3 https://www.alibabacloud.com/blog/alibaba-cloud-unveils-new-research-model-for-enhanced-visual-reasoning_601914
- ↑ 4.0 4.1 https://www.scmp.com/tech/big-tech/article/3298233/alibabas-qwen-powers-top-10-open-source-models-china-ai-know-how-goes-beyond-deepseek
- ↑ 5.0 5.1 https://www.cnbc.com/2023/04/11/alibaba-launches-tongyi-qianwen-ai-model-similar-to-gpt.html
- ↑ https://www.reuters.com/technology/alibaba-cloud-launches-its-chatgpt-rival-tongyi-qianwen-2023-04-11/
- ↑ 7.0 7.1 https://www.scmp.com/tech/big-tech/article/3234661/alibabas-ai-model-tongyi-qianwen-approved-public-release-china
- ↑ https://www.reuters.com/technology/alibaba-launches-open-source-ai-model-qwen-2023-08-03/
- ↑ https://huggingface.co/Qwen/Qwen-1_8B
- ↑ https://www.chinadaily.com.cn/a/202312/01/WS6569a8a7a31090682a5f5b5e.html
- ↑ https://github.com/QwenLM/Qwen
- ↑ https://venturebeat.com/ai/alibaba-clouds-new-qwen-72b-llm-tops-open-source-rankings-and-rivals-gpt-4/
- ↑ 13.0 13.1 https://qwenlm.github.io/blog/qwen1.5/
- ↑ 14.0 14.1 https://qwenlm.github.io/blog/qwen2/
- ↑ 15.0 15.1 https://arxiv.org/abs/2407.10671
- ↑ 16.0 16.1 https://qwenlm.github.io/blog/qwen2.5/
- ↑ 17.0 17.1 https://arxiv.org/html/2409.12122v1
- ↑ 18.0 18.1 https://arxiv.org/abs/2409.12191
- ↑ https://venturebeat.com/ai/alibaba-releases-qwen-with-questions-an-open-reasoning-model-that-beats-o1-preview/
- ↑ 20.0 20.1 https://qwenlm.github.io/blog/qwen2.5-max/
- ↑ 21.0 21.1 https://qwenlm.github.io/blog/qwen2.5-vl/
- ↑ 22.0 22.1 22.2 22.3 22.4 22.5 22.6 https://qwenlm.github.io/blog/qwen3/
- ↑ https://www.reuters.com/world/china/alibaba-launches-open-source-ai-coding-model-touted-its-most-advanced-date-2025-07-23/
- ↑ 24.0 24.1 24.2 https://qwenlm.github.io/blog/qwen-mt/
- ↑ 25.0 25.1 25.2 https://qwenlm.github.io/blog/qwen-image/
- ↑ 26.0 26.1 https://qwenlm.github.io/blog/qwen3guard/
- ↑ https://arxiv.org/abs/2412.15115
- ↑ https://arxiv.org/abs/2409.12191
- ↑ https://github.com/QwenLM/Qwen3-VL
- ↑ https://github.com/QwenLM/Qwen2.5-Omni
- ↑ https://www.alibabacloud.com/help/en/model-studio/what-is-qwen-llm
- ↑ https://www.alibabacloud.com/blog/alibaba-cloud-unveils-open-source-ai-reasoning-model-qwq-and-new-image-editing-tool_601813
- ↑ https://qwenlm.github.io/share/leaderboard/
- ↑ https://www.infoq.com/news/2023/12/alibaba-qwen-72b-llm/
- ↑ 35.0 35.1 https://github.com/QwenLM/Qwen3
- ↑ https://www.alibabacloud.com/help/en/model-studio/models
- ↑ https://www.datacamp.com/blog/qwq-32b-preview
- ↑ https://www.reddit.com/r/LocalLLaMA/comments/1nq182d/alibaba_just_unveiled_their_qwen_roadmap_the/