Qwen (also called Tongyi Qianwen, Chinese: 通义千问; pinyin: Tōngyì Qiānwèn; literally "to comprehend the meaning, [and to answer] a thousand kinds of questions") is a family of large language models (LLMs) and multimodal models developed by Alibaba Cloud, the cloud computing division of Chinese technology company Alibaba Group.[1] The name "Qwen" is derived from the Chinese brand Tongyi Qianwen and refers to the large language model family built by Alibaba Cloud's Qwen Team.[2] As of February 2025, Qwen models are among the most widely adopted open-source models globally, with more than 78,000 derivative models built on Hugging Face from the Qwen family since it was first open-sourced in 2023, and over 40 million downloads.[3] At that time, all of the top 10 open-source LLMs on Hugging Face's Open LLM Leaderboard were derived from open-source versions of Qwen.[4]
History
Early Development
Alibaba first launched a beta version of Qwen on April 11, 2023, during the Alibaba Cloud Summit under the name Tongyi Qianwen.[5] The initial architecture was based on the LLaMA framework developed by Meta AI.[1] Initially, it was integrated into various Alibaba business applications, including the workplace collaboration tool DingTalk and the voice assistant Tmall Genie.[6] The model received approval from the Chinese government and was publicly released in September 2023.[7]
Open Source Release
Alibaba Cloud began open-sourcing its models in August 2023. The first models released were Qwen-7B and its chat-fine-tuned variant, Qwen-7B-Chat.[8] This was followed by Qwen-1.8B in November 2023, aimed at low-latency and resource-constrained environments.[9] In December 2023, Alibaba released Qwen-72B, a 72-billion-parameter model that performed comparably to leading proprietary models such as GPT-3.5 on several benchmarks.[10]
Development Timeline
Date (UTC) | Generation / model | Key details
2023-04-11 | Tongyi Qianwen (beta) | Initial corporate announcement by Alibaba Cloud for a company-scale LLM initiative.[5]
2023-08-03 | Qwen-7B / Qwen-7B-Chat | First broadly distributed open weights via ModelScope and Hugging Face. Quantized INT4 chat variant followed on 2023-08-21.[11]
2025-08 | Qwen-Image | Image generation model with complex text rendering.[25]
2025-09-05 | Qwen3-Max | Flagship hosted model with competitive performance.[22]
2025-09-11 | Qwen3-Next | Optimized model with >10x throughput improvement.[22]
2025-09-23 | Qwen3Guard | Safety guardrail model for real-time moderation.[26]
Architecture and Technical Features
Core Architecture
Qwen models are based on the Transformer architecture, which is the standard for modern LLMs. Key architectural features include:
Attention Mechanism: Multi-head self-attention. The Qwen2 series introduced Grouped Query Attention (GQA) in larger models to improve inference speed and reduce memory usage.[14]
Tokenizer: Custom tokenizer with over 150,000 token vocabulary size, efficiently representing text from multiple languages and reducing token count for non-English text.[13]
Position Embeddings: Evolution from ALiBi (Attention with Linear Biases) to Rotary Position Embeddings (RoPE) for better long-context performance, with multimodal variants using M-RoPE (Multimodal Rotary Position Embedding).[18]
Architecture Types: Both dense and Mixture of Experts (MoE) variants, with MoE models activating only a subset of parameters per token for efficiency.[22]
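The memory savings behind GQA can be illustrated with a toy NumPy sketch. This is illustrative only: the head counts and dimensions are made up and do not reflect any actual Qwen configuration. The key idea is that many query heads share a smaller set of key/value heads, shrinking the KV cache proportionally.

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """Toy grouped-query attention: query heads share fewer K/V heads.

    q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).
    Each group of n_q_heads // n_kv_heads query heads attends to the
    same K/V head, so the KV cache is smaller by that factor.
    """
    n_q_heads, seq, d = q.shape
    group = n_q_heads // k.shape[0]
    # Broadcast each K/V head across its group of query heads.
    k = np.repeat(k, group, axis=0)           # (n_q_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                         # (n_q_heads, seq, d)

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 4, 16))   # 8 query heads
k = rng.normal(size=(2, 4, 16))   # only 2 K/V heads -> 4x smaller KV cache
v = rng.normal(size=(2, 4, 16))
out = grouped_query_attention(q, k, v)
print(out.shape)  # (8, 4, 16)
```

With 8 query heads sharing 2 K/V heads, the cached K and V tensors are a quarter of the size needed by standard multi-head attention at the same quality budget.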
Training Data Scale
Training data has scaled substantially across generations:
Qwen3: 36 trillion tokens spanning 119 languages and dialects[1]
The pre-training data includes high-quality Chinese-language data, multilingual text, code, mathematics, and multimodal data. Core models support over 29 languages, and translation variants cover up to 92.
Model Sizes and Variants
Qwen3 Dense Models
Qwen3-0.6B
Qwen3-1.7B
Qwen3-4B
Qwen3-8B
Qwen3-14B
Qwen3-32B
Qwen3 MoE Models
Qwen3-30B-A3B (30 billion total parameters, 3 billion activated)
Qwen3-235B-A22B (235 billion total parameters, 22 billion activated)
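The "A" suffix in these names denotes activated parameters: a gating network routes each token to a small subset of experts, so only a fraction of the weights participates in any one forward pass. A minimal sketch of this top-k routing idea, using made-up sizes and plain NumPy (real models use learned gates and far more experts):

```python
import numpy as np

def top2_moe_layer(x, gate_w, expert_ws):
    """Toy top-2 MoE feed-forward layer.

    Each token is routed to the 2 experts with the highest gate scores,
    so only a fraction of expert weights is used per token (analogous to
    Qwen3-30B-A3B activating 3B of 30B parameters).
    """
    logits = x @ gate_w                          # (tokens, n_experts)
    top2 = np.argsort(logits, axis=-1)[:, -2:]   # indices of best 2 experts
    sel = np.take_along_axis(logits, top2, axis=-1)
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)           # softmax over chosen experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                  # mix the two expert outputs
        for slot in range(2):
            e = top2[t, slot]
            out[t] += w[t, slot] * (x[t] @ expert_ws[e])
    return out

rng = np.random.default_rng(1)
n_experts, d = 8, 16
x = rng.normal(size=(4, d))                      # 4 tokens
gate_w = rng.normal(size=(d, n_experts))
expert_ws = rng.normal(size=(n_experts, d, d))
y = top2_moe_layer(x, gate_w, expert_ws)
print(y.shape)  # (4, 16)
```

Here each token touches 2 of 8 experts, so only a quarter of the expert weights is active per token; the named models apply the same principle at much larger scale (roughly 10% active for both listed MoE variants).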
Thinking and Non-Thinking Modes
Qwen3 introduces a hybrid approach to problem-solving with two distinct modes:[22]
Thinking Mode: The model takes time to reason step by step before delivering the final answer, ideal for complex problems requiring deeper thought
Non-Thinking Mode: Provides quick, near-instant responses suitable for simpler questions where speed is prioritized
Thinking Budget: Users can control how much "thinking" the model performs based on the task at hand
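In practice, thinking-mode output wraps the model's reasoning in `<think>...</think>` tags ahead of the final answer, so client code often separates the two. A small helper sketch, assuming that tag format (the sample response string is invented for illustration):

```python
import re

def split_thinking(text):
    """Split a Qwen3-style response into (reasoning, answer).

    In thinking mode the chain of thought appears inside <think>...</think>
    before the final answer; in non-thinking mode the tag block is absent
    and the reasoning comes back empty.
    """
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if m:
        reasoning = m.group(1).strip()
        answer = (text[:m.start()] + text[m.end():]).strip()
        return reasoning, answer
    return "", text.strip()

# Hypothetical thinking-mode response, for demonstration only.
reasoning, answer = split_thinking(
    "<think>2 + 2 is basic addition; the sum is 4.</think>The answer is 4."
)
print(answer)  # The answer is 4.
```

A non-thinking response passes through unchanged with an empty reasoning string, so the same helper works for both modes.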
Multimodal Capabilities
Qwen-VL Series
The Qwen-VL (Vision-Language) series represents Qwen's multimodal models that can process both text and images:
Benchmarks used to evaluate these and other Qwen models include CMMLU for Chinese language understanding, GSM8K for math word problems, and BFCL for tool and function-calling capabilities.[33]
Capabilities
Qwen models support a comprehensive array of tasks:
Multilingual Processing: Core models handle 29 languages, with Qwen-MT extending to 92 languages covering 95% of the global population[24]
Long-Context Understanding: Up to 128K tokens, enabling analysis of extensive documents, codebases, or conversations
Coding and Mathematics: Specialized models achieve state-of-the-art results with Qwen2.5-Coder scoring 88.2 on HumanEval and Qwen2.5-Math achieving 72.0 on MathVista
Multimodal Tasks: Image generation/editing, video grounding, audio transcription, and multi-modal reasoning
Safety and Moderation: Qwen3Guard provides real-time detection with categorized risk levels
Agentic and Reasoning: Models like QwQ-32B support advanced chain-of-thought reasoning, tool use, and multi-step tasks
Structured Data Analysis: Enhanced capabilities for processing tables, forms, and structured documents
Real-time Interaction: Support for low-latency voice and video chat through Omni models
Deployment and Accessibility
Open Source Availability
Most Qwen models are released under the Apache 2.0 license, making them available for both research and commercial use.[22] However, some larger models like the original Qwen-72B were released under a more restrictive "Tongyi Qianwen LICENSE AGREEMENT" that requires approval for commercial applications with over 100 million monthly active users.[34]
Applications
Enterprise AI Solutions: Document analysis, customer service automation, and business intelligence
Software Development: Code generation, debugging, and agentic coding workflows through Qwen3-Coder
Education: Personalized tutoring, especially in mathematics and programming
Healthcare: Medical document analysis and research assistance
E-commerce: Product descriptions, customer support, and recommendation systems
Creative Content: Story writing, article generation, and image creation
Research: Academic paper analysis and scientific computing
Fine-tuned versions, such as "Liberated Qwen" by Abacus AI, remove content restrictions for specialized use cases.[1] The models support real-time interactions and are used in agentic frameworks for autonomous tasks.
Languages Supported
Qwen models provide extensive multilingual support, with Qwen3 supporting 119 languages and dialects.[1] Core language support includes:
Chinese (Simplified and Traditional)
English
French
Spanish
Portuguese
German
Italian
Russian
Japanese
Korean
Vietnamese
Thai
Arabic
Turkish
Indonesian
Dutch
Polish
Swedish
Hindi
Hebrew
Finnish
Danish
Norwegian
Czech
Hungarian
Romanian
Greek
Bulgarian
Ukrainian
Limitations and Considerations
While Qwen models demonstrate strong capabilities, they have known limitations:[37]
Language mixing: Models may unexpectedly switch between languages during generation
Circular reasoning: Can get stuck in repetitive reasoning loops, particularly in complex multi-step problems
Safety concerns: Require stronger safety features and guardrails for production deployment
Performance gaps: While strong in math and coding, improvements needed in common sense reasoning
Context limitations: Although supporting long contexts, performance may degrade with extremely long inputs
Computational requirements: Larger models require significant GPU resources for deployment
Impact and Adoption
Since its initial open-source release in 2023, the Qwen family has achieved significant adoption milestones: