Qwen

Qwen (also called Tongyi Qianwen, Chinese: 通义千问; pinyin: Tōngyì Qiānwèn; literally "to comprehend the meaning, [and to answer] a thousand kinds of questions") is a family of large language models (LLMs) and multimodal models developed by Alibaba Cloud, the cloud computing division of Chinese technology company Alibaba Group.[1] The name "Qwen" is derived from the Tongyi Qianwen brand and refers to the model family built by Alibaba Cloud's Qwen Team.[2] As of February 2025, Qwen is among the most widely adopted open-source model families globally: since its first open-source release in 2023, more than 78,000 derivative models based on the Qwen family have been published on Hugging Face, with over 40 million downloads,[3] and all of the top ten open-source LLMs on Hugging Face's Open LLM Leaderboard were derived from open-source Qwen models.[4]

History

Early Development

Alibaba first launched a beta version of Qwen on April 11, 2023, during the Alibaba Cloud Summit under the name Tongyi Qianwen.[5] The initial architecture was based on Meta AI's LLaMA.[1] It was initially integrated into various Alibaba business applications, including the workplace collaboration tool DingTalk and the voice assistant Tmall Genie.[6] The model received approval from the Chinese government and was publicly released in September 2023.[7]

Open Source Release

Alibaba Cloud began open-sourcing Qwen models in August 2023, starting with Qwen-7B and its chat-fine-tuned variant, Qwen-7B-Chat.[8] These were followed by Qwen-1.8B in November 2023, aimed at low-latency and resource-constrained environments.[9] In December 2023, Alibaba released Qwen-72B, a 72-billion-parameter model that performed comparably to leading proprietary models such as GPT-3.5 on several benchmarks.[10]

Development Timeline

Date (UTC) Generation / model Key details
2023-04-11 Tongyi Qianwen (beta) Initial corporate announcement by Alibaba Cloud for a company-scale LLM initiative.[5]
2023-08-03 Qwen-7B / Qwen-7B-Chat First broadly distributed open weights via ModelScope and Hugging Face. Quantized INT4 chat variant followed on 2023-08-21.[11]
2023-09-13 Qwen (public release) Model approved by Chinese government for public release.[7]
2023-11-30 Qwen-72B 72-billion parameter model released, competitive with GPT-3.5.[12]
2024-02-05 Qwen1.5 Models ranging from 0.5B to 110B parameters with 32K context window.[13]
2024-06-06 Qwen2 Enhanced capabilities with dense and sparse models, up to 72B parameters.[14]
2024-07-15 Qwen2 (tech report) Technical report describes sizes from 0.5B to 72B and an MoE variant (57B-A14B) with efficiency optimizations and long-context support.[15]
2024-09-18 Qwen2.5-Math Math-specialized instruction models (1.5B/7B/72B).[17]
2024-09-19 Qwen2.5 Trained on 18 trillion tokens; multilingual support; 128K context.[16]
2024-09 Qwen2-VL Vision-language series with dynamic resolution tokenization strategy.[18]
2024-11-28 QwQ-32B-Preview Open-source reasoning model designed to compete with OpenAI's o1.[19]
2025-01-28 Qwen2.5-Max Large-scale MoE model claiming superiority over DeepSeek V3.[20]
2025-01-29 Qwen2.5-VL Enhanced video understanding with temporal reasoning.[21]
2025-04-28 Qwen3 Thinking/non-thinking modes, enhanced reasoning, 36 trillion tokens.[22]
2025-07-23 Qwen3-Coder Open-source coding-focused models emphasizing agentic coding workflows.[23]
2025-07-24 Qwen-MT Translation model supporting 92 languages.[24]
2025-08-04 Qwen-Image Image generation model with complex text rendering.[25]
2025-09-05 Qwen3-Max Flagship hosted model with competitive performance.[22]
2025-09-11 Qwen3-Next Optimized model with >10x throughput improvement.[22]
2025-09-23 Qwen3Guard Safety guardrail model for real-time moderation.[26]

Architecture and Technical Features

Core Architecture

Qwen models are based on the Transformer architecture, which is the standard for modern LLMs. Key architectural features include:

  • Attention Mechanism: Multi-head self-attention; the Qwen2 series adopted Grouped Query Attention (GQA) to improve inference speed and reduce key-value cache memory.[14]
  • Tokenizer: Custom tokenizer with over 150,000 token vocabulary size, efficiently representing text from multiple languages and reducing token count for non-English text.[13]
  • Position Embeddings: Rotary Position Embeddings (RoPE) encode relative positions and support long-context performance; multimodal variants extend this with M-RoPE (Multimodal Rotary Position Embedding).[18]
  • Architecture Types: Both dense and Mixture of Experts (MoE) variants, with MoE models activating only a subset of parameters per token for efficiency.[22]
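
The rotary scheme above can be illustrated concretely: RoPE rotates consecutive pairs of embedding dimensions by position-dependent angles, so the dot product between a rotated query and key depends only on their relative offset. A minimal NumPy sketch, illustrative only and not Qwen's actual implementation (dimension and base are arbitrary):

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, dim):
    each dimension pair (2i, 2i+1) at position p is rotated by the
    angle p * base**(-2i/dim)."""
    seq_len, dim = x.shape
    pos = np.arange(seq_len)[:, None]                   # (seq_len, 1)
    freqs = base ** (-2.0 * np.arange(dim // 2) / dim)  # (dim/2,)
    angles = pos * freqs                                # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Attention scores between rotated queries/keys depend only on the
# relative offset: the same (query, key) pair at positions (0, 2)
# and (3, 5) yields the same dot product.
rng = np.random.default_rng(0)
q_vec, k_vec = rng.normal(size=64), rng.normal(size=64)
q_rot = rope(np.tile(q_vec, (6, 1)))  # the same query at positions 0..5
k_rot = rope(np.tile(k_vec, (6, 1)))  # the same key at positions 0..5
print(np.isclose(q_rot[0] @ k_rot[2], q_rot[3] @ k_rot[5]))  # True
```

This relative-position property is what lets RoPE-based models handle contexts longer than those seen with absolute embeddings.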

Training Data Scale

Pre-training corpora have scaled sharply across generations: Qwen2.5 was trained on roughly 18 trillion tokens,[16] and Qwen3 on about 36 trillion.[22] The pre-training data includes high-quality Chinese text, multilingual text, code, mathematics, and multimodal data, supporting over 29 core languages and up to 92 languages in translation variants.

Model Sizes and Variants

Qwen3 Dense Models

Dense Qwen3 checkpoints are released at 0.6B, 1.7B, 4B, 8B, 14B, and 32B parameters.[22]

Qwen3 MoE Models

MoE variants include Qwen3-30B-A3B (30B total parameters, about 3B active per token) and the flagship Qwen3-235B-A22B (235B total, about 22B active).[22]

Thinking and Non-Thinking Modes

Qwen3 introduces a hybrid approach to problem-solving with two distinct modes:[22]

  • Thinking Mode: The model takes time to reason step by step before delivering the final answer, ideal for complex problems requiring deeper thought
  • Non-Thinking Mode: Provides quick, near-instant responses suitable for simpler questions where speed is prioritized
  • Thinking Budget: Users can control how much "thinking" the model performs based on the task at hand
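
The mode can also be toggled per request: Qwen3's chat interface documents soft-switch tags (`/think` and `/no_think`) appended to a user message. A minimal helper sketching that convention (the helper itself is illustrative and not part of any Qwen SDK):

```python
def qwen3_message(text: str, thinking: bool) -> dict:
    """Build a chat message that requests Qwen3's thinking or
    non-thinking mode via the documented soft-switch tags."""
    tag = "/think" if thinking else "/no_think"
    return {"role": "user", "content": f"{text} {tag}"}

msg = qwen3_message("How many primes are below 100?", thinking=True)
print(msg["content"])  # "How many primes are below 100? /think"
```

When served through frameworks such as Transformers, the same toggle is commonly exposed as an `enable_thinking` argument to the chat template; treat the exact parameter name as framework-dependent.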

Multimodal Capabilities

Qwen-VL Series

The Qwen-VL (Vision-Language) series represents Qwen's multimodal models that can process both text and images:

Qwen2-VL

Released in August 2024, featuring:[28]

  • Naive Dynamic Resolution mechanism for processing images of varying resolutions
  • Multimodal Rotary Position Embedding (M-RoPE) for effective fusion of positional information
  • Support for videos over 20 minutes
  • Understanding of images at various resolutions and ratios
  • Fine-grained resolution and document parsing

Qwen2.5-VL

Released in January 2025, featuring:[21]

  • Enhanced OCR recognition capabilities
  • Multi-scenario, multi-language, and multi-orientation text recognition
  • Visual localization with bounding boxes
  • Structured output generation for documents, forms, and tables
  • Temporal reasoning for video understanding
  • Dynamic resolution and multi-image analysis
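
These document-understanding features are typically invoked by pairing an image with a text instruction in an OpenAI-style multimodal message, the schema accepted by OpenAI-compatible Qwen-VL endpoints (the image URL below is a placeholder):

```python
def vision_message(text: str, image_url: str) -> list:
    """Compose an OpenAI-style multimodal message: one image part
    followed by one text part, for a Qwen-VL chat request."""
    return [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": image_url}},
            {"type": "text", "text": text},
        ],
    }]

messages = vision_message(
    "Extract every table in this document as JSON.",
    "https://example.com/invoice.png",  # placeholder image URL
)
print(messages[0]["content"][1]["text"])
```

The same structure extends to multi-image analysis by adding further image parts to the content list.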

Qwen3-VL

The latest vision-language model featuring:[29]

  • Visual Agent capabilities for PC/mobile GUI operation
  • Advanced spatial perception and 3D grounding
  • DeepStack architecture for fine-grained visual detail capture
  • Text-Timestamp Alignment for precise video event localization

Qwen2.5-Omni

End-to-end multimodal model with unique capabilities:[30]

  • Thinker-Talker architecture for simultaneous text and speech generation
  • Real-time voice and video chat support
  • TMRoPE (Time-aligned Multimodal RoPE) for synchronizing video and audio timestamps
  • Processing of text, images, videos, and audio inputs
  • Bilingual support (English/Chinese) with low-latency interaction

QVQ (Visual Reasoning Model)

QVQ-72B-Preview is an experimental research model for enhanced visual reasoning:[3]

  • 70.3% on the MMMU (Massive Multi-discipline Multimodal Understanding) benchmark
  • Superior performance on MathVision and OlympiadBench
  • Advanced multidisciplinary understanding and reasoning abilities

Qwen-Audio

Large language audio models supporting:[31]

  • Cross-modal processing between text and audio
  • Support for 30+ languages
  • Speech recognition and sound interpretation
  • Processing of audio up to 30 minutes

Specialized Models

Variant Release Date Parameters Focus Key Features
Qwen2.5-Coder September 2024 0.5B to 32B Coding Code generation, reasoning, fixing; 88.2 on HumanEval; state-of-the-art open-source code model[16]
Qwen2.5-Math September 2024 1.5B to 72B Mathematics Mathematical reasoning with CoT, PoT, and TIR; 72.0 on MathVista; surpasses GPT-4o[17]
Qwen-MT July 2025 Based on Qwen3 Translation 92 languages covering 95% of global population; reinforcement learning for accuracy[24]
Qwen3Guard September 2025 Based on Qwen3 Safety Real-time moderation, risk classification; SOTA on multilingual safety benchmarks[26]
QwQ-32B-Preview November 2024 32B Reasoning Enhanced multi-step reasoning; competitive with OpenAI's o1-preview on several benchmarks; Apache 2.0 license[32]
Qwen-Image August 2025 20B Image Generation Complex text rendering, multi-line layouts, high visual quality[25]
Qwen-Image-Edit August 2025 Based on Qwen-Image Image Editing Precise text editing, semantic/appearance control[25]

Performance and Benchmarks

Qwen3 Performance

Qwen3-235B-A22B achieves competitive results compared to other top-tier models:[22]

Benchmark Qwen3-235B-A22B Qwen3-30B-A3B QwQ-32B
Arena-Hard 95.6 91.0 89.5
AIME 2024 85.7 80.4 Superior
LiveCodeBench 70.7 - Strong
CodeForces Elo - 1974 -

Comparative Benchmarks

Model MMLU-Redux (EN) HumanEval (EN) MathVista (EN) Notes
Qwen3-235B-A22B 88.5 N/A N/A Highest in series for general knowledge
Qwen2.5-72B-Instruct 68.8 (MMLU-Pro) N/A N/A Instruction-tuned
Qwen2.5-Coder-32B N/A 88.2 N/A Coding SOTA
Qwen2.5-Math-32B N/A N/A 72.0 Math reasoning champion
Qwen2.5-Max N/A N/A N/A Claimed comparable to GPT-4o and superior to DeepSeek V3[20]

Additional benchmarks include CMMLU for Chinese language understanding, GSM8K for math word problems, and BFCL for tool and function-calling capabilities.[33]

Capabilities

Qwen models support a comprehensive array of tasks:

  • Multilingual Processing: Core models handle 29 languages, with Qwen-MT extending to 92 languages covering 95% of the global population[24]
  • Long-Context Understanding: Up to 128K tokens, enabling analysis of extensive documents, codebases, or conversations
  • Coding and Mathematics: Specialized models achieve state-of-the-art results with Qwen2.5-Coder scoring 88.2 on HumanEval and Qwen2.5-Math achieving 72.0 on MathVista
  • Multimodal Tasks: Image generation/editing, video grounding, audio transcription, and multi-modal reasoning
  • Safety and Moderation: Qwen3Guard provides real-time detection with categorized risk levels
  • Agentic and Reasoning: Models like QwQ-32B support advanced chain-of-thought reasoning, tool use, and multi-step tasks
  • Structured Data Analysis: Enhanced capabilities for processing tables, forms, and structured documents
  • Real-time Interaction: Support for low-latency voice and video chat through Omni models
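
The tool-use capability is typically exercised through JSON function schemas in the OpenAI-compatible format accepted by Qwen's function-calling endpoints. A sketch with one invented tool (`get_weather` is a hypothetical example, and the model id is an assumption):

```python
import json

get_weather = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string",
                         "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}

# The schema travels alongside the conversation; the model replies with
# a structured tool call that the caller executes and feeds back.
payload = {
    "model": "qwen3-max",  # assumed model id
    "messages": [{"role": "user", "content": "Weather in Hangzhou?"}],
    "tools": [get_weather],
}
print(json.dumps(payload)[:40])
```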

Deployment and Accessibility

Open Source Availability

Most Qwen models are released under the Apache 2.0 license, making them available for both research and commercial use.[22] However, some larger models like the original Qwen-72B were released under a more restrictive "Tongyi Qianwen LICENSE AGREEMENT" that requires approval for commercial applications with over 100 million monthly active users.[34]

Models are distributed through:

  • Hugging Face (huggingface.co/Qwen)[2]
  • ModelScope, Alibaba's model platform[11]
  • GitHub (QwenLM organization)[35]

Commercial Services

Alibaba Cloud provides commercial APIs through:[36]

  • Alibaba Cloud Model Studio
  • DashScope API service
  • Qwen Chat web interface (chat.qwen.ai)

Deployment Frameworks

Qwen models support deployment through various frameworks:[35]

  • vLLM for high-throughput inference
  • SGLang for large-scale deployment
  • TensorRT-LLM for NVIDIA GPU optimization
  • Ollama for local deployment
  • llama.cpp for CPU and GPU inference
  • Integration with popular AI frameworks
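
Frameworks such as vLLM and SGLang expose an OpenAI-compatible HTTP endpoint once a checkpoint is served (e.g. `vllm serve Qwen/Qwen3-8B`). The sketch below builds, but does not send, the request a client would POST to such a local server; the endpoint, port, and model id are assumptions:

```python
import json
import urllib.request

# Assumed default address of a locally served OpenAI-compatible endpoint.
BASE_URL = "http://localhost:8000/v1/chat/completions"

body = {
    "model": "Qwen/Qwen3-8B",  # assumed served model id
    "messages": [{"role": "user", "content": "Summarize RoPE in one line."}],
    "max_tokens": 128,
    "temperature": 0.7,
}
req = urllib.request.Request(
    BASE_URL,
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would return the completion once a server
# is running; the request object itself is inspectable offline.
print(req.get_method(), req.full_url)
```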

Applications

Qwen powers diverse applications including:

  • Enterprise AI Solutions: Document analysis, customer service automation, and business intelligence
  • Software Development: Code generation, debugging, and agentic coding workflows through Qwen3-Coder
  • Education: Personalized tutoring, especially in mathematics and programming
  • Healthcare: Medical document analysis and research assistance
  • E-commerce: Product descriptions, customer support, and recommendation systems
  • Creative Content: Story writing, article generation, and image creation
  • Research: Academic paper analysis and scientific computing

Fine-tuned versions, such as "Liberated Qwen" by Abacus AI, remove content restrictions for specialized use cases.[1] The models support real-time interactions and are used in agentic frameworks for autonomous tasks.

Languages Supported

Qwen models provide extensive multilingual support, with Qwen3 supporting 119 languages and dialects.[1] Core language support includes:

  • Chinese (Simplified and Traditional)
  • English
  • French
  • Spanish
  • Portuguese
  • German
  • Italian
  • Russian
  • Japanese
  • Korean
  • Vietnamese
  • Thai
  • Arabic
  • Turkish
  • Indonesian
  • Dutch
  • Polish
  • Swedish
  • Hindi
  • Hebrew
  • Finnish
  • Danish
  • Norwegian
  • Czech
  • Hungarian
  • Romanian
  • Greek
  • Bulgarian
  • Ukrainian

Limitations and Considerations

While Qwen models demonstrate strong capabilities, they have known limitations:[37]

  • Language mixing: Models may unexpectedly switch between languages during generation
  • Circular reasoning: Can get stuck in repetitive reasoning loops, particularly in complex multi-step problems
  • Safety concerns: Require stronger safety features and guardrails for production deployment
  • Performance gaps: While strong in math and coding, improvements needed in common sense reasoning
  • Context limitations: Although supporting long contexts, performance may degrade with extremely long inputs
  • Computational requirements: Larger models require significant GPU resources for deployment

Impact and Adoption

Since its initial open-source release in 2023, the Qwen family has achieved significant adoption milestones:

  • Over 40 million downloads across platforms[3]
  • More than 78,000 derivative models developed on Hugging Face[3]
  • Powers all top 10 open-source LLMs on Hugging Face's Open LLM Leaderboard as of February 2025[4]
  • Integrated into numerous commercial applications across Alibaba's ecosystem
  • Adopted by enterprises globally for various AI applications

The development reflects Alibaba's ambition to position Qwen as a foundational "operating system" for AI, akin to Android in mobile computing.[38]

References

  1. https://en.wikipedia.org/wiki/Qwen
  2. https://huggingface.co/Qwen
  3. https://www.alibabacloud.com/blog/alibaba-cloud-unveils-new-research-model-for-enhanced-visual-reasoning_601914
  4. https://www.scmp.com/tech/big-tech/article/3298233/alibabas-qwen-powers-top-10-open-source-models-china-ai-know-how-goes-beyond-deepseek
  5. https://www.cnbc.com/2023/04/11/alibaba-launches-tongyi-qianwen-ai-model-similar-to-gpt.html
  6. https://www.reuters.com/technology/alibaba-cloud-launches-its-chatgpt-rival-tongyi-qianwen-2023-04-11/
  7. https://www.scmp.com/tech/big-tech/article/3234661/alibabas-ai-model-tongyi-qianwen-approved-public-release-china
  8. https://www.reuters.com/technology/alibaba-launches-open-source-ai-model-qwen-2023-08-03/
  9. https://huggingface.co/Qwen/Qwen-1_8B
  10. https://www.chinadaily.com.cn/a/202312/01/WS6569a8a7a31090682a5f5b5e.html
  11. https://github.com/QwenLM/Qwen
  12. https://venturebeat.com/ai/alibaba-clouds-new-qwen-72b-llm-tops-open-source-rankings-and-rivals-gpt-4/
  13. https://qwenlm.github.io/blog/qwen1.5/
  14. https://qwenlm.github.io/blog/qwen2/
  15. https://arxiv.org/abs/2407.10671
  16. https://qwenlm.github.io/blog/qwen2.5/
  17. https://arxiv.org/html/2409.12122v1
  18. https://arxiv.org/abs/2409.12191
  19. https://venturebeat.com/ai/alibaba-releases-qwen-with-questions-an-open-reasoning-model-that-beats-o1-preview/
  20. https://qwenlm.github.io/blog/qwen2.5-max/
  21. https://qwenlm.github.io/blog/qwen2.5-vl/
  22. https://qwenlm.github.io/blog/qwen3/
  23. https://www.reuters.com/world/china/alibaba-launches-open-source-ai-coding-model-touted-its-most-advanced-date-2025-07-23/
  24. https://qwenlm.github.io/blog/qwen-mt/
  25. https://qwenlm.github.io/blog/qwen-image/
  26. https://qwenlm.github.io/blog/qwen3guard/
  27. https://arxiv.org/abs/2412.15115
  28. https://arxiv.org/abs/2409.12191
  29. https://github.com/QwenLM/Qwen3-VL
  30. https://github.com/QwenLM/Qwen2.5-Omni
  31. https://www.alibabacloud.com/help/en/model-studio/what-is-qwen-llm
  32. https://www.alibabacloud.com/blog/alibaba-cloud-unveils-open-source-ai-reasoning-model-qwq-and-new-image-editing-tool_601813
  33. https://qwenlm.github.io/share/leaderboard/
  34. https://www.infoq.com/news/2023/12/alibaba-qwen-72b-llm/
  35. https://github.com/QwenLM/Qwen3
  36. https://www.alibabacloud.com/help/en/model-studio/models
  37. https://www.datacamp.com/blog/qwq-32b-preview
  38. https://www.reddit.com/r/LocalLLaMA/comments/1nq182d/alibaba_just_unveiled_their_qwen_roadmap_the/