Qwen3.7-Max

Chinese AI Large Language Models

9 min read

Updated Jun 2, 2026

Suggest edit History Talk

RawGraph

Last edited

Jun 2, 2026

Fact-checked

In review queue

Sources

9 citations

Revision

v1 · 1,837 words

Fact-checks are independent of edits: a reviewer re-verifies the article against its sources and stamps the date. How we verify

Qwen3.7-Max is a closed-weight frontier large language model developed by Alibaba's Qwen team, announced in May 2026 as the flagship of the Qwen3.7 generation. Alibaba positions it as a proprietary model "designed for the agent era," built less as a conversational chatbot and more as a foundation for autonomous agents that write code, call tools, and run long-horizon tasks with limited human supervision. ^[1]^[2] The model supports extended thinking and a context window of up to one million tokens, and it is offered through Alibaba Cloud's Model Studio platform rather than as a public weight release, a notable departure from the open-weight lineage that made the Qwen brand widely known. ^[2]^[3]^[4]

Overview

Qwen3.7-Max sits at the top of Alibaba's 2026 model lineup as the company's most capable proprietary system. Where most of the Qwen family has shipped with downloadable weights under permissive licenses, the "Max" tier is the line Alibaba keeps closed and serves only through its cloud API. ^[3]^[5] The Qwen3.7-Max release continued that pattern and pushed it further: the model is text-only, reasoning-first, and tuned for agentic workflows such as software engineering, tool use, and office automation, rather than for one-shot chat. ^[1]^[2]

Independent evaluation placed it among the strongest models available at launch. On the Artificial Analysis Intelligence Index, Qwen3.7-Max scored 56.6, ranking fifth overall and making it the highest-placed Chinese model on that leaderboard, ahead of Google's Gemini 3.5 Flash. ^[3]^[6] That score trailed only a handful of Western frontier systems, including OpenAI's GPT-5.5, Anthropic's Claude Opus 4.7, and Google's Gemini 3.1 Pro Preview. ^[6]

The Qwen-Max closed line versus the open Qwen3.x models

Understanding Qwen3.7-Max requires separating it from the broader Qwen catalog, which is split between open and closed tiers. Alibaba Cloud has released a large number of open-weight Qwen models, including dense and mixture-of-experts variants, multimodal systems, and coding-focused models such as Qwen3-Coder. The "Max" models are different. They are the company's proprietary flagships, never published as weights and accessible only via Model Studio. The earlier Qwen3-Max followed this closed pattern, and Qwen3.7-Max is its successor. ^[3]^[5]

Qwen3.7-Max is therefore distinct from the open-weight Qwen3.x releases and from the intermediate closed flagships that preceded it. The Qwen3.7 series advanced beyond Qwen3.6 and Qwen3.5: Alibaba reported that the Intelligence Index rose by 4.8 points relative to the Qwen3.6 Max preview, from 51.8 to 56.6, and that the context window expanded from the 256K tokens of the Qwen3.6 generation to one million. ^[3]^[4] As of late May 2026, no Qwen3.7 weights had been published on Hugging Face, confirming the model's closed status at launch. ^[3]

The table below summarizes how the Max tier differs from the rest of the family.

Attribute	Open Qwen3.x models	Qwen3.7-Max
Weights	Published (Hugging Face, ModelScope)	Closed, not released ^[3]
Access	Self-host or API	Alibaba Cloud Model Studio API only ^[4]
Positioning	General models, coding, multimodal	Proprietary agent-first flagship ^[1]^[2]
Example members	Qwen3, Qwen3-Coder, Qwen3-VL	Qwen3-Max, Qwen3.7-Max ^[3]^[5]

Release

Alibaba's Qwen team published the technical announcement, titled "Qwen3.7: The Agent Frontier," on the Alibaba Cloud blog on May 21, 2026, describing Qwen3.7-Max as "our latest proprietary model designed for the agent era." ^[2] The model was presented publicly at the 2026 Alibaba Cloud Summit, held in Hangzhou around May 20, 2026, with the commercial API going live on Model Studio shortly before the event, on May 19. ^[3]^[4]

A preview phase preceded the formal launch. Two preview entries, reported as Qwen3.7-Max-Preview and a Qwen3.7-Plus-Preview, appeared on a public model arena leaderboard in mid-May, and the production release dropped the "-Preview" suffix once the model became generally available. ^[4] Several third-party hosts, including OpenRouter and Together AI, cross-listed the model at or near launch. ^[3]^[5]

Architecture as disclosed

Alibaba disclosed very little about the internal design of Qwen3.7-Max. The official announcement does not state a parameter count, an expert configuration for any mixture-of-experts layout, an activation size, or attention details, and several reviewers noted the absence of a full technical report. ^[2]^[7] What is documented is operational rather than structural:

The model is text-only for both input and output, with no native image support at launch. ^[1]^[8]
It exposes a preserve_thinking API parameter that carries the model's internal thinking content across turns in a conversation, which Alibaba recommends for agentic tasks. ^[2]
It is served with a context window of up to one million tokens in production, although the benchmark tables in the launch post were footnoted at a 256K context setting. ^[2]^[4]^[8]

Because Alibaba has not released weights or a detailed model card, claims about the underlying architecture remain unverifiable, and this article does not assert specifics that the company has not confirmed.

Capabilities, long context, and thinking

Qwen3.7-Max is a reasoning model: it produces an internal chain-of-thought before emitting a final answer, a mode Alibaba describes as extended thinking. ^[3]^[2] The design goal stated in the announcement is sustained, long-horizon autonomy rather than short conversational turns. ^[1]^[2]

Long context is a central feature. The one-million-token window lets the model hold large codebases, long document sets, or extended agent trajectories in a single session, and it roughly quadruples the 256K limit of the prior Qwen3.6 generation. ^[3]^[4] Vendors describe the model as built for agentic workloads: coding and debugging, tool use, office and productivity automation, and tasks that span hundreds or thousands of steps. ^[1]^[5]

Alibaba's headline demonstration of long-horizon behavior was a kernel-optimization run. According to the announcement, the model executed roughly 35 hours of continuous autonomous work, performing 432 kernel evaluations across 1,158 tool calls, and achieved a 10.0x geometric-mean speedup over a reference implementation on previously unseen hardware. ^[2] The company framed this as evidence that the model maintains a coherent strategy across more than a thousand tool calls without losing context. ^[2] These figures come from Alibaba's own internal testing and have not been independently reproduced.

For integration, Qwen3.7-Max is served through an OpenAI-compatible chat-completions endpoint and an Anthropic-compatible protocol, and Alibaba states it works with agent frameworks including Claude Code and Qwen Code, with native support for the Model Context Protocol. ^[2]^[8]

Benchmarks

Two categories of benchmark data circulated at launch: Alibaba's own self-reported scores from the announcement, and the independent Intelligence Index from Artificial Analysis. They are presented separately below because they come from different methodologies and should not be conflated.

The independent composite score is the cleanest single number. Artificial Analysis aggregates ten evaluations, including GDPval-AA, Terminal-Bench Hard, SciCode, AA-Omniscience, Humanity's Last Exam, and GPQA Diamond, into one index. ^[6]

Metric (independent)	Qwen3.7-Max	Source
Artificial Analysis Intelligence Index	56.6	^[6]
Rank at launch	#5 overall, highest Chinese model	^[3]^[6]
Comparison point	Ahead of Gemini 3.5 Flash (55.3)	^[6]
Output verbosity (eval)	~97M tokens generated vs ~24M median	^[3]^[6]

Alibaba's own reported figures, drawn from the launch post, emphasize coding, agents, and reasoning. These are vendor-reported and were footnoted at a 256K context setting. ^[2]

Benchmark (Alibaba-reported)	Qwen3.7-Max	Source
SWE-bench Verified	80.4	^[2]
SWE-Pro	60.6	^[2]
Terminal-Bench 2.0	69.7	^[2]
MCP-Mark	60.8	^[2]
GPQA Diamond	92.4	^[2]
HMMT 2026	97.1	^[2]
SpreadSheetBench-v1	87.0	^[2]

Reviewers flagged a few caveats in the independent numbers. On AA-Omniscience, a knowledge-and-abstention test, the model's raw accuracy dropped relative to the prior Max preview while its abstention rate rose, and Artificial Analysis observed unusually high token generation during evaluation, marking the model as verbose. ^[7]^[6]

Availability and pricing

Qwen3.7-Max is API-only, delivered through Alibaba Cloud Model Studio, with no consumer weight download. ^[4]^[8] The official launch post stated that pricing would follow, and Alibaba published the rate card on Model Studio in the days after the summit; the same rates were mirrored by third-party hosts such as OpenRouter and Together AI. ^[2]^[3]^[5]

Item	Detail	Source
Access	Alibaba Cloud Model Studio (API only)	^[4]^[8]
API compatibility	OpenAI-compatible and Anthropic-compatible endpoints	^[2]^[8]
Input price	$2.50 per 1M tokens	^[3]^[5]
Output price	$7.50 per 1M tokens	^[3]^[5]
Cached input	$0.25 per 1M tokens (about 90% off input)	^[3]^[9]
Launch promotion	50% off, listed near $1.25 input / $3.75 output	^[9]
Third-party hosts	OpenRouter, Together AI	^[3]^[5]^[9]

Commentators noted that the per-token rate landed at roughly half the price of comparable Western flagships, which several framed as aggressive pricing for a model near the top of the leaderboard. ^[3]^[5]

Reception

Coverage treated Qwen3.7-Max as a significant entry from China's most active frontier-model shipper of 2026, and the closed-weight decision drew particular attention given Alibaba's open-source reputation. ^[3]^[7] The reasoning-agent framing was widely echoed: outlets described the model as an "agent frontier" system aimed at autonomous, long-running work rather than chat, and highlighted the one-million-token context and the multi-hour kernel-optimization demo as the standout claims. ^[1]^[7]^[5] The fifth-place finish on the independent Intelligence Index, and the status as the top-ranked Chinese model, were the most cited results. ^[3]^[6]

Limitations

Several limitations were noted at and after launch. The model is text-only, so it cannot natively process images or other modalities. ^[1]^[8] Alibaba did not publish a full technical report or the model weights, leaving architecture and training details undisclosed and many claims dependent on the company's own statements. ^[2]^[7] Independent evaluators flagged high verbosity, which can raise effective costs in long agentic sessions despite the favorable per-token rate, and a drop in raw knowledge accuracy paired with higher abstention on at least one benchmark. ^[6]^[7] As a proprietary, API-gated model, it also cannot be self-hosted, audited at the weight level, or run offline, in contrast to the open Qwen3.x releases. ^[3]^[5]

References

Qwen Introduces Qwen3.7-Max: A Reasoning Agent Model With a 1M-Token Context Window - MarkTechPost, May 21, 2026. ↩
Qwen3.7: The Agent Frontier - Alibaba Cloud blog, May 21, 2026. ↩
Qwen 3.7 Max: Alibaba's New Flagship AI Model 2026 - DigitalApplied. ↩
Qwen3.7-Max: Alibaba's New Agent-First LLM for Coding, Reasoning, and Long-Horizon AI Workflows - Analytics Vidhya, May 2026. ↩
Qwen3.7 Max - API Pricing & Benchmarks - OpenRouter. ↩
Alibaba's Qwen 3.7 Max Becomes Highest-Placed Chinese Model On Artificial Analysis Index, Is Ahead Of Gemini 3.5 Flash - OfficeChai. ↩
Qwen3.7-Max: Features, Benchmarks and Agent Capabilities - DataCamp. ↩
Qwen3.7 Max - Intelligence, Performance & Price Analysis - Artificial Analysis. ↩
Qwen3.7 Max API Pricing 2026 - Costs, Performance & Providers - PricePerToken. ↩

Improve this article

Add missing citations, update stale details, or suggest a clearer explanation. Every suggestion is reviewed for sourcing before it goes live.

Suggest edit

What links here

AI Model Release Timeline (2022-2026)Best AI Models for Reasoning and Math LLM Context Window Comparison Qwen

Overview

The Qwen-Max closed line versus the open Qwen3.x models

Release

Architecture as disclosed

Capabilities, long context, and thinking

Benchmarks

Availability and pricing

Reception

Limitations

References

Improve this article

Related Articles

Baidu AI

MiniMax

Moonshot AI

Qwen

Zhipu AI

DeepSeek-R1

What links here

Related Articles

Baidu AI

MiniMax

Moonshot AI

Qwen

Zhipu AI

DeepSeek-R1

What links here