Mistral Medium 3.5

Large Language Models Open Source AI

9 min read

Updated Jun 2, 2026

Suggest edit History Talk

RawGraph

Last edited

Jun 2, 2026

Fact-checked

In review queue

Sources

8 citations

Revision

v1 · 1,728 words

Fact-checks are independent of edits: a reviewer re-verifies the article against its sources and stamps the date. How we verify

Mistral Medium 3.5 is an open-weight large language model released by Mistral AI on 28 April 2026. The company describes it as its "first flagship merged model": a single dense network of about 128 billion parameters that folds instruction-following, reasoning, and coding into one set of weights, replacing the separate models Mistral had previously shipped for each of those jobs.^[1]^[2] It became the default model in Le Chat, Mistral's consumer assistant, and in the company's Vibe coding tool, and it is published under a modified MIT license on Hugging Face.^[1]^[3]

The "merged" framing is the model's main selling point. Where Mistral had run a dedicated reasoning model (Magistral) and a dedicated coding model (Devstral 2) alongside its general Medium line, Medium 3.5 collapses those capabilities into a single checkpoint whose reasoning depth can be turned up or down per request.^[2]^[4]

Overview

Mistral Medium 3.5 is a dense, multimodal transformer with roughly 128 billion parameters and a context window of 256,000 tokens.^[1]^[3] It accepts text and image input and returns text, and it exposes a per-request control over how much reasoning it spends on a query, so the same weights can produce a fast chat reply or grind through a multi-step agentic task.^[1]^[3] Because the model is dense rather than a mixture of experts, every parameter is active for every generated token, which Mistral and several reviewers contrast with the sparse architectures used by some competing systems.^[4]

The model is distributed as open weights. Mistral published the checkpoint on Hugging Face under a "modified MIT" license, a permissive license that allows commercial use and fine-tuning but adds carve-outs for very large companies.^[1]^[5] In Mistral's own product surfaces it is reachable through Le Chat, the Vibe CLI, and the paid API; it is also offered through third-party platforms including NVIDIA's hosted endpoints and Microsoft Copilot Studio.^[1]^[6]

Mistral AI and the model lineup

Mistral AI is a French AI company founded in 2023 that has built much of its reputation on open-weight releases, from Mistral 7B and Mixtral through the closed Mistral Large frontier line. The "Medium" tier sits below Large and is positioned as a cost-efficient workhorse rather than the absolute top of the range.

Mistral Medium 3 opened that tier in 2025 as a model aimed at strong performance at a fraction of the price of frontier systems. Medium 3.5 is the consolidation point of the line: secondary coverage describes it as superseding the intermediate Medium 3.1 release while also retiring Magistral and the Codestral-derived Devstral 2 coding model into the same weights.^[2]^[4] Mistral's own announcement is narrower, stating that Medium 3.5 becomes the default in Le Chat and replaces Devstral 2 in the Vibe CLI; the broader claim that it also displaces Medium 3.1 and Magistral in Le Chat comes from press reporting and the Ollama model listing rather than the launch post.^[1]^[2]^[3]

Release

Mistral's documentation dates the model to 28 April 2026 and tags it version v26.04, the date that the model-card slug mistral-medium-3-5-26-04 also encodes.^[7] The weights and the API listing went up around the same time, and most early coverage placed the launch in late April or the first days of May 2026.^[2]^[4] A follow-up product post, "Remote agents in Vibe. Powered by Mistral Medium 3.5," tied the model to two new features, asynchronous cloud coding agents in Vibe and a "Work mode" preview in Le Chat, and that post carries a May 2026 date.^[1] Microsoft listed the model in Copilot Studio later in May.^[6]

Architecture, sizes, and weights

Attribute	Value	Source
Parameters	~128 billion (dense)	^[1]^[3]
Architecture	Dense transformer, multimodal	^[1]^[3]
Context window	256,000 tokens	^[1]^[3]
Modalities	Text and image input, text output	^[3]
Reasoning control	`reasoning_effort` per request (e.g. none / high)	^[1]^[3]
Vision encoder	Retrained from scratch for variable image sizes and aspect ratios	^[1]
License	Modified MIT (open weights)	^[1]^[5]
Self-hosting	Runs on as few as four GPUs	^[1]^[2]

The single most repeated technical detail is that Medium 3.5 is dense, not sparse. Reviewers note that all 128 billion parameters load and activate for every token, which makes the memory footprint predictable but means the model cannot lean on the conditional compute tricks of a mixture-of-experts design.^[4] Mistral says the model self-hosts on as few as four GPUs, which several writers read as four H100-class accelerators at reduced precision.^[1]^[2] The vision stack was rebuilt: Mistral states it trained the vision encoder "from scratch to handle variable image sizes and aspect ratios," making Medium 3.5 a genuine multimodal model rather than a text model with a bolted-on adapter.^[1]

Capabilities

The headline capability is the merge itself. A single request can be answered with reasoning switched off for low latency, or with reasoning set high for complex prompts and agentic runs, all from the same weights.^[1]^[3] Mistral pitches the model at long-horizon work: calling multiple tools reliably, sustaining multi-step coding sessions, and emitting structured output that downstream code can parse.^[1]^[6] Native function calling is supported, which is what lets Le Chat's "Work mode" drive tools in parallel until a task is finished.^[1]

On the product side, Medium 3.5 powers Vibe's remote agents, cloud-hosted coding sessions that can be spawned from the CLI or from Le Chat and can take over a local session, and the Work mode preview in Le Chat that chains tool calls across a multi-step job.^[1]

Benchmarks

Mistral published only a small set of headline scores at launch, both on agentic tasks, and did not release full results for general-knowledge or reasoning suites such as MMLU-Pro or GPQA Diamond; numbers for those circulating in the community are not Mistral-published and should be treated with caution.^[1]^[5]

Benchmark	Score	Notes	Source
SWE-Bench Verified	77.6%	Real-world software-engineering bug fixes; reported ahead of Devstral 2 and Qwen3.5 397B A17B	^[1]^[8]
τ³-Telecom	91.4	Multi-turn agentic tool calling	^[1]^[3]
MMLU-Pro / GPQA Diamond	Not disclosed at launch	Mistral did not publish these in the launch post	^[5]

The SWE-Bench Verified figure of 77.6% is the number most often cited, and Mistral presents it as beating its own retired Devstral 2 coding model along with Qwen3.5 397B A17B.^[1]^[8] Independent writers were more measured: TechSifted and others reported that the model leads on coding and telecom-style agent benchmarks but lags Claude badly on banking-domain tasks, and that it sits below proprietary frontier models overall while being roughly competitive with Claude Sonnet 4.5 on some coding tests.^[4]

Availability and pricing

Mistral ships Medium 3.5 through its own assistant and developer surfaces and through several partners. In Le Chat it is the default model, with Work mode available on the Pro, Team, and Enterprise plans; in Vibe it powers the CLI and the new remote agents; and the open weights are downloadable from Hugging Face for self-hosting.^[1] The model is also offered for prototyping on NVIDIA-accelerated endpoints and as an NVIDIA NIM container, and it appears in Microsoft Copilot Studio's model lineup for agent builders.^[1]^[6]

Channel	Access	Notes	Source
Le Chat	Default model; Work mode (Preview)	Pro, Team, Enterprise plans for Work mode	^[1]
Vibe CLI	Default model; remote cloud agents	Replaces Devstral 2	^[1]
API	$1.50 / 1M input tokens	Pay-per-token	^[1]^[4]
API	$7.50 / 1M output tokens	Pay-per-token	^[1]^[4]
Hugging Face	Open weights, modified MIT	Self-host (~4 GPUs)	^[1]^[5]
NVIDIA / Copilot Studio	Hosted endpoints, NIM, Copilot Studio	Third-party hosting	^[1]^[6]

API pricing of $1.50 per million input tokens and $7.50 per million output tokens is consistent across the announcement and independent coverage.^[1]^[4]

Reception

Reaction split along familiar lines. The consolidation drew praise as an operational simplification: one model to deploy and bill instead of separate reasoning, coding, and general checkpoints, with weights you can run on your own hardware.^[2]^[4] Mistral's continued commitment to open weights was treated as a notable counterpoint to the closed frontier labs.^[5]

The criticism centered on price and on the "open" label. Some developers argued that $1.50 in / $7.50 out is steep for a 128-billion-parameter model relative to comparably sized open competitors.^[2] Others pushed on the license: a "modified MIT" license that adds use-case and large-company restrictions is, as one reviewer put it, harder to justify as genuinely open in regulated industries than a plain permissive license would be.^[4] On capability, independent testing found the model strong on coding and agentic tool use but uneven across domains and below the proprietary frontier overall.^[4]

Limitations

Mistral's published evaluation is thin: only two agentic benchmarks were disclosed at launch, so the model's general-knowledge and hard-reasoning ceilings are harder to triangulate than its coding numbers, and the community scores filling that gap are not vendor-verified.^[5] The dense 128-billion-parameter design gives a predictable but non-trivial hardware footprint, with self-hosting pitched at four GPUs rather than something a single consumer card can run.^[1]^[2] Domain coverage is uneven, with reported weakness on banking-style tasks.^[4] And the "modified MIT" license, while permitting most commercial use, carries carve-outs that limit how cleanly the weights can be called open.^[4]^[5] At launch the Le Chat Work mode was labeled a preview rather than a finished feature.^[1]

References

Remote agents in Vibe. Powered by Mistral Medium 3.5. - Mistral AI ↩
Mistral Medium 3.5: 128B Open-Weight Model Replaces Devstral 2 and Magistral - Let's Data Science ↩
mistralai/Mistral-Medium-3.5-128B - Hugging Face model card ↩
Mistral's new flagship Medium 3.5 folds chat, reasoning, and code into one model - The Decoder ↩
Mistral Medium 3.5: 128B Open-Weight Frontier Coder - Nerd Level Tech ↩
Mistral joins Copilot Studio's growing lineup of model providers - Microsoft Copilot Blog ↩
Mistral Medium 3.5 (v26.04) model card - Mistral AI Docs ↩
Mistral AI Launches Remote Agents in Vibe and Mistral Medium 3.5 with 77.6% SWE-Bench Verified Score - MarkTechPost ↩

Improve this article

Add missing citations, update stale details, or suggest a clearer explanation. Every suggestion is reviewed for sourcing before it goes live.

Suggest edit

What links here

AI Model Release Timeline (2022-2026)Mistral Large 3

Overview

Mistral AI and the model lineup

Release

Architecture, sizes, and weights

Capabilities

Benchmarks

Availability and pricing

Reception

Limitations

References

Improve this article

Related Articles

LLaMA

Proprietary vs. Open Source Large Language Models (LLMs)

DeepSeek

LangChain

Meta AI

Mistral AI

What links here

Related Articles

LLaMA

Proprietary vs. Open Source Large Language Models (LLMs)

DeepSeek

LangChain

Meta AI

Mistral AI

What links here