Ministral 3B / 8B

AI Models Large Language Models Open Source AI

7 min read

Updated Jun 8, 2026

Suggest edit History Talk

RawGraph

Last edited

Jun 8, 2026

Fact-checked

In review queue

Sources

5 citations

Revision

v1 · 1,473 words

Fact-checks are independent of edits: a reviewer re-verifies the article against its sources and stamps the date. How we verify

Ministral is a family of two small language models released by the French artificial intelligence company Mistral AI on October 16, 2024. Marketed under the wordplay name "les Ministraux," the family consists of Ministral 3B and Ministral 8B, named for their approximately 3 billion and 8 billion parameters. The release marked the first anniversary of Mistral 7B, the model that established Mistral AI as a leading open-weight developer. Both Ministral models target on-device and edge AI use cases, including local inference, privacy-sensitive applications, low-latency assistants, and lightweight agentic sub-tasks such as function calling within larger pipelines.^[1]^[2]

Overview

The Ministral models occupy the smallest tier of Mistral AI's lineup, sitting below mid-sized and frontier models such as Mistral NeMo and Mistral Large. Mistral positioned the pair for what it described as a growing demand for "local, privacy-first inference for critical applications," citing examples such as on-device translation, internet-less smart assistants, local analytics, and autonomous robotics.^[1] Beyond standalone deployment, Mistral promoted the models as efficient intermediaries inside agentic workflows, where a small model can perform input parsing, task routing, and API function-calling at low latency and cost before handing more demanding work to a larger model.^[1]

Both models support a context window of up to 128,000 tokens, although the inference framework vLLM supported 32,000 tokens at launch.^[1] Mistral released the family with benchmark claims asserting state-of-the-art performance in the sub-10-billion-parameter category, though some independent evaluations later tempered those claims (see Architecture and benchmarks).^[3]

Mistral AI and the small-model context

Mistral AI, founded in Paris in 2023 by former DeepMind and Meta AI researchers, built its early reputation on releasing capable open-weight models. Mistral 7B, released in September 2023 under the permissive Apache 2.0 license, demonstrated that a relatively compact model could outperform larger contemporaries on many tasks, and it became a widely adopted base for fine-tuning.^[1]

The Ministral release arrived amid an industry-wide push toward smaller, more efficient models that can run on consumer hardware rather than data-center accelerators. In the months surrounding the launch, competing small models included Meta's Llama 3.2 1B and 3B, Google's Gemma 2 2B and 9B, and Microsoft's Phi series. These models reflected a shift in emphasis from raw scale toward efficiency, on-device privacy, and cost per token, trends that the Ministral family was explicitly designed to address.^[2]^[3]

The Ministral models (3B and 8B)

The family comprises two instruction-tuned models intended for different deployment points. Ministral 3B is the smaller and cheaper option, aimed at the most constrained environments, while Ministral 8B offers higher quality at a still-modest footprint. According to Mistral, Ministral 3B already outperformed the year-old Mistral 7B on most internal benchmarks despite having less than half the parameters, illustrating the efficiency gains achieved over the preceding year.^[1]^[4]

The two models differ significantly in availability. Mistral released the weights for Ministral 8B Instruct on Hugging Face for research use, under the model identifier mistralai/Ministral-8B-Instruct-2410. Ministral 3B was not released as open weights and was offered through Mistral's commercial API only.^[1]^[2] Both were made available on Mistral's hosted platform, La Plateforme, under the API model names ministral-8b-latest and ministral-3b-latest.^[1]

The specifications below describe the original October 2024 release. In December 2025, Mistral introduced a successor generation branded "Ministral 3" (Ministral 3 3B, 8B, and 14B), released under the Apache 2.0 license, which should not be confused with the original Ministral family described here.^[5]

Specifications

Specification	Ministral 3B	Ministral 8B (Instruct-2410)
Developer	Mistral AI	Mistral AI
Release date	October 16, 2024	October 16, 2024
Parameters	~3 billion	8,019,808,256 (~8B)
Architecture	Dense transformer	Dense transformer
Layers	Not publicly disclosed	36
Hidden dimension	Not publicly disclosed	4,096
Attention heads / KV heads	Not publicly disclosed	32 / 8 (grouped-query attention)
Head dimension	Not publicly disclosed	128
Vocabulary size	Not publicly disclosed	131,072 (V3-Tekken tokenizer)
Context window	Up to 128,000 tokens	Up to 128,000 tokens
Attention pattern	Standard	Interleaved sliding-window attention
Weights released	No (API only)	Yes (Instruct, research use)
License	Mistral Commercial License	Mistral Research License + Mistral Commercial License
La Plateforme price (input and output)	$0.04 / million tokens	$0.10 / million tokens

Sources: Mistral AI announcement and the Ministral-8B-Instruct-2410 model card.^[1]^[4] Detailed architecture figures for Ministral 3B were not published by Mistral.

Architecture and benchmarks

Both Ministral models are dense transformer language models. The headline architectural feature is Ministral 8B's interleaved sliding-window attention, which Mistral describes as a "special interleaved sliding-window attention pattern for faster and memory-efficient inference."^[1] The Hugging Face model card characterizes the attention configuration as a ragged pattern, alternating a full 128,000-token window with shorter 32,000-token sliding windows across layers, which reduces the memory cost of long-context inference while retaining the ability to attend across the full sequence.^[4] Ministral 8B uses grouped-query attention with 8 key-value heads against 32 query heads, and the V3-Tekken tokenizer with a vocabulary of 131,072 tokens.^[4]

On benchmarks, Mistral reported that the models set "a new frontier in knowledge, commonsense, reasoning, function-calling, and efficiency in the sub-10B category."^[1] On the MMLU knowledge benchmark, Mistral's figures placed Ministral 3B at 60.9 and Ministral 8B at 65.0. For comparison, Mistral cited Gemma 2 2B at 52.4, Llama 3.2 3B at 56.2, and Llama 3.1 8B at 64.7, positioning each Ministral model ahead of its size-class peers.^[3] Mistral further stated that Ministral 3B surpassed Gemma 2 2B and Llama 3.2 3B across AGIEval, TriviaQA, GSM8K, HumanEval, and multilingual tasks, while Ministral 8B outperformed both Mistral 7B and Llama 3.1 8B on most reported benchmarks.^[3]

These comparisons should be read with attribution, because they come from Mistral's own evaluations. Independent testing by Artificial Analysis, reported by DeepLearning.AI's The Batch, reached more modest conclusions, placing Ministral 3B behind Llama 3.2 3B and Ministral 8B behind both Llama 3.1 8B and Gemma 2 9B on MMLU and MATH.^[3] The discrepancy reflects the well-known sensitivity of small-model benchmarks to prompting, few-shot configuration, and evaluation methodology. The Ministral 8B Instruct model card also reports chat and coding results, including an MT-Bench score of 8.3, an Arena Hard score of 70.9, and a HumanEval pass@1 of 76.8.^[4]

Licensing and availability

Licensing differs between the two models and is one of the most frequently misstated aspects of the release. Ministral 8B was published under two options: the Mistral Research License, a non-commercial license that permits research and evaluation use of the released weights, and the Mistral Commercial License for production deployment. The Hugging Face model card states plainly that commercial use of the open weights requires contacting Mistral AI for a license.^[1]^[4] Ministral 3B was offered under the Mistral Commercial License only and was not distributed as open weights; access was through the API.^[1]

Both models were available immediately on La Plateforme, Mistral's hosted API, priced at $0.04 per million tokens for Ministral 3B and $0.10 per million tokens for Ministral 8B, applied uniformly to input and output tokens.^[1] Mistral indicated that the models would also become available through cloud partners, and that customers requiring self-deployment of either model could arrange commercial licensing directly. The open weights for Ministral 8B Instruct were distributed in formats compatible with the vLLM inference library, which Mistral recommended for production pipelines.^[4]

Significance

The Ministral release reinforced the strategic importance of small, efficient models in 2024, a period when several major developers competed to deliver capable models that could run locally on phones, laptops, and edge devices rather than in the cloud. By pairing an API-only 3-billion-parameter model with a partially open 8-billion-parameter model, Mistral pursued a hybrid distribution strategy that balanced community access against commercial revenue, a shift from the fully permissive Apache 2.0 approach of Mistral 7B.^[1]^[2]

The family's emphasis on function-calling and agentic sub-tasks also anticipated the broader move toward agent-based systems, in which inexpensive small models handle routing and tool use while larger models reason over complex problems. Although independent benchmarks suggested the original models did not uniformly lead their class, the Ministral line established a durable product tier for Mistral that the company continued to develop, culminating in the Apache 2.0-licensed Ministral 3 generation released in December 2025.^[3]^[5]

References

Mistral AI. "Un Ministral, des Ministraux." Mistral AI News, October 16, 2024. https://mistral.ai/news/ministraux ↩
Wheatley, Mike. "Mistral introduces Ministral 3B and 8B AI computing models for phones and laptops." SiliconANGLE, October 16, 2024. https://siliconangle.com/2024/10/16/mistral-introduces-ministral-3b-8b-device-ai-computing-models/ ↩
The Batch. "Mistral AI Unveils Ministral 3B and 8B Models, Outperforming Rivals in Small-Scale AI." DeepLearning.AI. https://www.deeplearning.ai/the-batch/mistral-ai-unveils-ministral-3b-and-8b-models-outperforming-rivals-in-small-scale-ai ↩
Mistral AI. "Ministral-8B-Instruct-2410." Hugging Face model card. https://huggingface.co/mistralai/Ministral-8B-Instruct-2410 ↩
Mistral AI. "Introducing Mistral 3." Mistral AI News, December 2025. https://mistral.ai/news/mistral-3/ ↩

Improve this article

Add missing citations, update stale details, or suggest a clearer explanation. Every suggestion is reviewed for sourcing before it goes live.

Suggest edit

What links here

Best Small Language Models Mistral 7B Mixtral

Overview

Mistral AI and the small-model context

The Ministral models (3B and 8B)

Specifications

Architecture and benchmarks

Licensing and availability

Significance

References

Improve this article

Related Articles

Llama 3

OLMo

DeepSeek V4

Kimi K2

DeepSeek V3

Hunyuan

What links here

Related Articles

Llama 3

OLMo

DeepSeek V4

Kimi K2

DeepSeek V3

Hunyuan

What links here