MiniMax M2.7

Updated Jul 17, 2026

MiniMax M2.7 is a large language model released by the Chinese AI company MiniMax on March 18, 2026.^[1] It is an agentic coding and reasoning model built on a sparse mixture-of-experts architecture, with about 230 billion total parameters and roughly 10 billion active per token. M2.7 is best known for one claim in particular: MiniMax described it as the company's first model to take part in its own development, with an internal version autonomously handling a meaningful share of the reinforcement learning research workflow that produced it.^[1] The model sits directly before MiniMax M3 in the company's lineup and was its most capable release until M3 arrived in June 2026.

Where M2.7 fits in the MiniMax lineup

M2.7 is the last model in MiniMax's M2 generation. That generation began with MiniMax M2 in late October 2025, a model positioned for what the company called the "agentic era," and it moved quickly through several point releases. MiniMax M2.5 shipped on February 12, 2026, followed by M2.7 about five weeks later. All of these share the same underlying design: a mixture-of-experts transformer with 230 billion total parameters and 10 billion active. The point releases were refinements of that architecture rather than ground-up rebuilds.

M2.7 is the bridge to the next generation. MiniMax M3, released on June 1, 2026, was a larger architectural change, with a new attention scheme, a much longer context window, native multimodal input, and computer use. So M2.7 marks the high point of the M2 line and is the model the company says it leaned on to help build what came after. Reading M2.7 alongside M3 gives a sense of how fast MiniMax was iterating in the first half of 2026.

Architecture

M2.7 uses a sparse mixture-of-experts design. The full network holds about 230 billion parameters, but only a fraction run on any given token. NVIDIA's technical documentation for the model lists 256 local experts with 8 activated per token, along with roughly 10 billion active parameters, a 4.3% activation rate measured against the 230 billion total.^[4] The idea is familiar from other MoE systems: keep the capacity of a very large model while paying inference costs closer to those of a 10-billion-parameter dense model. A top-k router decides which experts handle each token.
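Top-k routing can be sketched in a few lines. The example below is a minimal illustration, not MiniMax's router: the function name `route_token`, the random logits, and the choice to softmax over only the selected experts are assumptions made for the sketch. It also shows the two ratios side by side: 8 of 256 experts is about 3.1% of the experts, while 10 billion of 230 billion parameters is about 4.3%. The gap is presumably explained by parameters outside the experts that run on every token, though the sources cited here do not break that down.

```python
import math
import random

NUM_EXPERTS = 256  # local experts, per NVIDIA's documentation
TOP_K = 8          # experts activated per token

def route_token(router_logits, k=TOP_K):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only. That convention is
    an assumption for this sketch, not a documented detail of M2.7.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract the max for stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

# Hypothetical router scores for a single token.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)

expert_fraction = TOP_K / NUM_EXPERTS  # share of experts used per token
param_fraction = 10e9 / 230e9          # share of parameters used per token
print(len(selected), f"{expert_fraction:.3f}", f"{param_fraction:.3f}")
```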

The model is text-only. It does not take images or other modalities, which is one of the clearer dividing lines between M2.7 and the multimodal M3 that followed. The context window is about 205,000 tokens. NVIDIA cites a 200K input length,^[4] OpenRouter lists 204,800 tokens for input and output combined, and the maximum output is 131,072 tokens.^[7]
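The two limits interact when budgeting a request. As a rough sketch, assuming the 204,800-token figure is a hard cap on prompt plus completion and the 131,072-token figure is an independent cap on the completion alone, the room left for output can be computed as follows. The function name `max_completion_tokens` is hypothetical, and real providers may enforce the limits differently.

```python
CONTEXT_WINDOW = 204_800  # input and output combined, per OpenRouter's listing
MAX_OUTPUT = 131_072      # maximum output tokens

def max_completion_tokens(prompt_tokens: int) -> int:
    """Largest completion that fits, assuming both limits are simple caps."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    return max(0, min(MAX_OUTPUT, remaining))

print(max_completion_tokens(10_000))   # short prompt: output cap applies
print(max_completion_tokens(100_000))  # long prompt: window cap applies
```

Under these assumptions the output cap is the binding limit until the prompt exceeds 73,728 tokens, after which the combined window takes over.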

Specification	Detail
Developer	MiniMax
Release date	March 18, 2026^[1]
Architecture	Sparse mixture of experts (text only)
Total parameters	About 230 billion
Active parameters	About 10 billion per token
Experts	256 local experts, 8 activated per token^[4]
Context window	About 205,000 tokens (131,072 max output)^[7]
Predecessor	MiniMax M2.5
Successor	MiniMax M3

The self-development claim

The headline feature, and the thing worth treating carefully, is what MiniMax called "self-evolution." The company titled its announcement "Early Echoes of Self-Evolution,"^[1] and the framing is deliberately modest. M2.7 did not train itself end to end. What MiniMax described is narrower: an internal version of the model acted as an autonomous research assistant inside the team's own reinforcement learning pipeline, and handled part of the iterative engineering work that usually falls to human researchers.

MiniMax reported two concrete examples. In the first, an internal M2.7 ran an iterative loop over its own training scaffold: analyze failed runs, plan a change, modify the scaffold code, run an evaluation, compare results, then decide to keep or revert the change. The company says this loop ran autonomously for more than 100 rounds and produced about a 30% improvement on internal evaluation sets, including by searching for better sampling settings such as temperature, frequency penalty, and presence penalty.^[1] In the second, the model competed on machine-learning engineering tasks, iterating over multiple 24-hour trials.
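The keep-or-revert pattern in the first example is essentially a greedy search. The sketch below illustrates that pattern on sampling settings only. Everything in it is a stand-in: `toy_score` replaces a real evaluation run, `propose_change` replaces the model's planning step with a random nudge, and the target values are arbitrary. It is not MiniMax's scaffold.

```python
import random

def toy_score(settings):
    """Stand-in for an evaluation run; peaks at arbitrary target values."""
    return -((settings["temperature"] - 0.7) ** 2
             + (settings["frequency_penalty"] - 0.2) ** 2
             + (settings["presence_penalty"] - 0.2) ** 2)

def propose_change(settings, rng):
    """Stand-in for the 'plan a change' step: nudge one setting at random."""
    candidate = dict(settings)
    key = rng.choice(sorted(candidate))
    candidate[key] = round(candidate[key] + rng.uniform(-0.1, 0.1), 3)
    return candidate

def keep_or_revert(settings, evaluate, rounds=100, seed=0):
    """Try one change per round; keep it only if the score improves."""
    rng = random.Random(seed)
    best, best_score = dict(settings), evaluate(settings)
    history = []
    for round_no in range(1, rounds + 1):
        candidate = propose_change(best, rng)
        score = evaluate(candidate)
        kept = score > best_score
        if kept:
            best, best_score = candidate, score
        # When not kept, `best` is left untouched, which is the revert.
        history.append((round_no, kept, score))
    return best, best_score, history

start = {"temperature": 1.0, "frequency_penalty": 0.0, "presence_penalty": 0.0}
best, best_score, history = keep_or_revert(start, toy_score)
print(best, sum(1 for _, kept, _ in history if kept), "changes kept")
```

Because a change is kept only when it scores strictly higher, the best score never decreases from round to round. The reported loop differs in the part that matters most: there the model analyzes failures and plans each change rather than sampling it at random.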

To run that loop, the team built a deliberately simple harness with three parts: a short-term memory written to a markdown file after each round, a self-feedback step where the model criticizes its own latest results, and a self-optimization step where it proposes what to try next. MiniMax says this "Agentic Researcher" setup let M2.7 cover an estimated 30% to 50% of the reinforcement learning workflow inside its team, with human researchers stepping in for the important decisions.^[1] VentureBeat, which covered the launch, described the model in those terms: self-evolving, and able to perform 30% to 50% of the RL research workflow.^[2]
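The three parts of the harness map onto a small amount of bookkeeping. The sketch below is a guess at the shape only: the file name `memory.md`, the entry layout, and the rule-based `self_feedback` and `self_optimize` functions are all hypothetical placeholders for steps that, in the described setup, the model performs itself.

```python
import tempfile
from pathlib import Path

def self_feedback(result: dict) -> str:
    """Placeholder critique; in the described setup the model writes this."""
    if result["score"] >= result["previous_score"]:
        return "Score held or improved; the last change looks worth keeping."
    return "Score dropped; the last change likely hurt and should be reverted."

def self_optimize(critique: str) -> str:
    """Placeholder proposal for the next round, derived from the critique."""
    if "reverted" in critique:
        return "Revert the change and try a smaller adjustment."
    return "Keep the change and explore a nearby setting."

def record_round(memory_path: Path, round_no: int, result: dict) -> str:
    """Append one round's notes to the markdown memory file."""
    critique = self_feedback(result)
    next_step = self_optimize(critique)
    entry = (f"## Round {round_no}\n"
             f"- Score: {result['score']} (previous {result['previous_score']})\n"
             f"- Feedback: {critique}\n"
             f"- Next: {next_step}\n\n")
    with memory_path.open("a", encoding="utf-8") as fh:
        fh.write(entry)
    return next_step

# Illustrative scores only; they are not figures from MiniMax.
memory = Path(tempfile.mkdtemp()) / "memory.md"
record_round(memory, 1, {"score": 0.42, "previous_score": 0.40})
record_round(memory, 2, {"score": 0.39, "previous_score": 0.42})
notes = memory.read_text(encoding="utf-8")
print(notes)
```

Keeping the memory as an append-only text file means each new round can start by rereading what was tried before, which is one plausible reason to prefer a simple file over a more elaborate store.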

A few caveats are worth keeping in mind. These figures come from MiniMax's own internal evaluations and have not been independently reproduced. "Self-evolution" here means an agent automating chunks of an existing engineering pipeline, not a model autonomously rewriting its own weights or goals. And the 30% to 50% range is an estimate of workflow coverage from the company's RL team, not a precisely measured benchmark. The capability is real and notable, but the marketing word does more lifting than the underlying mechanism does.

Benchmarks

MiniMax positioned M2.7 mainly as a coding and agentic model, and its reported scores cluster around software engineering and tool-use tasks. The numbers below come from MiniMax's announcement and were repeated across coverage of the release.^[1]

Benchmark	Score	What it measures
SWE-Pro	56.22%^[3]	Real-world software engineering tasks
Terminal Bench 2	57.0%^[3]	Command-line and terminal tasks
SWE Multilingual	76.5	Software fixes across multiple languages
Multi SWE Bench	52.7	Multi-repository software engineering
VIBE-Pro	55.6%	Coding and product tasks
NL2Repo	39.8%	Building a repository from a natural-language spec
MLE Bench Lite	66.6% medal rate	22 machine-learning engineering competitions
GDPval-AA	1495 ELO	Economically valuable knowledge work
Toolathon	46.3%	Tool-use accuracy
MM Claw	62.7%	End-to-end agent task accuracy

On the MLE Bench Lite set of 22 competitions, MiniMax reported a best run of 9 gold, 5 silver, and 1 bronze medal, and an average medal rate of 66.6% across its trials.^[1] The company framed the SWE-Pro result as approaching the level of much larger frontier systems while running at a fraction of their cost. As with the self-development numbers, these are vendor-reported scores; readers comparing models should check independent leaderboards where available.
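A quick arithmetic check helps separate the two MLE Bench Lite figures. Assuming a medal rate is simply medals won divided by competitions entered (the helper `medal_rate` is hypothetical), the best run works out slightly higher than the 66.6% average, which is what one would expect if the average covers several trials.

```python
def medal_rate(gold: int, silver: int, bronze: int, competitions: int) -> float:
    """Fraction of competitions in which any medal was earned."""
    return (gold + silver + bronze) / competitions

best_run = medal_rate(gold=9, silver=5, bronze=1, competitions=22)
print(f"{best_run:.3f}")  # 15 medals out of 22 competitions
```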

Availability and pricing

At launch on March 18, 2026, M2.7 was offered through the MiniMax Agent product and the MiniMax API platform, including a dedicated coding plan, and through OpenRouter.^[1] NVIDIA also made it available as an optimized NIM microservice in its catalog, and reported throughput gains on Blackwell Ultra hardware over the following month.^[4]

Open weights followed in April 2026, when MiniMax published the model to Hugging Face under the repository MiniMaxAI/MiniMax-M2.7.^[3]^[6] The license is a modified-MIT variant that restricts commercial use without written authorization from MiniMax, which makes M2.7 open weights rather than fully open source in the usual sense.^[6] Quantized community builds, including NVFP4 and GGUF formats, appeared shortly after for local use. On hosted APIs the model was priced cheaply for its capability, around $0.30 per million input tokens and $1.20 per million output tokens, with a discounted rate for cache hits.^[7]
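For a sense of scale, the listed rates can be turned into a per-request estimate. The sketch below uses the approximate prices quoted above and ignores the cache-hit discount, since its rate is not given here. The function name `estimate_cost_usd` is hypothetical, and actual billing may differ by provider.

```python
INPUT_USD_PER_MILLION = 0.30   # approximate hosted price per million input tokens
OUTPUT_USD_PER_MILLION = 1.20  # approximate hosted price per million output tokens

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated charge in US dollars, without any cache-hit discount."""
    return (input_tokens * INPUT_USD_PER_MILLION
            + output_tokens * OUTPUT_USD_PER_MILLION) / 1_000_000

# A hypothetical agentic coding turn: large prompt, moderate completion.
print(f"${estimate_cost_usd(200_000, 50_000):.2f}")
```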

References

[1] MiniMax. "MiniMax M2.7: Early Echoes of Self-Evolution." minimax.io. https://www.minimax.io/news/minimax-m27-en
[2] VentureBeat. "New MiniMax M2.7 proprietary AI model is 'self-evolving' and can perform 30-50% of reinforcement learning research workflow." March 2026. https://venturebeat.com/technology/new-minimax-m2-7-proprietary-ai-model-is-self-evolving-and-can-perform-30-50
[3] MarkTechPost. "MiniMax Just Open Sourced MiniMax M2.7: A Self-Evolving Agent Model that Scores 56.22% on SWE-Pro and 57.0% on Terminal Bench 2." April 12, 2026. https://www.marktechpost.com/2026/04/12/minimax-just-open-sourced-minimax-m2-7-a-self-evolving-agent-model-that-scores-56-22-on-swe-pro-and-57-0-on-terminal-bench-2/
[4] NVIDIA. "MiniMax M2.7 Advances Scalable Agentic Workflows on NVIDIA Platforms for Complex AI Applications." NVIDIA Technical Blog. https://developer.nvidia.com/blog/minimax-m2-7-advances-scalable-agentic-workflows-on-nvidia-platforms-for-complex-ai-applications/
[5] Artificial Analysis. "MiniMax-M2.7 - Intelligence, Performance & Price Analysis." https://artificialanalysis.ai/models/minimax-m2-7
[6] Hugging Face. "MiniMaxAI/MiniMax-M2.7." https://huggingface.co/MiniMaxAI/MiniMax-M2.7
[7] OpenRouter. "MiniMax M2.7 - API Pricing & Benchmarks." https://openrouter.ai/minimax/minimax-m2.7
