SoTA Feed — Every open-weights release from the labs that matter

Ad: Read SoTA Feed without this slot — ad-free site plus a personal ad-free feed URL $3/month

MiniMax-M2

Oct 22, 2025 · MiniMax · license: other · view on Hugging Face ↗
230 GB · MoE: 229B total, ≈11B (≈11.1 GB) active


Join Our 💬 WeChat | 🧩 Discord community.
MiniMax Agent | ⚡️ API (Now Free for a limited time!) | MCP | MiniMax Website
🤗 Hugging Face | 🐙 GitHub | 🤖️ ModelScope | 📄 License: MIT

Meet MiniMax-M2

Today, we release and open source MiniMax-M2, a Mini model built for Max coding & agentic workflows.

MiniMax-M2 redefines efficiency for agents. It's a compact, fast, and cost-effective MoE model (230 billion total parameters with 10 billion active parameters) built for elite performance in coding and agentic tasks, all while maintaining powerful general intelligence. With just 10 billion activated parameters, MiniMax-M2 provides the sophisticated, end-to-end tool use performance expected from today's leading models, but in a streamlined form factor that makes deployment and scaling easier than ever.


Highlights

Superior Intelligence. According to benchmarks from Artificial Analysis, MiniMax-M2 demonstrates highly competitive general intelligence across mathematics, science, instruction following, coding, and agentic tool use. Its composite score ranks #1 among open-source models globally.

Advanced Coding. Engineered for end-to-end developer workflows, MiniMax-M2 excels at multi-file edits, coding-run-fix loops, and test-validated repairs. Strong performance on Terminal-Bench and (Multi-)SWE-Bench–style tasks demonstrates practical effectiveness in terminals, IDEs, and CI across languages.

Agent Performance. MiniMax-M2 plans and executes complex, long-horizon toolchains across shell, browser, retrieval, and code runners. In BrowseComp-style evaluations, it consistently locates hard-to-surface sources, maintains evidence traceable, and gracefully recovers from flaky steps.

Efficient Design. With 10 billion activated parameters (230 billion in total), MiniMax-M2 delivers lower latency, lower cost, and higher throughput for interactive agents and batched sampling—perfectly aligned with the shift toward highly deployable models that still shine on coding and agentic tasks.


Coding & Agentic Benchmarks

These comprehensive evaluations test real-world end-to-end coding and agentic tool use: editing real repos, executing commands, browsing the web, and delivering functional solutions. Performance on this suite correlates with day-to-day developer experience in terminals, IDEs, and CI.

BenchmarkMiniMax-M2Claude Sonnet 4Claude Sonnet 4.5Gemini 2.5 ProGPT-5 (thinking)GLM-4.6Kimi K2 0905DeepSeek-V3.2
SWE-bench Verified69.472.7 *77.2 *63.8 *74.9 *68 *69.2 *67.8 *
Multi-SWE-Bench36.235.7 *44.3//3033.530.6
SWE-bench Multilingual56.556.9 *68//53.855.9 *57.9 *
Terminal-Bench46.336.4 *50 *25.3 *43.8 *40.5 *44.5 *37.7 *
ArtifactsBench66.857.3*61.557.7*73*59.854.255.8
BrowseComp4412.219.69.954.9*45.1*14.140.1*
BrowseComp-zh48.529.140.832.26549.528.847.9*
GAIA (text only)75.768.371.260.276.471.960.263.5
xbench-DeepSearch7264.6665677.8706171
HLE (w/ tools)31.820.324.528.4 *35.2 *30.4 *26.9 *27.2 *
τ²-Bench77.265.5*84.7*59.280.1*75.9*70.366.7
FinSearchComp-global65.54260.842.6*63.9*29.229.5*26.2
AgentCompany36374139.3*/353034

Notes: Data points marked with an asterisk (*) are taken directly from the model's official tech report or blog. All other metrics were obtained using the evaluation methods described below.


Intelligence Benchmarks

We align with Artificial Analysis, which aggregates challenging benchmarks using a consistent methodology to reflect a model’s broader intelligence profile across math, science, instruction following, coding, and agentic tool use.

Metric (AA)MiniMax-M2Claude Sonnet 4Claude Sonnet 4.5Gemini 2.5 ProGPT-5 (thinking)GLM-4.6Kimi K2 0905DeepSeek-V3.2
AIME257874888894865788
MMLU-Pro8284888687838285
GPQA-Diamond7878838485787780
HLE (w/o tools)12.59.617.321.126.513.36.313.8
LiveCodeBench (LCB)8366718085706179
SciCode3640454343383138
IFBench7255574973434254
AA-LCR6165666676545269
τ²-Bench-Telecom8765785485717334
Terminal-Bench-Hard2430332531232329
AA Intelligence6157636069565057

AA: All scores of MiniMax-M2 aligned with Artificial Analysis Intelligence Benchmarking Methodology (https://artificialanalysis.ai/methodology/intelligence-benchmarking). All scores of other models reported from https://artificialanalysis.ai/.


Why activation size matters

By maintaining activations around 10B , the plan → act → verify loop in the agentic workflow is streamlined, improving responsiveness and reducing compute overhead:

In short: 10B activations = responsive agent loops + better unit economics.

At a glance

If you need frontier-style coding and agents without frontier-scale costs, MiniMax-M2 hits the sweet spot: fast inference speeds, robust tool-use capabilities, and a deployment-friendly footprint.

We look forward to your feedback and to collaborating with developers and researchers to bring the future of intelligent collaboration one step closer.

How to Use

Local Deployment Guide

Download the model from HuggingFace repository: https://huggingface.co/MiniMaxAI/MiniMax-M2. We recommend using the following inference frameworks (listed alphabetically) to serve the model:

SGLang

We recommend using SGLang to serve MiniMax-M2. SGLang provides solid day-0 support for MiniMax-M2 model. Please refer to our SGLang Deployment Guide for more details, and thanks so much for our collaboration with the SGLang team.

vLLM

We recommend using vLLM to serve MiniMax-M2. vLLM provides efficient day-0 support of MiniMax-M2 model, check https://docs.vllm.ai/projects/recipes/en/latest/MiniMax/MiniMax-M2.html for latest deployment guide. We also provide our vLLM Deployment Guide.

MLX

We recommend using MLX-LM to serve MiniMax-M2. Please refer to our MLX Deployment Guide for more details.

Transformers

We recommend using Transformers to serve MiniMax-M2. Please refer to our Transformers Deployment Guide for more details.

Inference Parameters

We recommend using the following parameters for best performance: temperature=1.0, top_p = 0.95, top_k = 40.

IMPORTANT: MiniMax-M2 is an interleaved thinking model. Therefore, when using it, it is important to retain the thinking content from the assistant's turns within the historical messages. In the model's output content, we use the <think>...</think> format to wrap the assistant's thinking content. When using the model, you must ensure that the historical content is passed back in its original format. Do not remove the <think>...</think> part, otherwise, the model's performance will be negatively affected.

Tool Calling Guide

Please refer to our Tool Calling Guide.

Community Showcases

The projects below are built and maintained by the community/partners. They are not official MiniMax products, and results may vary.

Contact Us

Contact us at model@minimax.io | WeChat.

← all releases