MiniMax: MiniMax 01

MiniMax-01

MiniMax-01 integrates MiniMax-Text-01 for text generation and MiniMax-VL-01 for image understanding. It has 456B parameters, of which 45.9B are active at inference time, and supports context lengths of up to 4 million tokens.
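As a quick check on the parameter figures quoted above, the short sketch below computes the share of parameters that is active at inference time. It uses only the two numbers from the description; the function and variable names are illustrative.

```python
# Parameter counts quoted in the description, in billions.
TOTAL_PARAMS_B = 456.0
ACTIVE_PARAMS_B = 45.9


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the active share of parameters as a fraction in [0, 1]."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active share: {fraction:.1%}")  # prints "Active share: 10.1%"
```

Roughly one tenth of the weights take part in any single forward step, which is the usual motivation for a Mixture-of-Experts design.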

The text component uses a hybrid architecture that blends Lightning Attention, Softmax Attention, and Mixture-of-Experts (MoE). The vision component follows a “ViT-MLP-LLM” framework, trained on top of the text model to enable advanced multimodal reasoning.
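To illustrate the Mixture-of-Experts idea mentioned above, here is a minimal sketch of top-k expert routing in plain Python. It is not MiniMax's implementation: the number of experts, the value of k, the toy experts, and the hand-written router scores are all hypothetical, and a real model would compute the router scores from the token with a learned layer.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the outputs of the selected experts only."""
    out = [0.0] * len(x)
    for idx, weight in route_top_k(router_logits, k):
        y = experts[idx](x)  # only the chosen experts are evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out


# Hypothetical toy setup: four experts, each of which just scales its input.
experts = [lambda v, s=s: [s * vi for vi in v] for s in (1.0, 2.0, 3.0, 4.0)]
router_logits = [0.1, 2.0, -1.0, 2.0]  # assumed router scores for one token

selected = route_top_k(router_logits, k=2)
output = moe_forward([1.0, -1.0], experts, router_logits, k=2)
```

Only the selected experts run for a given token, which is why the active parameter count can be much smaller than the total.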

Creator: MiniMax
Release Date: January 2025
License: MiniMax Model License Agreement
Context Window: 1,000,192 tokens
Image Input Support: No
Open Source (Weights): Yes
Parameters: 456B, 45.9B active at inference time

Performance Benchmarks

Core Academic Benchmarks

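| Tasks | GPT-4o (11-20) | Claude-3.5-Sonnet (10-22) | Gemini-1.5-Pro (002) | Gemini-2.0-Flash (exp) | Qwen2.5-72B-Inst. | DeepSeek-V3 | Llama-3.1-405B-Inst. | MiniMax-Text-01 |
|---|---|---|---|---|---|---|---|---|
| General | | | | | | | | |
| MMLU* | 85.7 | 88.3 | 86.8 | 86.5 | 86.1 | 88.5 | 88.6 | 88.5 |
| MMLU-Pro* | 74.4 | 78.0 | 75.8 | 76.4 | 71.1 | 75.9 | 73.3 | 75.7 |
| SimpleQA | 39.0 | 28.1 | 23.4 | 26.6 | 10.3 | 24.9 | 23.2 | 23.7 |
| C-SimpleQA | 64.6 | 56.8 | 59.4 | 63.3 | 52.2 | 64.8 | 54.7 | 67.4 |
| IFEval (avg) | 84.1 | 90.1 | 89.4 | 88.4 | 87.2 | 87.3 | 86.4 | 89.1 |
| Arena-Hard | 92.4 | 87.6 | 85.3 | 72.7 | 81.2 | 91.4 | 63.5 | 89.1 |
| Reasoning | | | | | | | | |
| GPQA* (diamond) | 46.0 | 65.0 | 59.1 | 62.1 | 49.0 | 59.1 | 50.7 | 54.4 |
| DROP* (F1) | 89.2 | 88.8 | 89.2 | 89.3 | 85.0 | 91.0 | 92.5 | 87.8 |
| Mathematics | | | | | | | | |
| GSM8k* | 95.6 | 96.9 | 95.2 | 95.4 | 95.8 | 96.7 | 96.7 | 94.8 |
| MATH* | 76.6 | 74.1 | 84.6 | 83.9 | 81.8 | 84.6 | 73.8 | 77.4 |
| Coding | | | | | | | | |
| MBPP + | 76.2 | 75.1 | 75.4 | 75.9 | 77.0 | 78.8 | 73.0 | 71.7 |
| HumanEval | 90.2 | 93.7 | 86.6 | 89.6 | 86.6 | 92.1 | 89.0 | 86.9 |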

Ruler

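| Model | 4k | 8k | 16k | 32k | 64k | 128k | 256k | 512k | 1M |
|---|---|---|---|---|---|---|---|---|---|
| GPT-4o (11-20) | 0.970 | 0.921 | 0.890 | 0.888 | 0.884 | - | - | - | - |
| Claude-3.5-Sonnet (10-22) | 0.965 | 0.960 | 0.957 | 0.950 | 0.952 | 0.938 | - | - | - |
| Gemini-1.5-Pro (002) | 0.962 | 0.960 | 0.960 | 0.958 | 0.938 | 0.917 | 0.916 | 0.861 | 0.850 |
| Gemini-2.0-Flash (exp) | 0.960 | 0.960 | 0.951 | 0.957 | 0.937 | 0.860 | 0.797 | 0.709 | - |
| MiniMax-Text-01 | 0.963 | 0.961 | 0.953 | 0.954 | 0.943 | 0.947 | 0.945 | 0.928 | 0.910 |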

LongBench V2

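| Model | overall | easy | hard | short | medium | long |
|---|---|---|---|---|---|---|
| Human | 53.7 | 100.0 | 25.1 | 47.2 | 59.1 | 53.7 |
| w/ CoT | | | | | | |
| GPT-4o (11-20) | 51.4 | 54.2 | 49.7 | 59.6 | 48.6 | 43.5 |
| Claude-3.5-Sonnet (10-22) | 46.7 | 55.2 | 41.5 | 53.9 | 41.9 | 44.4 |
| DeepSeek-V3 | - | - | - | - | - | - |
| Qwen2.5-72B-Inst. | 43.5 | 47.9 | 40.8 | 48.9 | 40.9 | 39.8 |
| MiniMax-Text-01 | 56.5 | 66.1 | 50.5 | 61.7 | 56.7 | 47.2 |
| w/o CoT | | | | | | |
| GPT-4o (11-20) | 50.1 | 57.4 | 45.6 | 53.3 | 52.4 | 40.2 |
| Claude-3.5-Sonnet (10-22) | 41.0 | 46.9 | 37.3 | 46.1 | 38.6 | 37.0 |
| DeepSeek-V3 | 48.7 | - | - | - | - | - |
| Qwen2.5-72B-Inst. | 42.1 | 42.7 | 41.8 | 45.6 | 38.1 | 44.4 |
| MiniMax-Text-01 | 52.9 | 60.9 | 47.9 | 58.9 | 52.6 | 43.5 |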

MTOB

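| Context Type | no context | half book | full book | Δ half book | Δ full book |
|---|---|---|---|---|---|
| eng → kalam (ChrF) | | | | | |
| GPT-4o (11-20) | 9.90 | 54.30 | - | 44.40 | - |
| Claude-3.5-Sonnet (10-22) | 20.22 | 53.62 | 55.65 | 33.39 | 35.42 |
| Gemini-1.5-Pro (002) | 16.79 | 53.68 | 57.90 | 36.89 | 41.11 |
| Gemini-2.0-Flash (exp) | 12.20 | 49.50 | 53.30 | 37.30 | 41.10 |
| Qwen-Long | 16.55 | 48.48 | 45.94 | 31.92 | 29.39 |
| MiniMax-Text-01 | 6.0 | 51.74 | 51.60 | 45.7 | 45.6 |
| kalam → eng (BLEURT) | | | | | |
| GPT-4o (11-20) | 33.20 | 58.30 | - | 25.10 | - |
| Claude-3.5-Sonnet (10-22) | 31.42 | 59.70 | 62.30 | 28.28 | 30.88 |
| Gemini-1.5-Pro (002) | 32.02 | 61.52 | 63.09 | 29.50 | 31.07 |
| Gemini-2.0-Flash (exp) | 33.80 | 57.50 | 57.00 | 23.70 | 23.20 |
| Qwen-Long | 30.13 | 53.14 | 32.15 | 23.01 | 2.02 |
| MiniMax-Text-01 | 33.65 | 57.10 | 58.00 | 23.45 | 24.35 |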
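The Δ columns in the MTOB table appear to be the with-context score minus the no-context score. The sketch below recomputes them for three eng → kalam (ChrF) rows copied from the table; the helper name and data layout are illustrative, and `None` stands for a cell that is empty in the table.

```python
# eng -> kalam (ChrF) scores from the MTOB table:
# (no context, half book, full book). None marks an empty cell.
scores = {
    "GPT-4o (11-20)": (9.90, 54.30, None),
    "Gemini-1.5-Pro (002)": (16.79, 53.68, 57.90),
    "MiniMax-Text-01": (6.0, 51.74, 51.60),
}


def delta(with_context, no_context):
    """Gain from adding context, or None if the with-context score is missing."""
    if with_context is None:
        return None
    return round(with_context - no_context, 2)


# Map each model to (delta half book, delta full book).
deltas = {
    model: (delta(half, base), delta(full, base))
    for model, (base, half, full) in scores.items()
}

for model, (d_half, d_full) in deltas.items():
    print(f"{model}: half book {d_half}, full book {d_full}")
```

The recomputed values agree with the table up to rounding; for example, MiniMax-Text-01 comes out at 45.74 for the half book, which the table reports as 45.7.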
