Leaderboard

Best AI for Math 2026.

Find the best AI for math. Ranked by AIME 2025, MATH-500, FrontierMath, and GPQA Diamond. Compare mathematical reasoning across all major LLMs.

Kimi K2-Thinking-0905 (OSS)	Moonshot AI	—	—	—	100%	—	—	—
gpt-5.2-pro	OpenAI	87.9%	—	8.7%	100%	—	—	100%
gpt-5.2	OpenAI	85.4%	40.3%	7.6%	100%	—	—	—
Gemini 3 Pro Preview	Google	91.9%	—	—	100%	—	—	23.4%
Claude Opus 4.6	Anthropic	—	—	—	99.8%	—	—	—
Gemini 3 Flash Preview	Google	—	—	—	99.7%	—	—	—
LongCat-Flash-Thinking-2601 (OSS)	Meituan	—	—	—	99.6%	—	—	—
GPT-5.1 High	OpenAI	—	—	—	99.6%	—	—	—
Nemotron 3 Nano (30B A3B) (OSS)	NVIDIA	—	—	—	99.2%	—	—	—
Kimi K2 Thinking	Moonshot AI	84.5%	—	—	99.1%	—	—	—
GPT OSS 20B High (OSS)	OpenAI	—	—	—	98.7%	—	—	—
o3	OpenAI	83.3%	15.8%	0%	98.4%	91.6%	—	—
GPT-5.1 Medium	OpenAI	—	—	—	98.4%	—	—	—
Step-3.5-Flash (OSS)	StepFun	—	—	—	97.3%	—	—	—
GPT-5.1 Codex High	OpenAI	—	—	—	96.7%	—	—	—
Kimi K2.5 (OSS)	Moonshot AI	—	—	—	96.1%	—	—	—
GLM-4.7 (OSS)	Z AI	—	—	—	95.7%	—	—	84.9%
GPT-5 High	OpenAI	—	—	—	94.6%	—	—	—
gpt-5	OpenAI	87.3%	26.3%	—	94.6%	—	—	99.6%
MiMo-V2-Flash (OSS)	Xiaomi	—	—	—	94.1%	—	—	—
GPT-5.1 Thinking	OpenAI	—	26.7%	—	94%	—	—	—
GPT-5.1 Instant	OpenAI	—	26.7%	—	94%	—	—	—
gpt-5.1	OpenAI	88.1%	26.7%	—	94%	—	—	94%
GLM-4.6 (OSS)	Z AI	—	—	—	93.9%	—	—	—
Grok 3	xAI	84.6%	—	—	93.3%	93.3%	—	—
DeepSeek V3.2 (OSS)	DeepSeek	—	—	—	93.1%	—	—	—
K-EXAONE-236B-A23B	LG AI Research	—	—	—	92.8%	—	—	—
o4-mini	OpenAI	81.4%	—	—	92.7%	93.4%	—	—
GPT OSS 120B High (OSS)	OpenAI	—	—	—	92.5%	—	—	—
Qwen3-235B-A22B-Thinking-2507 (OSS)	Qwen	—	—	—	92.3%	—	—	—
Grok 4 Fast	xAI	—	—	—	92%	—	—	—
Grok 4	xAI	87.5%	—	—	91.7%	94%	—	—
GLM-4.7-Flash (OSS)	Z AI	—	—	—	91.6%	—	—	—
Mercury 2	Inception	—	—	—	91.1%	—	—	—
gpt-5-mini	OpenAI	—	22.1%	—	91.1%	—	—	22.1%
Grok 3 Mini	xAI	—	—	—	90.8%	—	—	95.8%
LongCat-Flash-Thinking (OSS)	Meituan	—	—	—	90.6%	—	—	—
Qwen3 VL 235B A22B Thinking (OSS)	Qwen	—	—	—	89.7%	—	—	—
DeepSeek-V3.2-Exp (OSS)	DeepSeek	—	—	—	89.3%	—	—	—
GPT-5 Medium	OpenAI	—	—	—	88.9%	—	—	—
Gemini 2.5 Pro Preview 06-05	Google	—	—	—	88%	—	—	—
Qwen3-Next-80B-A3B-Thinking (OSS)	Qwen	—	—	—	87.8%	—	—	—
DeepSeek-R1-0528 (OSS)	DeepSeek	—	—	—	87.5%	—	—	—
Claude Sonnet 4.5	Anthropic	83.4%	—	—	87%	—	—	—
Mistral Medium 3.5 (OSS)	Mistral	—	—	—	86.3%	—	—	—
gpt-5-nano	OpenAI	—	9.6%	—	85.2%	—	—	—
Ministral 3 (14B Reasoning 2512) (OSS)	Mistral	—	—	—	85%	—	—	—
Mistral Small 4 (OSS)	Mistral	—	—	—	83.8%	—	—	—
Qwen3 VL 30B A3B Thinking (OSS)	Qwen	—	—	—	83.1%	—	—	—
Gemini 2.5 Pro	Google	86.4%	—	—	83%	92%	—	97.1%

Showing 1–50 of 382 models

All Large Language Models

OpenAI

70 models

GPT-5.6 Luna·GPT-5.6 Sol·GPT-5.6 Terra·gpt-5.5-instant·gpt-5.5·gpt-5.5-pro·gpt-5.4-mini·gpt-5.4-nano·gpt-5.4·gpt-5.4-pro·gpt-5.3-chat-latest·gpt-5.3-instant·gpt-5.3-codex·GPT-5.2 Codex·gpt-5.2·gpt-5.1-codex·gpt-5.1-codex-mini·GPT-5.1 Codex High·GPT-5.1 High·GPT-5.1 Instant·GPT-5.1 Medium·GPT-5.1 Thinking·gpt-5-codex·gpt-5.2-pro·gpt-5·GPT-5 High·GPT-5 Medium·gpt-5-mini·gpt-5-nano·GPT OSS 120B·GPT OSS 120B High·GPT OSS 20B·GPT OSS 20B High·o3-pro·o3·o4-mini·gpt-4.1·gpt-4.1-mini·gpt-4.1-nano·o1-pro·gpt-4.5-preview·o3-mini·o1·gpt-5-pro·gpt-5.1·o1 mini·o1 preview·o1-mini-2024-09-12·o1-preview-2024-09-12·GPT-4o-2024-08-06·GPT-4o-mini·GPT-4o·GPT-4o-2024-05-13·GPT-4-turbo-2024-04-09·GPT-3.5-turbo-0125·GPT-4-turbo-0125·GPT-3.5-turbo-1106·GPT-4 Turbo·GPT-4-turbo-1106·computer-use-preview·gpt-4o-audio-preview·GPT-3.5-turbo-16k·GPT-4-0613·GPT-4-32k-0613·GPT-3.5 Turbo·GPT-4·GPT-4-32k·GPT-3.5-turbo·gpt-5.5-chat-latest·gpt-realtime-2

Cursor

2 models

Composer 2·Composer 2 (Fast)

Perplexity

5 models

Llama 3.1 70b·Sonar·R1 1776·Sonar Deep Research·Sonar Pro

DeepSeek

15 models

DeepSeek-V4-Flash-Max·DeepSeek-V4-Pro-Max·DeepSeek V3.2·deepseek-chat·DeepSeek-V3.2-Exp·DeepSeek-R1-0528·DeepSeek-V3 0324·DeepSeek R1·DeepSeek R1 Distill Llama 70B·DeepSeek R1 Distill Qwen 32B·DeepSeek V3.1·DeepSeek V3·DeepSeek-V2.5·DeepSeek V2·deepseek-coder

Groq

9 models

Llama 4 Maverick (Groq)·Llama 4 Scout (Groq)·DeepSeek R1 (Groq)·Llama 3.3 70B (Groq)·Llama 3.1 70b·Llama 3.1 8B (Groq)·Mixtral 8x7B·Mistral Small 2503 (Groq)·Qwen QwQ 32B (Groq)

MiniMax

5 models

MiniMax M2.7·MiniMax M2.5·MiniMax M2.1·MiniMax M2·MiniMax M1 80K

01.ai

1 model

AI21 Labs

2 models

Jamba 1.5 Large·Jamba 1.5 Mini

Amazon

10 models

Amazon Nova 2 Lite·Amazon Nova Lite·Amazon Nova 2 Omni·Amazon Nova Premier·Amazon Nova Micro·Amazon Nova Pro·Nova Lite·Nova Micro·Nova Pro·Amazon Nova 2 Pro

Anyscale

2 models

Llama 3 70b·Mixtral 8x7B

Baidu

1 model

Cerebras

5 models

Llama 3.3 70B (Cerebras)·DeepSeek R1 (Cerebras)·DeepSeek V3 (Cerebras)·Llama 3.1 8B (Cerebras)·Qwen 2.5 32B (Cerebras)

Cohere

4 models

Command R7B·Command R·Command R+·Command A

DeepInfra

6 models

Llama 3.3 70B (DeepInfra)·Llama 3.1 405B (DeepInfra)·DeepSeek R1 (DeepInfra)·DeepSeek V3 (DeepInfra)·Mistral Small 2501 (DeepInfra)·Qwen 2.5 72B (DeepInfra)

Fireworks

10 models

DeepSeek V3.2 (Fireworks)·DeepSeek R1 (Fireworks)·Llama 4 Maverick (Fireworks)·Llama 3.1 405b·Llama 3.1 405B (Fireworks)·Llama 3.1 70b·Yi-Large·Mixtral 8x7B·Kimi K2.5 (Fireworks)·Llama 4 Scout (Fireworks)

Hyperbolic

4 models

Llama 3.1 405B (Hyperbolic)·DeepSeek R1 (Hyperbolic)·DeepSeek V3 (Hyperbolic)·Qwen 3 235B A22B (Hyperbolic)

IBM

1 model

Granite 3.3 8B Instruct

IBM Watsonx

6 models

Granite 3.1 8B Instruct·Llama 3.1 405b·Llama 3.1 70b·Mistral Large·Granite 3.1 2B Instruct·Granite 3.2 8B Instruct

Inception

1 model

inclusionAI

1 model

LG AI Research

1 model

K-EXAONE-236B-A23B

Meituan

4 models

LongCat-Flash-Lite·LongCat-Flash-Thinking-2601·LongCat-Flash-Thinking·LongCat-Flash-Chat

Meta

13 models

Llama 4 Maverick·Llama 4 Scout·Llama 3.3 70B·Llama 3.3 70B Instruct·Llama 3.2 11B Instruct·Llama 3.2 3B Instruct·Llama 3.2 90B Instruct·Llama 3.1 405B·Llama 3.1 405B Instruct·Llama 3.1 70B·Llama 3.1 70B Instruct·Llama 3.1 8B·Llama 3.1 8B Instruct

Microsoft

3 models

Phi-4-multimodal-instruct·Phi 4·Phi-3.5-mini-instruct

Moonshot AI

7 models

Kimi K3·Kimi K2.6·Kimi K2.5·Kimi K2 0905·Kimi K2-Thinking-0905·Kimi K2 Instruct·Kimi K2 Thinking

Nebius AI Studio

5 models

Qwen 3 235B A22B (Nebius)·DeepSeek V3 (Nebius)·DeepSeek R1 (Nebius)·Llama 3.1 405B (Nebius)·Llama 3.1 70B (Nebius)

Nous Research

1 model

NVIDIA

1 model

Nemotron 3 Nano (30B A3B)

Reka

3 models

Reka Core·Reka Edge·Reka Flash

Replicate

5 models

DeepSeek R1 (Replicate)·Llama 3.1 405b·Llama 3.1 405B (Replicate)·Llama 3.1 70b·Mixtral 8x7B

SambaNova Cloud

5 models

Llama 3.3 70B (SambaNova)·Qwen 2.5 72B (SambaNova)·Llama 3.1 405B (SambaNova)·DeepSeek R1 (SambaNova)·DeepSeek V3 (SambaNova)

StepFun

1 model

Together.AI

10 models

Llama 4 Maverick (Together)·Llama 4 Scout (Together)·DeepSeek R1 (Together)·Llama 3.3 70B (Together)·Llama 3.1 405b·Llama 3.1 405B (Together)·Llama 3.1 70b·Mixtral 8x7B·DeepSeek V3 (Together)·Qwen 2.5 72B (Together)

Writer

8 models

Palmyra X 004·Palmyra Fin 32k·Palmyra Med 32k·Palmyra X 002·Palmyra X 002 32k·Palmyra X 003·Palmyra X 32k·Palmyra X5

Xiaomi

1 model

Z AI

9 models

GLM-5.1·GLM-5·GLM-5 Code·GLM-4.7-Flash·GLM-4.7·GLM-4.6·GLM-4.5V·GLM-4.5·GLM-4.5 Flash

Best AI for Math 2026 — LLM Math Leaderboard & Rankings | AnotherWrapper