Leaderboard

    Best AI for Math 2026.

    Find the best AI for math. Ranked by AIME 2025, MATH-500, FrontierMath, and GPQA Diamond. Compare mathematical reasoning across all major LLMs.

    Kimi K2-Thinking-0905 [OSS] | Moonshot AI | - | - | - | 100% | - | - | -
    gpt-5.2-pro | OpenAI | 87.9% | - | 8.7% | 100% | - | - | 100%
    gpt-5.2 | OpenAI | 85.4% | 40.3% | 7.6% | 100% | - | - | -
    Gemini 3 Pro Preview | Google | 91.9% | - | - | 100% | - | - | 23.4%
    Claude Opus 4.6 | Anthropic | - | - | - | 99.8% | - | - | -
    Gemini 3 Flash Preview | Google | - | - | - | 99.7% | - | - | -
    LongCat-Flash-Thinking-2601 [OSS] | Meituan | - | - | - | 99.6% | - | - | -
    GPT-5.1 High | OpenAI | - | - | - | 99.6% | - | - | -
    Nemotron 3 Nano (30B A3B) [OSS] | NVIDIA | - | - | - | 99.2% | - | - | -
    Kimi K2 Thinking | Moonshot AI | 84.5% | - | - | 99.1% | - | - | -
    GPT OSS 20B High [OSS] | OpenAI | - | - | - | 98.7% | - | - | -
    o3 | OpenAI | 83.3% | 15.8% | 0% | 98.4% | 91.6% | - | -
    GPT-5.1 Medium | OpenAI | - | - | - | 98.4% | - | - | -
    Step-3.5-Flash [OSS] | StepFun | - | - | - | 97.3% | - | - | -
    GPT-5.1 Codex High | OpenAI | - | - | - | 96.7% | - | - | -
    Kimi K2.5 [OSS] | Moonshot AI | - | - | - | 96.1% | - | - | -
    GLM-4.7 [OSS] | Z AI | - | - | - | 95.7% | - | - | 84.9%
    GPT-5 High | OpenAI | - | - | - | 94.6% | - | - | -
    gpt-5 | OpenAI | 87.3% | 26.3% | - | 94.6% | - | - | 99.6%
    MiMo-V2-Flash [OSS] | Xiaomi | - | - | - | 94.1% | - | - | -
    GPT-5.1 Thinking | OpenAI | - | 26.7% | - | 94% | - | - | -
    GPT-5.1 Instant | OpenAI | - | 26.7% | - | 94% | - | - | -
    gpt-5.1 | OpenAI | 88.1% | 26.7% | - | 94% | - | - | 94%
    GLM-4.6 [OSS] | Z AI | - | - | - | 93.9% | - | - | -
    Grok 3 | xAI | 84.6% | - | - | 93.3% | 93.3% | - | -
    K-EXAONE-236B-A23B | LG AI Research | - | - | - | 92.8% | - | - | -
    o4-mini | OpenAI | 81.4% | - | - | 92.7% | 93.4% | - | -
    GPT OSS 120B High [OSS] | OpenAI | - | - | - | 92.5% | - | - | -
    Qwen3-235B-A22B-Thinking-2507 [OSS] | Qwen | - | - | - | 92.3% | - | - | -
    Grok 4 Fast | xAI | - | - | - | 92% | - | - | -
    Grok 4 | xAI | 87.5% | - | - | 91.7% | 94% | - | -
    GLM-4.7-Flash [OSS] | Z AI | - | - | - | 91.6% | - | - | -
    Mercury 2 | Inception | - | - | - | 91.1% | - | - | -
    gpt-5-mini | OpenAI | - | 22.1% | - | 91.1% | - | - | 22.1%
    Grok 3 Mini | xAI | - | - | - | 90.8% | - | - | 95.8%
    LongCat-Flash-Thinking [OSS] | Meituan | - | - | - | 90.6% | - | - | -
    Qwen3 VL 235B A22B Thinking [OSS] | Qwen | - | - | - | 89.7% | - | - | -
    DeepSeek-V3.2-Exp [OSS] | DeepSeek | - | - | - | 89.3% | - | - | -
    GPT-5 Medium | OpenAI | - | - | - | 88.9% | - | - | -
    Gemini 2.5 Pro Preview 06-05 | Google | - | - | - | 88% | - | - | -
    Qwen3-Next-80B-A3B-Thinking [OSS] | Qwen | - | - | - | 87.8% | - | - | -
    DeepSeek-R1-0528 [OSS] | DeepSeek | - | - | - | 87.5% | - | - | -
    Claude Sonnet 4.5 | Anthropic | 83.4% | - | - | 87% | - | - | -
    gpt-5-nano | OpenAI | - | 9.6% | - | 85.2% | - | - | -
    Ministral 3 (14B Reasoning 2512) [OSS] | Mistral | - | - | - | 85% | - | - | -
    Qwen3 VL 30B A3B Thinking [OSS] | Qwen | - | - | - | 83.1% | - | - | -
    Gemini 2.5 Pro | Google | 86.4% | - | - | 83% | 92% | - | 97.1%
    Qwen3 Max | Qwen | - | - | - | 81.6% | - | - | -
    Qwen 3 235B A22B [OSS] | Qwen | - | - | - | 81.5% | - | - | -
    MiniMax M2.1 [OSS] | MiniMax | - | - | - | 81% | - | - | -
    Showing 1–50 of 336 models

    All Large Language Models

    Perplexity: 5 models
    MiniMax: 4 models
    01.ai: 1 model
    AI21 Labs: 2 models
    Anyscale: 2 models
    Baidu: 1 model
    Cohere: 4 models
    Inception: 1 model
    LG AI Research: 1 model
    Nous Research: 1 model
    Reka: 3 models
    StepFun: 1 model
    Xiaomi: 1 model
    Z AI: 8 models