Leaderboard

    Best AI for Agentic Tasks 2026.

    Find the best AI for agentic tasks and tool use. Models are ranked by OSWorld, ToolAthlon, MCP Atlas, tau-bench, BrowseComp, and other agent benchmarks.

    Claude Opus 4.6
    Anthropic
    84% | - | 72.7% | - | - | 59.5% | - | - | -
    Claude Sonnet 4.6
    Anthropic
    74.7% | - | 72.5% | - | - | 61.3% | - | - | -
    Qwen3 VL 235B A22B Instruct (OSS)
    Qwen
    - | 62% | 66.7% | - | - | - | - | - | -
    Claude Opus 4.5
    Anthropic
    - | - | 66.3% | - | - | 62.3% | - | - | -
    Claude Sonnet 4.5
    Anthropic
    - | - | 61.4% | - | - | - | 86.2% | - | -
    Claude Haiku 4.5
    Anthropic
    - | - | 50.7% | - | - | - | - | - | -
    Qwen3 VL 235B A22B Thinking (OSS)
    Qwen
    - | 61.8% | 38.1% | - | - | - | - | - | -
    Qwen3 VL 8B Thinking (OSS)
    Qwen
    - | 46.6% | 33.9% | - | - | - | - | - | -
    Qwen3 VL 8B Instruct (OSS)
    Qwen
    - | 54.6% | 33.9% | - | - | - | - | - | -
    Qwen3 VL 4B Thinking (OSS)
    Qwen
    - | 49.2% | 31.4% | - | - | - | - | - | -
    Qwen3 VL 30B A3B Thinking (OSS)
    Qwen
    - | 57.3% | 30.6% | - | - | - | - | - | -
    Qwen3 VL 30B A3B Instruct (OSS)
    Qwen
    - | 60.5% | 30.3% | - | - | - | - | - | -
    Qwen3 VL 4B Instruct (OSS)
    Qwen
    - | 59.5% | 26.2% | - | - | - | - | - | -
    Amazon Nova 2 Lite
    Amazon
    - | - | - | - | - | - | - | - | -
    Amazon Nova 2 Omni
    Amazon
    - | - | - | - | - | - | - | - | -
    Amazon Nova 2 Pro
    Amazon
    - | - | - | - | - | - | - | - | -
    Amazon Nova Lite
    Amazon
    - | - | - | - | - | - | - | - | -
    Amazon Nova Micro
    Amazon
    - | - | - | - | - | - | - | - | -
    Amazon Nova Premier
    Amazon
    - | - | - | - | - | - | - | - | -
    Amazon Nova Pro
    Amazon
    - | - | - | - | - | - | - | - | 68.4%
    Claude 2
    Anthropic
    - | - | - | - | - | - | - | - | -
    Claude 2.1
    Anthropic
    - | - | - | - | - | - | - | - | -
    Claude 3 Haiku
    Anthropic
    - | - | - | - | - | - | - | - | -
    Claude 3 Opus
    Anthropic
    - | - | - | - | - | - | - | - | -
    Claude 3 Sonnet
    Anthropic
    - | - | - | - | - | - | - | - | -
    Claude 3.5 Haiku
    Anthropic
    - | - | - | - | - | - | 51% | - | 54.3%
    Claude 3.5 Sonnet
    Anthropic
    - | - | - | - | - | - | 69.2% | - | 56.5%
    Claude Haiku 3
    Anthropic
    - | - | - | - | - | - | - | - | -
    Claude Haiku 3.5
    Anthropic
    - | - | - | - | - | - | - | - | -
    Claude Instant 1.2
    Anthropic
    - | - | - | - | - | - | - | - | -
    Claude Opus 3
    Anthropic
    - | - | - | - | - | - | - | - | -
    Claude Opus 4
    Anthropic
    - | - | - | - | - | - | 81.4% | - | -
    Claude Opus 4.1
    Anthropic
    - | - | - | - | - | - | 82.4% | - | -
    Claude Sonnet 3.5
    Anthropic
    - | - | - | - | - | - | - | - | -
    Claude Sonnet 3.7
    Anthropic
    - | - | - | - | - | - | 81.2% | - | 58.3%
    Claude Sonnet 4
    Anthropic
    - | - | - | 38.6% | - | - | 80.5% | - | -
    Codestral
    Mistral
    - | - | - | - | - | - | - | - | -
    Command A
    Cohere
    - | - | - | - | - | - | - | - | -
    Command R
    Cohere
    - | - | - | - | - | - | - | - | -
    Command R+ (OSS)
    Cohere
    - | - | - | - | - | - | - | - | -
    Command R7B
    Cohere
    - | - | - | - | - | - | - | - | -
    computer-use-preview
    OpenAI
    - | - | - | - | - | - | - | - | -
    DeepSeek R1 (OSS)
    DeepSeek
    - | - | - | - | - | - | - | - | 57.5%
    DeepSeek R1 (Cerebras)
    Cerebras
    - | - | - | - | - | - | - | - | -
    DeepSeek R1 (DeepInfra)
    DeepInfra
    - | - | - | - | - | - | - | - | -
    DeepSeek R1 (Fireworks)
    Fireworks
    - | - | - | - | - | - | - | - | -
    DeepSeek R1 (Groq)
    Groq
    - | - | - | - | - | - | - | - | -
    DeepSeek R1 (Hyperbolic)
    Hyperbolic
    - | - | - | - | - | - | - | - | -
    DeepSeek R1 (Nebius)
    Nebius AI Studio
    - | - | - | - | - | - | - | - | -
    DeepSeek R1 (Replicate)
    Replicate
    - | - | - | - | - | - | - | - | -
    Showing 1–50 of 336 models
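The rows above appear to be ordered by the third score column, highest first, with models that have no score in that column following in alphabetical order. The following is a minimal Python sketch of that ordering, assuming this reading of the listing is right. The `rows` sample and the `rank` helper are illustrative names rather than anything taken from the leaderboard itself, and the listing does not say which benchmark that column holds.

```python
# Sample rows: (model, score in the third score column).
# None stands for a row that shows no score in that column.
rows = [
    ("Claude Haiku 4.5", 50.7),
    ("Amazon Nova Pro", None),
    ("Claude Opus 4.6", 72.7),
    ("Qwen3 VL 235B A22B Instruct", 66.7),
    ("Claude 2", None),
    ("Claude Sonnet 4.6", 72.5),
]


def rank(rows):
    """Scored rows first, highest score first; unscored rows last, by name."""
    return sorted(
        rows,
        key=lambda r: (r[1] is None, -(r[1] or 0.0), r[0].lower()),
    )


ranked = rank(rows)
for name, score in ranked:
    cell = "-" if score is None else f"{score}%"
    print(f"{name:30} {cell}")
```

The sort key is a tuple, so the missing-score flag is compared first and the model name only breaks ties. How the site itself orders rows with equal scores is not evident from the listing.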

    All Large Language Models

    Perplexity: 5 models
    MiniMax: 4 models
    01.ai: 1 model
    AI21 Labs: 2 models
    Anyscale: 2 models
    Baidu: 1 model
    Cohere: 4 models
    Inception: 1 model
    LG AI Research: 1 model
    Nous Research: 1 model
    Reka: 3 models
    StepFun: 1 model
    Xiaomi: 1 model
    Z AI: 8 models