Leaderboard

    Best AI for Writing 2026.

    Find the best AI for writing. Models are ranked by HellaSwag, MMLU, SimpleQA, and other language understanding benchmarks. Compare writing and language quality across all major LLMs.

    Claude 3.5 Sonnet | Anthropic | - | 67.2% | - | - | 90.4% | 95.4%
    Claude 3 Opus | Anthropic | - | 50.4% | - | - | 88.2% | 95.4%
    GPT-4-32k-0613 | OpenAI | - | - | - | - | 86.4% | 95.3%
    GPT-4-32k | OpenAI | - | - | - | - | 86.4% | 95.3%
    GPT-4-0613 | OpenAI | - | - | - | - | 86.4% | 95.3%
    GPT-4 | OpenAI | - | 35.7% | - | - | 86.4% | 95.3%
    GPT-4o-2024-05-13 | OpenAI | - | - | - | - | 88.7% | 95%
    GPT-4o | OpenAI | 5.3% | 70.1% | 38.2% | 81.4% | 88.7% | 95%
    Gemini 1.5 Pro | Google | - | - | - | - | 81.9% | 93.3%
    Claude 2 | Anthropic | - | - | - | - | 78.5% | 92.3%
    Mistral Large 2411 | Mistral | - | - | - | - | 84% | 92%
    Claude Sonnet 4 | Anthropic | - | 75.4% | - | 86.5% | 84% | 91.4%
    Gemini 1.5 Flash | Google | - | 51% | - | - | 29.2% | 91.2%
    Claude 2.1 | Anthropic | - | - | - | - | 78.5% | 91%
    Gemini 2.5 Pro | Google | 17.8% | 83% | 50.8% | 89.2% | 88.9% | 90.9%
    Claude Sonnet 3.5 | Anthropic | - | - | - | - | 88.7% | 90.5%
    Llama 3 70b (OSS) | Anyscale | - | - | - | - | 82% | 89.5%
    Mistral Large (OSS) | IBM Watsonx | - | - | - | - | 81.2% | 89.2%
    Mistral Large (OSS) | Mistral | - | - | - | - | 81.2% | 89.2%
    Claude 3 Sonnet | Anthropic | - | 40.4% | - | - | 81.5% | 89%
    DeepSeek V3.2 (Fireworks) | Fireworks | - | - | - | - | 87.1% | 88.9%
    DeepSeek V3 (Hyperbolic) | Hyperbolic | - | - | - | - | 88.5% | 88.9%
    Llama 3.1 405B (DeepInfra) | DeepInfra | - | - | - | - | 87% | 88.5%
    Llama 3.1 405B (Together) | Together.AI | - | - | - | - | 87% | 88.3%
    Llama 3.1 405B (SambaNova) | SambaNova Cloud | - | - | - | - | 85.2% | 88.3%
    Llama 3.1 405B (Replicate) | Replicate | - | - | - | - | 87% | 88.3%
    Llama 3.1 405B (Nebius) | Nebius AI Studio | - | - | - | - | 87% | 88.3%
    Llama 3.1 405B (Fireworks) | Fireworks | - | - | - | - | 87.4% | 88.3%
    Llama 3.1 405b (OSS) | Fireworks | - | - | - | - | 85.2% | 88.3%
    Llama 3.1 405b (OSS) | IBM Watsonx | - | - | - | - | 85.2% | 88.3%
    Llama 3.1 405b (OSS) | Together.AI | - | - | - | - | 85.2% | 88.3%
    GPT-4-turbo-1106 | OpenAI | - | - | - | - | 84.7% | 88.3%
    Qwen 2.5 72B (SambaNova) | SambaNova Cloud | - | - | - | - | 84.2% | 87.5%
    Gemini 1.5 Flash 8B | Google | - | 38.4% | - | - | 77.2% | 87.1%
    Gemini 1.5 Flash | Google | - | - | - | - | 77.2% | 87.1%
    Llama 3.1 70B (Nebius) | Nebius AI Studio | - | - | - | - | 79.3% | 86.7%
    Llama 3.1 70b (OSS) | IBM Watsonx | - | - | - | - | 83.6% | 86.7%
    Llama 3.1 70b (OSS) | Together.AI | - | - | - | - | 83.6% | 86.7%
    Llama 3.1 70b (OSS) | Replicate | - | - | - | - | 83.6% | 86.7%
    GPT-4 Turbo | OpenAI | - | 48% | - | - | 64.3% | 86.6%
    Llama 3.3 70B (Groq) | Groq | - | - | - | - | 86% | 86.5%
    Llama 3.3 70B | Meta | - | - | - | - | 86% | 86.5%
    Llama 3.1 70B | Meta | - | - | - | - | 79.3% | 86.5%
    Llama 3.3 70B (Together) | Together.AI | - | - | - | - | 86% | 86.4%
    Claude 3 Haiku | Anthropic | - | 33.3% | - | - | 76.7% | 85.9%
    GPT-3.5-turbo-16k | OpenAI | - | - | - | - | 70% | 85.5%
    GPT-3.5-turbo-0125 | OpenAI | - | - | - | - | 70% | 85.5%
    GPT-3.5-turbo | OpenAI | - | - | - | - | 70% | 85.5%
    GPT-3.5 Turbo | OpenAI | - | 30.8% | - | - | 70% | 85.5%
    Llama 4 Scout (Fireworks) | Fireworks | - | - | - | - | 79.6% | 85.3%
    Showing 1-50 of 336 models
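
The rows in this listing appear to be ordered by their final score column, highest first, with ties left in listing order. Here is a minimal sketch of that ordering, assuming each row is held as a (model, provider, scores) tuple and that a blank score is stored as `None`. The names `ROWS`, `parse_score`, and `rank_by_last_column` are hypothetical and are not part of the site. The three sample rows are copied from the listing.

```python
from typing import List, Optional, Tuple

Row = Tuple[str, str, List[Optional[float]]]

# Sample rows copied from the listing; None marks a score the listing leaves blank.
ROWS: List[Row] = [
    ("GPT-4o", "OpenAI", [5.3, 70.1, 38.2, 81.4, 88.7, 95.0]),
    ("Claude 3.5 Sonnet", "Anthropic", [None, 67.2, None, None, 90.4, 95.4]),
    ("Gemini 1.5 Pro", "Google", [None, None, None, None, 81.9, 93.3]),
]


def parse_score(cell: str) -> Optional[float]:
    """Turn a cell such as '95.4%' into 95.4; empty or '-' cells become None."""
    cell = cell.strip().rstrip("%")
    if cell in ("", "-"):
        return None
    return float(cell)


def rank_by_last_column(rows: List[Row]) -> List[Row]:
    """Sort descending by the final score; rows missing that score go last.

    sorted() is stable, so rows with equal scores keep their input order.
    """
    def key(row: Row):
        last = row[2][-1]
        return (last is None, -(last if last is not None else 0.0))

    return sorted(rows, key=key)


ranked = rank_by_last_column(ROWS)
for model, provider, scores in ranked:
    print(f"{model} ({provider}): {scores[-1]}")
```

Sorting on a `(is_missing, -score)` tuple keeps rows without a final score at the bottom instead of treating a blank as zero.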

    All Large Language Models

    Perplexity: 5 models
    MiniMax: 4 models
    01.ai: 1 model
    AI21 Labs: 2 models
    Anyscale: 2 models
    Baidu: 1 model
    Cohere: 4 models
    Inception: 1 model
    LG AI Research: 1 model
    Nous Research: 1 model
    Reka: 3 models
    StepFun: 1 model
    Xiaomi: 1 model
    Z AI: 8 models