AI inference, priced live.

What every major model charges, ranked at your token mix. Every number on this page is fetched from LiteLLM on load. If LiteLLM doesn’t have a model, it isn’t shown.

What You Get

More than a simple price lookup.

Live rates, not last quarter's

Every price is fetched from LiteLLM on page load. If LiteLLM doesn't index a model, we don't display it. No static fallbacks. No made-up numbers.
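Under the hood, that load step is a single fetch of LiteLLM's public pricing file. A minimal sketch in TypeScript - the URL and field names match LiteLLM's published model_prices_and_context_window.json, while loadLivePrices and the chat-only filter are illustrative, not this page's actual code:

```typescript
// Minimal sketch of the page-load fetch. URL and field names match
// LiteLLM's published pricing file; the function name and the
// chat-only filter are our illustration.
const LITELLM_PRICES_URL =
  "https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json";

interface LiteLLMEntry {
  input_cost_per_token?: number;  // USD per single input token
  output_cost_per_token?: number; // USD per single output token
  max_input_tokens?: number;      // context window size
  litellm_provider?: string;      // provider key the price comes from
  supports_reasoning?: boolean;   // not every entry sets this
  mode?: string;                  // "chat", "embedding", ...
}

async function loadLivePrices(): Promise<Map<string, LiteLLMEntry>> {
  const res = await fetch(LITELLM_PRICES_URL);
  if (!res.ok) throw new Error(`LiteLLM fetch failed: ${res.status}`);
  const raw: Record<string, LiteLLMEntry> = await res.json();
  // If LiteLLM doesn't index a model (or carries no price for it),
  // it never enters the grid - there is no static fallback.
  return new Map(
    Object.entries(raw).filter(
      ([, m]) => m.mode === "chat" && m.input_cost_per_token != null
    )
  );
}
```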

Ranked at your token mix

Headline rates mislead when output dominates your bill. Sort by per-request cost at your real input/output ratio to see who's actually cheap for you.
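The ranking metric behind this is nothing more than a blended per-request cost. A hedged sketch - perRequestCost and the example rates are illustrative:

```typescript
// Per-request cost at a given token mix. Rates are USD per token,
// straight from LiteLLM's input_cost_per_token / output_cost_per_token.
function perRequestCost(
  inputTokens: number,
  outputTokens: number,
  inputRate: number,
  outputRate: number
): number {
  return inputTokens * inputRate + outputTokens * outputRate;
}

// Example mix: 2,000 input / 500 output tokens. A cheap headline
// input rate can still lose once output dominates.
// Model A: $0.50 in / $10.00 out per 1M tokens
// Model B: $1.00 in / $4.00 out per 1M tokens
const a = perRequestCost(2000, 500, 0.5e-6, 10e-6); // $0.006
const b = perRequestCost(2000, 500, 1e-6, 4e-6);    // $0.004 - B wins
```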

Capabilities, sourced

Reasoning support, context window, and provider key are all read from LiteLLM. The Source column shows exactly which provider key each price came from.
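Roughly how a grid row is assembled from one LiteLLM entry - the field names are LiteLLM's, while toRow and the row shape are our illustration:

```typescript
// Capability columns read straight off the LiteLLM entry.
type Entry = {
  supports_reasoning?: boolean; // absent on many entries
  max_input_tokens?: number;
  litellm_provider?: string;
};

function toRow(name: string, m: Entry) {
  return {
    model: name,
    reasoning: m.supports_reasoning ?? false, // reasoning support
    context: m.max_input_tokens ?? 0,         // context window
    source: m.litellm_provider ?? "unknown",  // the Source column
  };
}
```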

Shareable

Copy a link to share your exact configuration. Present the numbers to your team or stakeholders.
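Sharing works the way you'd expect from a live page: the whole configuration fits in the query string, so the exact view can be rebuilt from the URL alone. A sketch - the parameter names are illustrative, not the page's actual URL schema:

```typescript
// Hedged sketch of a shareable config link: encode the token mix and
// the selected models as query parameters.
function shareLink(params: {
  primary: string;
  compare: string[];
  inputTokens: number;
  outputTokens: number;
}): string {
  const q = new URLSearchParams({
    m: params.primary,
    c: params.compare.join(","),
    in: String(params.inputTokens),
    out: String(params.outputTokens),
  });
  return `${location.origin}${location.pathname}?${q}`;
}
```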

How It Works

  1. Read the live price grid

     Every model LiteLLM indexes - sortable by cheapest input, cheapest output, largest context, or cheapest for your token mix. Click any row to set it as primary.

  2. Plug in your workload

     Pick a scenario for typical token volumes, or type your own. Toggle up to 4 comparison models from the grid to see them side by side.

  3. See where cost lives

     Input vs output split, top-driver breakdown, and the cheapest alternative in your comparison set - all derived from your actual config and live prices. A sketch of that math follows this list.
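Step 3's derivation fits in a few lines. A sketch under the same assumptions as above - breakdown and its shapes are illustrative, while the rates come from the live LiteLLM data:

```typescript
// Split each model's per-request cost into input vs output and find
// the cheapest model in the comparison set.
type Rates = { inputRate: number; outputRate: number }; // USD per token

function breakdown(
  mix: { inputTokens: number; outputTokens: number },
  models: Map<string, Rates>
) {
  const costs = [...models].map(([name, r]) => {
    const input = mix.inputTokens * r.inputRate;
    const output = mix.outputTokens * r.outputRate;
    return { name, input, output, total: input + output };
  });
  costs.sort((x, y) => x.total - y.total);
  return {
    cheapest: costs[0], // the cheapest alternative in the set
    // "Where cost lives": share of each bill that output tokens drive.
    outputShare: costs.map((c) => ({
      name: c.name,
      pctOutput: (100 * c.output) / c.total,
    })),
  };
}
```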


Spending more than you should?

Let's find where your cloud and AI spend can work harder.