AI inference, priced live.
What every major model charges, ranked at your token mix. Every number on this page is fetched from LiteLLM on load. If LiteLLM doesn’t have a model, it isn’t shown.
What You Get
More than a simple price lookup.
Live rates, not last quarter's
Every price is fetched from LiteLLM on page load. If LiteLLM doesn't index a model, we don't display it. No static fallbacks. No made-up numbers.
Ranked at your token mix
Headline rates are misleading when output dominates your bill. Sort by per-request cost at your real input/output ratio to see which model is actually cheapest for you.
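The ranking math behind that claim is simple: per-request cost is input tokens times the input rate plus output tokens times the output rate. A minimal sketch, using made-up prices shaped like LiteLLM's per-token fields (`input_cost_per_token`, `output_cost_per_token`); the model names and numbers are illustrative, not real quotes:

```typescript
// Hypothetical per-token prices (USD), shaped like LiteLLM's
// input_cost_per_token / output_cost_per_token fields.
interface ModelPrice {
  name: string;
  inputCostPerToken: number;
  outputCostPerToken: number;
}

// Per-request cost at a given input/output token mix.
function perRequestCost(m: ModelPrice, inTokens: number, outTokens: number): number {
  return inTokens * m.inputCostPerToken + outTokens * m.outputCostPerToken;
}

// Rank cheapest-first at the caller's mix, not at the headline input rate.
function rankAtMix(models: ModelPrice[], inTokens: number, outTokens: number): ModelPrice[] {
  return [...models].sort(
    (a, b) => perRequestCost(a, inTokens, outTokens) - perRequestCost(b, inTokens, outTokens)
  );
}

// Illustrative prices only:
const models: ModelPrice[] = [
  { name: "model-a", inputCostPerToken: 1e-6, outputCostPerToken: 10e-6 }, // cheap in, pricey out
  { name: "model-b", inputCostPerToken: 2e-6, outputCostPerToken: 4e-6 },  // balanced
];

// Output-heavy mix: 1k in, 5k out. model-b wins despite the higher input rate.
const ranked = rankAtMix(models, 1000, 5000);
console.log(ranked[0].name); // "model-b"
```

At an input-only mix the same sort would flip and favor model-a, which is exactly why headline input rates mislead.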
Capabilities, sourced
Reasoning support, context window, and provider key all read from LiteLLM. The Source column shows exactly which provider key each price came from.
Shareable
Copy a link to share your exact configuration. Present the numbers to your team or stakeholders.
How It Works
- 01
Read the live price grid
Every model LiteLLM indexes - sortable by cheapest input, cheapest output, largest context, or cheapest for your token mix. Click any row to set it as primary.
- 02
Plug in your workload
Pick a scenario for typical token volumes, or type your own. Toggle up to 4 comparison models from the grid to see them side-by-side.
- 03
See where cost lives
Input vs output split, top-driver breakdown, and the cheapest alternative in your comparison set - all derived from your actual config and live prices.
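The input/output split and top-driver call can be sketched in a few lines. This is a hypothetical illustration of the arithmetic, not the tool's implementation; the rates below are placeholders:

```typescript
// Split one request's cost into input vs output share and name the
// dominant driver. Rates are hypothetical per-token prices in USD.
function costBreakdown(
  inputCostPerToken: number,
  outputCostPerToken: number,
  inTokens: number,
  outTokens: number
) {
  const input = inTokens * inputCostPerToken;
  const output = outTokens * outputCostPerToken;
  const total = input + output;
  return {
    inputShare: input / total,
    outputShare: output / total,
    topDriver: input >= output ? "input tokens" : "output tokens",
  };
}

// 1k input tokens at $2/M, 5k output tokens at $8/M: output dominates.
const b = costBreakdown(2e-6, 8e-6, 1000, 5000);
console.log(b.topDriver); // "output tokens"
```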
Go Deeper
The thinking behind the numbers.
The Inference Tax
Why your GenAI budget is hiding 80% of its real cost. The infrastructure iceberg beneath every API call.
Read article

Blog
The Hidden Tax of AI
What CFOs aren’t seeing in their AI investments. The costs that compound before anyone notices.
Read article

Checklist
AI Cost Audit Checklist
30 questions every CFO and VP should ask before the next board meeting. Audit inference economics end-to-end.
Get the checklist
Spending more than you should?
Let's find where your cloud and AI spend can work harder.