See exactly what GPT-4o, Claude, Gemini, and 200+ models cost at official rates, then how much less you'd pay routing each request through doteb's cheapest provider.
Official (OpenAI)
$0.15/M in · $0.60/M out
doteb (OpenAI)
$0.15/M in · $0.60/M out
Official price
$0.2100
direct from providers
doteb
$0.2100
cheapest provider per model, no markup
You save
$0.00
same price, more features
Share these results
Send this breakdown to your team or post it in one click.
| model | Tokens | Official | doteb | Saved |
|---|---|---|---|---|
GPT-4o Mini via OpenAI | 1.0M in 100K out | $0.2100 | $0.2100 | — |
| Total | $0.2100 | $0.2100 | — | |
Smart Routing
Automatically routes to the cheapest available provider for each model
No Markup
Zero platform fees - you pay exactly what the provider charges
Automatic Failover
If one provider is down, requests automatically route to the next cheapest
Processing high volume?
Enterprise plans include volume discounts, dedicated support, custom SLAs, and extended data retention.
Start for free with no platform fees. No credit card required.
Estimate your monthly spend across any model in three steps, then see how much routing through doteb saves you.
Pick from 200+ LLMs across OpenAI, Anthropic, Google, Mistral, Meta, and more. Stack as many models as you want to compare side by side.
Set the input and output tokens you expect to process, or start from a Light, Medium, Heavy, or Intensive preset and adjust from there.
Instantly see official provider pricing next to doteb's cheapest routed provider, plus the exact amount you would save each cycle.
Every large language model bills by the token, the small chunks of text a model reads and writes. Roughly speaking, one token is about four characters of English, so 1,000 tokens is around 750 words. Providers quote prices per million tokens, and they charge separately for the tokens you send (input) and the tokens the model generates (output).
Output tokens are usually two to four times more expensive than input tokens, so the ratio between your prompt size and response size has a big impact on your bill. A summarization workload that reads a lot and writes a little costs very differently from a code-generation workload that writes long responses. The calculator above keeps the two separate so your estimate reflects how you actually use each model.
Prices also vary by provider. A single popular model is often hosted by several providers at different rates, and those rates change as providers compete on price. Instead of locking yourself into one provider, doteb routes each request to the cheapest available provider for that model through one OpenAI-compatible API, with no platform markup. That is the gap the calculator shows: the official list price versus the lowest live price you would actually pay.
Use it to budget a new feature, compare GPT-4o against Claude or Gemini before you commit, or build the business case for switching providers. When the numbers look good, you can start for free and keep the same estimate in production.
Everything you need to know about estimating and lowering your LLM token costs.
Input tokens are everything you send to the model, including your prompt, system message, and conversation history. Output tokens are what the model generates back. Output tokens almost always cost more than input tokens, which is why the split matters when you estimate spend.
Popular models are often served by several providers at different rates, and prices change as providers compete. doteb routes each request to the cheapest available provider for that model, so you pay the lowest live rate without changing any code.
No. doteb shows provider pricing without a platform markup. You pay for model usage and can route to the cheapest provider available for each request.
Route through a gateway that compares providers and picks the lowest price per request. Because doteb supports 200+ models behind one OpenAI-compatible API, you can switch models or providers based on cost without rewriting your integration.