Return Measure

Cost Per Million Tokens: What It Really Means for Your App

LLM API providers quote prices per million tokens. Translating that into actual monthly cost requires understanding what counts as a token.

Every major large-language-model provider (OpenAI, Anthropic, Google, and the services that host open-source models) prices its API the same way: dollars per million tokens of input, and a higher rate per million tokens of output. The pricing looks simple in the documentation. Translating it into a monthly bill for a real application takes some non-obvious arithmetic.

What is a token, actually?

A token is roughly three-quarters of an English word. The phrase "Return Measure builds zero-friction calculators" is 6 words (counting the hyphenated pair as two) and roughly 8 tokens. A 500-word email is about 650 tokens. A typical web page's worth of context is 1,500 to 3,000 tokens. A whole book chapter is 10,000 to 20,000 tokens.
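The rule of thumb can be written as a one-line estimator. This is a minimal sketch, not a tokenizer: the function name and the 0.75 words-per-token constant are assumptions taken from the heuristic, and real counts vary by model and by text.

```python
import math

# Rule-of-thumb ratio for English text (an assumption; real tokenizers vary).
WORDS_PER_TOKEN = 0.75


def estimate_tokens(word_count: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Rough token estimate from an English word count, rounded up."""
    return math.ceil(word_count / words_per_token)


for label, words in [("6-word phrase", 6), ("500-word email", 500)]:
    print(f"{label}: ~{estimate_tokens(words)} tokens")
```

For budgeting, an estimate like this is usually close enough. For exact counts, run the text through the tokenizer of the model you plan to call.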

For non-English languages, the ratio is different — sometimes much worse. Languages with rich morphology (German, Russian, Korean) often produce 30 to 50 percent more tokens per word than English. For some Asian languages, individual characters can be one or more tokens each, which dramatically inflates the cost.

Input versus output pricing

Output tokens are typically 3 to 5 times more expensive than input tokens. This matters because of how most LLM applications are structured: a long system prompt and conversation history go in (lots of input tokens), and a short response comes out (few output tokens). Engineers new to LLM pricing often assume input cost dominates because there is so much more of it. In a typical chat application, that turns out to be roughly correct. In a long-form generation app, where a short prompt produces pages of text, output cost dominates quickly.
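A quick way to see which side dominates is to compute the output share of a single request's cost. The sketch below uses illustrative prices of $2.50 and $10 per million tokens (a 4x ratio). The token counts for the two request shapes are hypothetical, chosen only to contrast an input-heavy call with an output-heavy one.

```python
def cost_split(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return (input_cost, output_cost, output_share) for one request, in dollars."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    total = input_cost + output_cost
    output_share = output_cost / total if total else 0.0
    return input_cost, output_cost, output_share


# Hypothetical request shapes: (input tokens, output tokens).
shapes = {
    "chat turn": (3_000, 300),
    "long-form generation": (300, 1_500),
}

for name, (tokens_in, tokens_out) in shapes.items():
    _, _, share = cost_split(tokens_in, tokens_out, 2.50, 10.00)
    print(f"{name}: output is {share:.0%} of the request cost")
```

With these numbers, output accounts for under a third of the chat turn's cost and well over ninety percent of the generation request's cost.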

Building a real monthly estimate

To estimate a monthly LLM bill honestly, you need four numbers:

  1. The average input tokens per request (system prompt plus any retrieved context plus the user message).
  2. The average output tokens per request.
  3. The number of requests per active user per month.
  4. The number of active users.

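For example, take a customer-support bot with a 2,000-token system prompt, an average 200-token user question, 400 tokens of output, 30 conversations per active user per month, and 5,000 active users. Treating each conversation as a single request, that is 150,000 requests a month at 2,200 input tokens and 400 output tokens each. At GPT-4-class pricing (roughly $2.50 per million input tokens and $10 per million output tokens), the monthly cost is approximately $1,425: $825 for input and $600 for output.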
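The four inputs combine into a short calculation. The sketch below is one way to write it. The `Workload` class and `monthly_cost` function are illustrative names, and the code assumes each conversation is a single request.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    input_tokens_per_request: int
    output_tokens_per_request: int
    requests_per_user_per_month: int
    active_users: int


def monthly_cost(w: Workload, input_price_per_m: float, output_price_per_m: float) -> float:
    """Monthly API spend in dollars, given per-million-token prices."""
    requests = w.requests_per_user_per_month * w.active_users
    input_cost = requests * w.input_tokens_per_request / 1_000_000 * input_price_per_m
    output_cost = requests * w.output_tokens_per_request / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Support-bot example: 2,000-token system prompt plus a 200-token question,
# 400 tokens of output, 30 conversations per user, 5,000 users.
support_bot = Workload(2_000 + 200, 400, 30, 5_000)
print(f"${monthly_cost(support_bot, 2.50, 10.00):,.2f}")
```

With these inputs the arithmetic is 330 million input tokens ($825) plus 60 million output tokens ($600), for $1,425 a month.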

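Change two assumptions (bump the system prompt to 4,000 tokens because of a richer knowledge base, and add a second turn of conversation per session) and the same app costs roughly $4,600 a month if the second turn resends the first exchange as history, or about $4,350 if it does not. Either way the system prompt is billed again on every turn, which is why input cost climbs so fast.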
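Multi-turn conversations are where estimates drift, because each turn typically resends the system prompt and, in most chat designs, the earlier exchange as well. The sketch below counts billed tokens per conversation under that assumption. The function name and the `resend_history` flag are illustrative, and prompt caching or history truncation would change the result.

```python
def conversation_tokens(system_prompt, user_message, reply, turns, resend_history=True):
    """Total (input, output) tokens billed for one conversation of `turns` turns."""
    total_input = 0
    history = 0  # tokens from earlier turns that get resent as context
    for _ in range(turns):
        total_input += system_prompt + history + user_message
        if resend_history:
            history += user_message + reply
    return total_input, reply * turns


conversations = 30 * 5_000  # conversations per user per month x active users
for resend in (True, False):
    tokens_in, tokens_out = conversation_tokens(4_000, 200, 400, turns=2, resend_history=resend)
    cost = conversations * (tokens_in * 2.50 + tokens_out * 10.00) / 1_000_000
    print(f"resend_history={resend}: ${cost:,.0f} per month")
```

With a 4,000-token system prompt and two turns, one conversation bills 9,000 input tokens when history is resent and 8,400 when it is not. Both cases bill 800 output tokens.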

The costs nobody mentions

Our AI Compute Cost Calculator models monthly LLM API spend across the major providers using these inputs.
