No, LLM costs are not going up!
I often hear flat-out false statements about the state of LLM economics, one of them being that the costs are "unsustainable". This is blatantly false and easy to refute with just a few minutes of thinking.
Of course, I can never know what really goes on inside companies like OpenAI and Anthropic, but I feel I can make good guesses. I'll lay out a few proxies that support my point.
Note that I'm not talking about the profitability of these companies. I'm only talking about consumer prices, the real costs of running these models, and how they trend.
I also agree that the price to run the latest frontier models tends to go up over time, but that is a different question. I'm focusing instead on the price to achieve the same performance: is it going up or down over time?
Cost per capability has been decreasing
Epoch AI's analysis makes the point unambiguously: the price has gone down dramatically. Quoting from it:

"There ..., which could mean that trends in price per token differ from trends in evaluation cost. However, our investigations found that evaluation costs have declined similarly to prices per token. We used Epoch AI benchmark evaluation results for this analysis, as these report the number of tokens processed for each evaluation. For the six cost trends we estimated, the largest difference was 200x per year for evaluation cost vs. 400x per year for price per token, for Claude-3.5-Sonnet-2024-06 level performance on GPQA Diamond."
The price to achieve a fixed "capability" (like solving a certain benchmark) has gone down year over year by 9x to 900x, depending on the benchmark. Note that we are fixing the capability here, not the tokens: the new model and the old model may use different numbers of tokens, but the total price to reach the same score has decreased.
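To make the "per year" framing concrete, here is a minimal sketch of how an annualized decline factor is computed from two cost observations. The function name and the sample numbers are my own illustration, not Epoch AI's data.

```python
def annual_decline_factor(cost_old: float, cost_new: float, years: float) -> float:
    """Factor by which cost falls per year, given two observations
    of the cost to reach the same benchmark score, `years` apart."""
    return (cost_old / cost_new) ** (1 / years)

# Illustrative: a benchmark run that cost $40 one year ago and $0.10 today
# corresponds to a 400x-per-year decline.
print(annual_decline_factor(40.0, 0.10, 1.0))  # 400.0

# The same 400x total drop spread over two years is 20x per year.
print(annual_decline_factor(40.0, 0.10, 2.0))  # 20.0
```

Note that the factor compounds geometrically, which is why headline numbers like "200x per year" are so dramatic.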
A fair challenge to this is: do we actually know whether the API prices are subsidised? I will cover this below. Spoiler alert: no, they are not.
The open weight model phenomenon
There are lots of open-weight models hosted on openrouter.ai. A reasonable assumption can be made here: the models hosted by independent providers are not subsidised. They have no incentive to subsidise because there is literally no moat; their pricing is as close as you can get to the true cost of hosting these models.
Go to this Deepseek Openrouter link and check out all the providers. The actual pricing of this model is $0.27 / 1M input tokens and $1.10 / 1M output tokens. This is substantially lower than GPT 4o, which is $1.25 / 1M input and $10.00 / 1M output. Benchmarks show that Deepseek v3.2 is unambiguously better than 4o.
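To see what those per-million-token prices mean for a single request, here is a small sketch. The request size (2,000 input tokens, 1,000 output tokens) is a hypothetical example I chose; the prices are the ones quoted above.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 price_in_per_m: float, price_out_per_m: float) -> float:
    """Dollar cost of one request, given per-million-token prices."""
    return (input_tokens / 1e6) * price_in_per_m + (output_tokens / 1e6) * price_out_per_m

# Hypothetical request: 2,000 input tokens, 1,000 output tokens.
deepseek = request_cost(2000, 1000, 0.27, 1.10)   # Deepseek on Openrouter
gpt4o    = request_cost(2000, 1000, 1.25, 10.00)  # GPT 4o
print(f"Deepseek: ${deepseek:.4f}, GPT 4o: ${gpt4o:.4f}, ratio: {gpt4o/deepseek:.1f}x")
# Deepseek: $0.0016, GPT 4o: $0.0125, ratio: 7.6x
```

The exact ratio depends on the input/output mix of your workload, since output tokens dominate both prices, but Deepseek comes out several times cheaper at any realistic mix.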
ARC AGI
The ARC AGI benchmark is a very good source for understanding how prices have gone down on a specific task. If you look at the ARC AGI 2 results (already saturated), you'll see that new models beat old models on both performance and cost.
As an example: GPT 5.2 pro scores ~52% at ~$15, while GPT 5.4-medium scores 51.9% at around $0.68. That's a ~22x cost reduction for almost the same performance! You can check it out yourself to verify that I'm not cherry-picking data points.
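The reduction factor is just the ratio of the two costs quoted above, which a one-liner verifies:

```python
# Cost-per-score reduction at roughly equal ARC AGI 2 scores.
old_cost = 15.00   # GPT 5.2 pro, ~52%
new_cost = 0.68    # GPT 5.4-medium, ~51.9%
reduction = old_cost / new_cost
print(f"{reduction:.1f}x cheaper")  # 22.1x cheaper
```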
The Gemma 4 31B phenomenon
I won't present facts or evidence here, purely vibes, because vibes are more than enough at this point. I can run the new open model Gemma 4 31B on my laptop today. It outshines GPT 4o on almost every benchmark I care about. And I'm not only talking about benchmarks: on vibes alone, it is clear that Gemma 4 31B is more useful and more intelligent. Give it a try if you can!
A year ago you would never have thought you could run something as powerful as 4o on your laptop, but here we are. If this is not a signal that costs are falling, what is?
Conclusion
Using public API prices, I have shown that prices are going down. Using Deepseek as an example, I showed that those prices are not subsidised. The ARC AGI 2 benchmark shows prices dropping ~22x for the same score. Finally, I'm asking you to try Gemma 4 because it is quite honestly impressive!
After all this, it should be clear that the fundamental cost to run these models is going down. This shouldn't surprise anyone who has watched how technology prices trend: almost all technologies get cheaper over time.