How I Found $1,240/Month in Wasted LLM API Costs (And Built a Tool to Find Yours)

By Pyro Cascade · April 5, 2026 · 1 min read

I was spending about $2,000/month on OpenAI and Anthropic APIs across a few projects. I knew some of it was wasteful. I just couldn't prove it.

The provider dashboards show you one number: your total bill. That's like getting an electricity bill with no breakdown. Is it the AC? The lights? The server room? No idea.

So I built a tool to find out. What it discovered was honestly embarrassing.

What I found

34% of my summarizer calls were retries. The prompt asked for JSON, but the model kept wrapping it in markdown code blocks. My parser rejected it. The retry loop ran the same call again. And again. Each retry cost money. Total waste: about $140/month, from a six-word fix I could have made months ago.

85% of my classifier calls were duplicates. Same input, same output, full price every time. No caching. 723 of 847 weekly calls were completely redundant. A simple cache would have saved $310/month.

My classifier was using GPT-4o for a yes/no task. The output was always under 10 tokens…
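To make the first two findings concrete, here is a minimal sketch of what the two fixes could look like on the client side. It is not the author's tool or the "six-word fix" mentioned above (which reads like a prompt change); it is a defensive alternative. The names `parse_json_reply`, `ResponseCache`, and `call_fn` are hypothetical and chosen for illustration, and the cache assumes the classifier is deterministic enough that identical input should return the identical answer.

```python
import hashlib
import json
import re

# Matches a reply that is wrapped entirely in a markdown code fence,
# with an optional language tag such as "json" after the opening fence.
_FENCE = re.compile(r"^\s*`{3}[\w-]*[ \t]*\n(.*?)\n?\s*`{3}\s*$", re.DOTALL)


def parse_json_reply(text):
    """Parse a model reply as JSON, tolerating a surrounding markdown fence.

    Unwrapping the fence locally avoids paying for a retry when the
    payload itself was valid JSON.
    """
    match = _FENCE.match(text)
    if match:
        text = match.group(1)
    return json.loads(text)


class ResponseCache:
    """In-memory cache keyed by a hash of (model, prompt).

    Assumption: same model and same prompt should give the same answer,
    which is reasonable for a yes/no classifier but not for creative tasks.
    """

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_call(self, model, prompt, call_fn):
        # call_fn(model, prompt) is a placeholder for the real API call.
        raw_key = f"{model}\x00{prompt}".encode("utf-8")
        key = hashlib.sha256(raw_key).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = call_fn(model, prompt)
        self._store[key] = result
        return result
```

In use, the API call would be wrapped as `cache.get_or_call(model, prompt, call_fn)`, and the `hits` and `misses` counters give a rough view of how many calls were redundant. A production version would likely want a persistent store and an expiry policy, which this sketch leaves out.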
