How I Found $1,240/Month in Wasted LLM API Costs (And Built a Tool to Find Yours)

Source: DEV Community
I was spending about $2,000/month on OpenAI and Anthropic APIs across a few projects. I knew some of it was wasteful. I just couldn't prove it.

The provider dashboards show you one number: your total bill. That's like getting an electricity bill with no breakdown. Is it the AC? The lights? The server room? No idea.

So I built a tool to find out. What it discovered was honestly embarrassing.

What I found

34% of my summarizer calls were retries. The prompt asked for JSON, but the model kept wrapping it in markdown code blocks. My parser rejected it. The retry loop ran the same call again. And again. Each retry cost money. Total waste: about $140/month, from a six-word fix I could have made months ago.

85% of my classifier calls were duplicates. Same input, same output, full price every time. No caching. 723 of 847 weekly calls were completely redundant. A simple cache would have saved $310/month.

My classifier was using GPT-4o for a yes/no task. The output was always under 10 tokens —