Token Debt: Why FinOps for Agentic AI Is an Engineering Problem, Not a Model Choice

18 min read
David Sanchez

Why the next chapter of FinOps is not about finding a cheaper model. It is about engineering systems that do not waste the tokens they already have.

A finance leader opens the monthly invoice for the company's AI platform and finds a number that does not match any story anyone can tell. Usage grew modestly. The bill grew sharply. Nobody switched to a pricier model. Nobody approved a new integration that anyone remembers. The line item simply grew on its own, the way cloud bills used to grow before anyone built a discipline around watching them.

Ask the engineering team what happened and the answer is rarely a single cause. It is a hundred small decisions: a system prompt that grew every time someone patched in a new rule, a retrieval step that fetches ten documents when two would do, an agent that retries a failing tool call five times before giving up, a workflow that passes a conversation among three specialized agents and resends the full history at every handoff. None of these decisions looked expensive in isolation. Together, they are the bill.
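To make the compounding concrete, here is a rough back-of-the-envelope sketch. Every number in it is a hypothetical assumption chosen for illustration (prompt sizes, tokens per document, tokens per tool call), and the model is deliberately simplified: each agent in the chain is assumed to receive the full context again, and each tool attempt is assumed to cost the same. The class and function names are made up for this example and are not measurements from any real system.

```python
from dataclasses import dataclass


@dataclass
class RequestProfile:
    """Hypothetical per-request token profile; all values are illustrative."""
    system_prompt_tokens: int
    docs_retrieved: int
    tokens_per_doc: int
    history_tokens: int
    agents: int            # agents that each receive the full context
    tool_attempts: int     # attempts per tool call (1 means no retries)
    tool_call_tokens: int  # assumed input tokens consumed per attempt


def input_tokens(p: RequestProfile) -> int:
    """Estimate input tokens for one request under the simplified model."""
    # Context that is resent to every agent in the chain.
    context = (
        p.system_prompt_tokens
        + p.docs_retrieved * p.tokens_per_doc
        + p.history_tokens
    )
    return context * p.agents + p.tool_attempts * p.tool_call_tokens


# A lean request: short prompt, two documents, one agent, no retries.
lean = RequestProfile(500, 2, 400, 1000, agents=1, tool_attempts=1,
                      tool_call_tokens=300)

# The same request after the small decisions pile up: a patched-up prompt,
# ten documents, three agents with full history, five tool attempts.
bloated = RequestProfile(1500, 10, 400, 1000, agents=3, tool_attempts=5,
                         tool_call_tokens=300)

lean_total = input_tokens(lean)
bloated_total = input_tokens(bloated)
print(f"lean: {lean_total} tokens, bloated: {bloated_total} tokens, "
      f"ratio: {bloated_total / lean_total:.1f}x")
```

Under these assumed numbers the bloated request uses roughly eight times the input tokens of the lean one, even though no single change looks dramatic on its own.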
