Most companies are flying blind on AI spending as token billing spreads

The blindness goes mainstream

The uncomfortable part of the current AI spending story is not that bills are rising. It is that many companies still cannot reliably see why they are rising.

According to an as-yet-unpublished KPMG survey cited by the Wall Street Journal and summarized by The Decoder, only 26% of companies have full visibility into their AI costs. Another 50% report limited oversight, while 22% say they have no transparency at all, or learn what they used only after the invoice arrives. That is not a niche governance issue. It is what happens when a new consumption model scales faster than the systems built to monitor it.

The shift matters because AI is no longer a small experimental line item tucked inside a lab budget. It is moving into production products, internal copilots, retrieval systems, agents, and workflow automation. Once that happens, AI stops behaving like a one-off software purchase and starts behaving like a metered utility with highly variable demand. Finance teams are being asked to forecast, cap, and justify spend in a regime where usage can spike with product adoption, prompt length, model choice, and downstream tool calls.

KPMG’s Steve Chase told the WSJ that this is “a new resource that needs to be managed that didn’t exist quite that way,” and that firms are seeing exponential growth. That language is doing a lot of work. The challenge is not simply that AI is expensive. It is that token-based billing introduces cost behavior that is granular, dynamic, and often only legible after the fact.

How token-based billing changes the math

Token-based pricing makes AI consumption measurable in a technical sense, but not necessarily controllable in a financial one.

Unlike seat-based software or a fixed enterprise contract, token billing ties cost directly to usage volume. That sounds clean until you consider what actually drives tokens: prompt size, system instructions, context windows, retrieval payloads, whether the model emits reasoning tokens, tool execution loops, output length, retries, and model routing decisions. A product team can ship the same feature and see wildly different cost profiles depending on how users behave and how the orchestration layer is configured.

That is why token pricing complicates forecasting. A monthly spend model built for traditional SaaS assumes relatively stable marginal cost. Token-based AI usage can behave more like cloud compute with product-level demand volatility layered on top. If a feature goes viral, if a new workflow causes longer prompts, or if an agent loops through multiple tool calls, the bill can climb before anyone notices.
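To make the arithmetic concrete, here is a minimal Python sketch of how per-request cost scales with prompt size, output length, and the number of calls an agent makes. The prices and token counts are invented for illustration and do not reflect any vendor's actual rates; the names Price and request_cost are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Hypothetical USD prices per million tokens (not real vendor rates)."""
    input_per_million: float
    output_per_million: float


def request_cost(price: Price, input_tokens: int, output_tokens: int, calls: int = 1) -> float:
    """Cost of `calls` model invocations that each use the given token counts."""
    per_call = (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    ) / 1_000_000
    return per_call * calls


# Assumed prices: $3 per million input tokens, $15 per million output tokens.
PRICE = Price(input_per_million=3.0, output_per_million=15.0)

# A short single-turn prompt versus an agent that makes five tool-augmented
# calls, each carrying a much larger context.
short_prompt = request_cost(PRICE, input_tokens=500, output_tokens=200)
agent_loop = request_cost(PRICE, input_tokens=6_000, output_tokens=400, calls=5)

print(f"short prompt: ${short_prompt:.4f}")
print(f"agent loop:   ${agent_loop:.4f}")
```

Under these assumed numbers the agent-style request costs more than 25 times the short prompt, even though both could sit behind the same product feature.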

The Decoder’s summary of the KPMG survey points to exactly that problem: companies are not just lacking cost discipline, they are lacking the telemetry needed to build it. Without near-real-time visibility into token consumption by model, service, tenant, feature, or customer cohort, finance is left reconciling the invoice after deployment decisions have already been made.

That creates a very specific governance problem. If the organization cannot observe marginal cost at the level where product and engineering decisions are made, then the default response is either overrestriction or undercontrol. One slows experimentation. The other invites budget overruns.

The operational risk is not abstract

The main danger of opaque AI spend is not only that the CFO is surprised. It is that product decisions begin to drift away from economic reality.

If teams cannot see which features are consuming the most tokens, they cannot easily determine whether the usage is creating enough value to justify the cost. That leads to several predictable failures:

Mispriced features. AI-enabled capabilities may be bundled into products without a clear unit-economics model, especially if teams treat inference as an infrastructure detail rather than a product cost.
Poor rollout decisions. A team may launch a feature broadly before understanding how usage patterns scale the bill, then have to throttle rollout after the cost curve appears.
Cross-subsidy without consent. One product line can quietly absorb the cost of another if there is no chargeback or allocation logic.
Experimentation friction. When finance cannot trace spend back to specific experiments or deployments, it tends to respond conservatively, slowing the very testing that makes AI systems better.

The WSJ reporting, as relayed by The Decoder, describes companies burning through annual token and cloud budgets within months. KPMG has already seen one client with a sixfold spike in token usage. Whether that becomes an outlier or a pattern depends less on the raw price of inference than on whether companies can connect usage to business outcomes fast enough to intervene.

This is where the comparison to the cloud boom becomes useful, but only up to a point. Cloud cost overruns were often a result of poor tagging, weak visibility, and permissive provisioning. AI adds a second layer of complexity: the cost driver is not just infrastructure consumption but language-model interaction itself. That means the control surface must sit closer to the application layer.

A governance playbook for the token era

Companies do not need to stop experimenting with AI to regain control. They need to instrument it like a production dependency.

A workable governance model usually starts with four moves.

1. Build cost telemetry into the product stack

Track token usage by model, endpoint, feature, tenant, environment, and business unit. If possible, capture the prompt and response metadata needed to understand which workflow produced the cost. Finance cannot govern what engineering cannot attribute.
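One way to picture this kind of attribution is a usage record tagged at request time and rolled up along whichever dimension finance or engineering asks about. The sketch below is illustrative only: the UsageEvent fields and the tokens_by helper are assumed names, and a real pipeline would likely write these events to a metrics or logging system rather than hold them in memory.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """One model call, tagged with the dimensions used for attribution."""
    model: str
    feature: str
    tenant: str
    input_tokens: int
    output_tokens: int


def tokens_by(events, dimension):
    """Sum total tokens (input + output) grouped by one attribute of the event."""
    totals = defaultdict(int)
    for event in events:
        totals[getattr(event, dimension)] += event.input_tokens + event.output_tokens
    return dict(totals)


# Made-up sample events for illustration.
events = [
    UsageEvent("small-model", "search", "acme", 400, 100),
    UsageEvent("large-model", "summarize", "acme", 3_000, 500),
    UsageEvent("large-model", "summarize", "globex", 2_000, 500),
]

by_feature = tokens_by(events, "feature")
by_tenant = tokens_by(events, "tenant")
print(by_feature)
print(by_tenant)
```

The same events answer both "which feature is expensive?" and "which tenant is expensive?" only because the tags were captured when the call was made.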

2. Define unit economics early

Tie AI spend to an operational denominator: per ticket resolved, per document summarized, per search answered, per customer onboarded, or per workflow completed. Raw token totals are useful, but unit economics are what let leadership decide whether the spend is justified.
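The calculation itself is simple division; the hard part is having both numbers for the same period and scope. A minimal sketch, with a hypothetical cost_per_unit helper and invented figures:

```python
from typing import Optional


def cost_per_unit(total_spend_usd: float, units_completed: int) -> Optional[float]:
    """AI spend divided by a business denominator, or None if nothing completed."""
    if units_completed <= 0:
        return None  # Avoid dividing by zero; the metric is undefined here.
    return total_spend_usd / units_completed


# Invented example: $1,200 of inference spend against 4,000 resolved tickets.
per_ticket = cost_per_unit(1_200.0, 4_000)
print(f"cost per ticket resolved: ${per_ticket:.2f}")
```

A figure like this only becomes meaningful next to what a resolved ticket is worth, or what it cost before the AI feature existed.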

3. Set budget guardrails and chargeback rules

Use soft limits for experimentation and hard limits for production where the risk justifies it. Chargeback or showback mechanisms help ensure that teams feel the marginal cost of the features they ship. That does not mean every project should be optimized to the penny. It does mean every team should know who owns the bill.
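The soft-versus-hard distinction can be sketched as a check that runs before each request is sent. Everything here is an assumption for illustration: the Decision enum, the check_budget function, and the idea that spend so far and the estimated request cost are available at call time.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    WARN = "warn"    # Soft limit crossed: let the call through, but alert the owner.
    BLOCK = "block"  # Hard limit crossed: refuse the call.


def check_budget(
    spent_usd: float,
    estimated_cost_usd: float,
    soft_limit_usd: float,
    hard_limit_usd: Optional[float] = None,
) -> Decision:
    """Classify a pending request against a team's budget for the period."""
    projected = spent_usd + estimated_cost_usd
    if hard_limit_usd is not None and projected > hard_limit_usd:
        return Decision.BLOCK
    if projected > soft_limit_usd:
        return Decision.WARN
    return Decision.ALLOW


# An experiment with only a soft limit keeps running but raises a warning.
print(check_budget(95.0, 10.0, soft_limit_usd=100.0))
# A production service with a hard cap is stopped at the cap.
print(check_budget(495.0, 10.0, soft_limit_usd=400.0, hard_limit_usd=500.0))
```

Leaving hard_limit_usd unset models the experimentation case, where the goal is awareness rather than enforcement.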

4. Make model choice a governance decision, not just an engineering one

Routing traffic to a smaller or cheaper model, shortening context windows, caching responses, reducing retries, or limiting agent loops can materially change spend. Those are architecture decisions with financial consequences, so they should be visible to product, engineering, and finance together.
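As one example of such a lever, the sketch below wraps a model call in an exact-match response cache and counts how many calls would actually be billed. It is a simplified illustration: CachedModel and fake_model are hypothetical names, the stand-in model just uppercases its input, and real caching would also need expiry rules and care around prompts that contain user-specific data.

```python
import hashlib


class CachedModel:
    """Wraps a model call so identical prompts are only billed once."""

    def __init__(self, call_model):
        self._call_model = call_model
        self._cache = {}
        self.billable_calls = 0

    def complete(self, prompt: str) -> str:
        # Key on a hash of the exact prompt text; any change is a cache miss.
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._call_model(prompt)
            self.billable_calls += 1
        return self._cache[key]


def fake_model(prompt: str) -> str:
    """Stand-in for a real model client; returns the prompt in upper case."""
    return prompt.upper()


model = CachedModel(fake_model)
for prompt in ["reset password", "reset password", "export invoice"]:
    model.complete(prompt)

print(f"requests: 3, billable calls: {model.billable_calls}")
```

Whether a cache like this is acceptable depends on how fresh and how personalized responses need to be, which is exactly why the choice belongs in front of product and finance as well as engineering.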

Vendors can help here, but only if they expose the right controls. Telemetry, per-request cost estimates, budget alerts, spend caps, and audit logs should be treated as baseline features rather than premium extras. The companies that can show where each token went will have a much easier time defending AI budgets than the ones that only provide a monthly total.

What this means for product rollout

The visibility gap is already influencing how AI products get shipped.

Teams that can monitor usage at a granular level can roll out features in controlled stages, compare model options, and decide whether to subsidize AI functionality as a growth expense or charge for it directly. Teams that cannot see the spend curve are more likely to pause launches, cap usage aggressively, or avoid ambitious deployments altogether.

That has strategic implications. If AI features are expensive but invisible, product managers may overestimate demand because adoption looks good while the cost remains hidden. Or they may underestimate strategic value because finance sees only the bill, not the revenue, retention, or labor-savings effect. Either way, the organization risks making deployment decisions on partial information.

The emerging advantage, then, is not just access to better models. It is the ability to operate them with financial observability.

What vendors will have to fix

The market should expect vendors to compete on visibility as much as on model quality.

If token billing is becoming the default, buyers will increasingly ask for richer usage logs, usage-based forecasts, budget alarms, and tooling that can attribute consumption to business units. That may seem like plumbing, but plumbing becomes a differentiator when cost surprises can halt deployment.

KPMG’s survey figures are a warning sign for the vendor ecosystem too. If roughly three-quarters of companies (74%, by the survey's numbers) do not have full visibility into AI costs, then product vendors, model providers, and cloud platforms have an opportunity, and arguably a responsibility, to make that visibility native. Otherwise, they risk selling into a market where adoption is strong but trust in the billing layer remains weak.

The companies that figure this out first will not necessarily be the ones using the most AI. They will be the ones that can measure it well enough to keep expanding it without losing control of the budget.
