Business · 2h ago
Coinbase Halves AI Costs by Routing to Cheaper Models, No Engineer Caps
Coinbase cut AI spend 50% by defaulting engineers to open-weight models GLM 5.2 and Kimi 2.7, improving cache hit rate from 5% to 60%, and routing tasks by complexity. CEO Brian Armstrong shared five levers, noting 91% of engineers never hit old usage limits. The shift signals enterprise budget pressure driving adoption of cheaper models, threatening OpenAI and Anthropic revenue.
Meridian48 take
The real story isn't cost-cutting but the enterprise shift to open-weight models—a trend that could reshape the AI market if quality holds.
Read the full reporting
Coinbase Cut Its AI Spend in Half Without Throttling Engineers - Here's the Playbook →
DEV Community
ai-cost-optimizationopen-weight-models