Eighteen months ago, “use an open-weight model” was a sentence you said mostly to make an ideological point. With Llama 3 70B and Mistral Large now genuinely competitive with the previous generation of frontier models on the tasks most businesses care about, the argument has quietly shifted from ethics to economics. That makes it worth revisiting from a practitioner’s chair.
Nothing we’re about to say is a blanket recommendation. Closed-weight frontier models from Anthropic, OpenAI and Google still win on the hardest long-reasoning tasks, on tool-use benchmarks, and on the frontier cognitive work that most enterprise use cases don’t actually need. The case for open weights isn’t that they’re uniformly better — it’s that the quality gap has narrowed enough that the other trade-offs start to matter.

Five situations where open-weight models are now the pragmatic choice:
- Regulated industries with data-residency obligations. Financial services, healthcare and public-sector customers increasingly need a deployment story in which no prompt leaves their tenant. Self-hosted Llama or Mistral in a VPC is the cleanest answer you can put in front of your legal team.
- High-volume, predictable workloads. At a certain number of tokens per month, the unit economics flip. A well-utilised H100 running an open model quietly undercuts a per-token API invoice — sometimes by an order of magnitude — as long as you have the ops maturity to run the server without drama.
- Domain-specific fine-tuning. If your task is narrow enough (medical coding, legal contract classification, a specific schema extraction), a 7B or 13B model fine-tuned on your data often beats a generalist frontier model and runs on modest hardware. This is not a research opinion; it’s what we’ve shipped.
- Edge and latency-critical paths. Frontier APIs live in a handful of regions. If your users are in Jakarta or Johannesburg, a right-sized open model in-region will beat the round-trip latency of any closed API — sometimes by several hundred milliseconds that matter.
- Vendor-risk-averse buyers. Procurement teams at larger enterprises are increasingly uncomfortable depending on a single model provider for a core capability. Even if you run closed models today, having an in-house open-weight fallback path in your architecture is table stakes for an enterprise sales conversation this year.
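The "unit economics flip" in the second bullet can be sketched with back-of-envelope arithmetic. Every figure here (H100 rental price, sustained throughput, blended API rate) is an illustrative assumption for the sketch, not a vendor quote; plug in your own numbers:

```python
import math

# Illustrative assumptions, not vendor quotes.
GPU_HOURLY_USD = 4.00                       # assumed all-in cost of one rented H100
GPU_MONTHLY_USD = GPU_HOURLY_USD * 24 * 30  # a reserved GPU bills whether busy or idle
GPU_TOKENS_PER_SEC = 2_000                  # assumed sustained, well-batched throughput
API_USD_PER_MTOK = 5.00                     # assumed blended API price per million tokens

# Tokens one GPU can serve in a month, in millions (Mtok).
GPU_CAPACITY_MTOK = GPU_TOKENS_PER_SEC * 3600 * 24 * 30 / 1e6

def monthly_cost_api(mtok: float) -> float:
    """Per-token API: cost scales linearly with traffic."""
    return mtok * API_USD_PER_MTOK

def monthly_cost_self_hosted(mtok: float) -> float:
    """Self-hosted: cost steps up per reserved GPU, regardless of utilisation."""
    gpus = max(1, math.ceil(mtok / GPU_CAPACITY_MTOK))
    return gpus * GPU_MONTHLY_USD

# Traffic level at which one reserved GPU costs the same as the API bill.
breakeven_mtok = GPU_MONTHLY_USD / API_USD_PER_MTOK  # 576 Mtok/month under these numbers

# At full utilisation the per-token gap is roughly an order of magnitude,
# matching the claim in the bullet above (~$0.56/Mtok vs $5/Mtok here).
self_hosted_per_mtok_at_capacity = GPU_MONTHLY_USD / GPU_CAPACITY_MTOK
```

Under these assumed numbers, traffic below roughly 576 Mtok a month favours the API; above it, the reserved GPU wins, and the advantage widens toward an order of magnitude as utilisation approaches capacity. The step function in the self-hosted cost is the point: you pay for the GPU whether or not it is busy, which is why the bullet stresses ops maturity.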

None of this is a push to replace your current stack. Most teams should still start with a closed frontier API — it’s the fastest path to a working product, and the economics only start to matter after the feature has earned the right to exist. But the serious architecture conversation now includes an open-weight deployment path, and the teams ignoring that conversation are quietly accumulating risk that becomes expensive to unwind later.
