Anthropic launches Claude Opus 4.5, a model that targets the real problems of LLM agents in production: context saturation, infinite ping-pong, fragile JSON, and high costs.
Anthropic has just launched Claude Opus 4.5, and for the first time in a while I see a model that not only promises to be smarter, but also tackles the real problems we run into when putting LLMs in production, especially in agentic systems.
The real problem: Current agents are inefficient
If you’ve tried to build something more complex than a chatbot, you’ll know that LLM agents have serious problems:
1. Context saturation: You pass the model the documentation for 50 available tools (GitHub, Notion, Slack, JIRA…) and half the context goes to listing APIs it might never use.
2. Infinite ping-pong: The model calls a tool, your app waits for the response and calls the model again with the result, the model calls another tool… In long tasks this is brutal for latency and cost.
3. Fragile JSON: Model-app communication via tool calling is error-prone. One malformed JSON payload and your workflow crashes.
4. Sky-high cost: Powerful models spend tokens like there’s no tomorrow. A development task can easily consume hundreds of thousands of tokens.
Opus 4.5 attacks these four problems head-on.
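To make problem 3 concrete, here is a minimal sketch of the defensive parsing many of us end up writing around tool calls. The function name `parse_tool_arguments` and the required-field check are my own illustration, not part of any SDK. The idea is to turn a malformed payload into an error message you can send back to the model instead of letting the workflow crash.

```python
import json
from typing import Optional, Set, Tuple


def parse_tool_arguments(raw: str, required: Set[str]) -> Tuple[Optional[dict], Optional[str]]:
    """Return (arguments, None) on success or (None, error_message) on failure.

    Hypothetical helper: the name and the required-field check are assumptions.
    """
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Malformed JSON: report it instead of raising into the workflow.
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(args, dict):
        return None, "arguments must be a JSON object"
    missing = required - set(args)
    if missing:
        return None, f"missing fields: {sorted(missing)}"
    return args, None


# A truncated payload yields an error string rather than an exception.
args, error = parse_tool_arguments('{"title": "Bug"', {"title"})
print(args, error)
```

The error string can be returned to the model as the tool result so it has a chance to retry with valid arguments.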
Tool Search Tool: Load tools on demand
Instead of loading every tool definition into the initial context, Opus 4.5 arrives with the Tool Search Tool.
The model can search and load tool documentation only when needed. Working with a GitHub repo? Load the GitHub API. Need to create a JIRA ticket? Load JIRA at that moment.
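The on-demand idea can be sketched locally in a few lines. This is a toy illustration of the pattern, not Anthropic's implementation or API: `ToolSpec`, `REGISTRY`, `search_tools`, and `build_context` are hypothetical names, and the keyword-overlap scoring is an assumption standing in for whatever search the real feature uses.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ToolSpec:
    # Hypothetical, minimal stand-in for a full tool definition.
    name: str
    description: str
    keywords: Tuple[str, ...]


REGISTRY = [
    ToolSpec("github_create_pr", "Open a pull request in a GitHub repo",
             ("github", "repo", "pull", "pr")),
    ToolSpec("jira_create_ticket", "Create a JIRA ticket",
             ("jira", "ticket", "issue")),
    ToolSpec("slack_post_message", "Post a message to a Slack channel",
             ("slack", "message", "channel")),
]


def search_tools(query: str, registry: List[ToolSpec] = REGISTRY, limit: int = 2) -> List[ToolSpec]:
    """Return up to `limit` tools whose keywords overlap the query words."""
    words = set(query.lower().split())
    scored = [(len(words & set(tool.keywords)), tool) for tool in registry]
    hits = [pair for pair in scored if pair[0] > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)  # best match first
    return [tool for _, tool in hits[:limit]]


def build_context(base_prompt: str, loaded: List[ToolSpec]) -> str:
    """Only the tools that were actually found are added to the prompt."""
    lines = [base_prompt] + [f"- {tool.name}: {tool.description}" for tool in loaded]
    return "\n".join(lines)


print(build_context("Tools:", search_tools("create a jira ticket")))
```

With this shape, the registry can grow to hundreds of entries while the prompt only ever carries the handful that matched the current step.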
Why this matters
Fewer wasted tokens, cleaner and more focused context, and scalability: you can have hundreds of tools available without saturating the prompt.
Efficiency: 65% fewer tokens
Anthropic says Opus 4.5 uses up to 65% fewer tokens than its previous models to complete the same task.
Since you pay per token, this dramatically changes the cost equation for production agents.
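As a back-of-the-envelope check, here is the arithmetic for a single task. The 200,000-token baseline and the price of $10 per million tokens are hypothetical numbers chosen for illustration, not published pricing, and 65% is the best case ("up to"), so treat the result as an upper bound on savings.

```python
def tokens_after_reduction(baseline_tokens: int, reduction_pct: int) -> int:
    """Tokens left after cutting `reduction_pct` percent (integer math)."""
    return baseline_tokens * (100 - reduction_pct) // 100


def cost_usd(tokens: int, usd_per_million: float) -> float:
    """Cost of `tokens` at a flat per-million-token price."""
    return tokens * usd_per_million / 1_000_000


BASELINE_TOKENS = 200_000   # hypothetical task size
PRICE_PER_MILLION = 10.0    # hypothetical price, not real pricing

reduced = tokens_after_reduction(BASELINE_TOKENS, 65)
print(reduced)
print(round(cost_usd(BASELINE_TOKENS, PRICE_PER_MILLION), 2))
print(round(cost_usd(reduced, PRICE_PER_MILLION), 2))
```

Under these assumptions the task drops from 200,000 to 70,000 tokens, or from $2.00 to $0.70. Multiply that by thousands of agent runs per day and the difference becomes a budget line.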
What do you think?
Have you experimented with agents in production? What have been your main pain points?
