Competition among AI vendors is clearly yielding excellent results. OpenAI recently released GPT-5 Codex, a version of its model for working with code, and just the other day Anthropic updated Claude Sonnet to version 4.5 to keep up with its competitor.
And it succeeded: judging by the benchmarks, a new champion now leads in coding-related tasks. Claude Sonnet is showing excellent results in development, reclaiming its position as the market standard for engineers.
So, what's new from Anthropic:
🔹 Most importantly, on the software engineering benchmark SWE-bench Verified, the model scored 77.2%, with an average gain across benchmarks of ~3%. This benchmark measures a neural network’s ability to act as an “autonomous engineer,” to the extent that AI is even capable of such a thing.
🔹 A strong emphasis on safety: Sonnet 4.5 is less prone to hallucinations, sycophancy (“flattery”), and generating dangerous content. And since Sonnet is primarily used for code generation, fewer hallucinations mean fewer headaches and less debugging for vibe coders.
🔹 In tests, it set a record: it worked for ~30 consecutive hours to build an application of Slack-level complexity. This again points to significantly better coherence over long generation runs. Add the fact that Sonnet hallucinates less, and the resulting code quality improves noticeably.
Who will benefit from this and why:
💰 Surprisingly, anyone working in finance: this model showed excellent results in analyzing financial documents. Just remember that the conclusions of any neural network need to be double-checked.
👨‍💻 Developers: for automatically generating parts of applications, refactoring, and writing new modules.
🏗 Architects/technical leaders: for system design, generating APIs and database schemas.
⚡️ DevOps/infrastructure engineers: for generating scripts, setting up CI/CD, and Terraform/Ansible templates.
🚀 Startup teams/MVPs: to speed up prototyping and automate parts of development. Claude has been a top-tier solution for prototyping since version 3.7, and the new version is excellent as well.
📚 Researchers/analysts: for processing large texts, documenting, and conducting code reviews and audits.
When to expect it and which plans include it:
Sonnet 4.5 is available in two versions: the basic version, available starting with the Standard plan, and Sonnet 4.5 High, which features an expanded context window and is available starting with the Premium plan.
We’ve attached benchmark charts to the post, where you can see the model’s performance compared to its competitors.
You can try it as usual
in the chatbot @chat_ai_tg_bot
and on the website chataibot.pro