Search overlay panel for performing site-wide searches

Boost Performance & Scale with Postgres Advanced. Join Pilot Now!

Whats New in Heroku AI: New Models and a Flexible Standard Plan

Heroku is introducing significant updates to Managed Inference and Agents. These changes focus on reducing developer friction, expanding model catalogue, and streamlining deployment workflows.

More flexibility with the new standard plan

Until now, Heroku’s model-based plans required developers to provision a specific add-on for a specific model. This created significant operational overhead. If you wanted to experiment with a different model or implement a fallback strategy, you had to provision a new add-on and manage multiple config variables.

We have added a new standard plan for Heroku Managed Inference and Agents.

With this update, a single add-on and a single API key grant access to our entire catalog of supported models. You no longer need to reprovision resources to switch from a smaller model to a high-reasoning model. Instead, you simply update the model name in your code. This unified approach improves developer experience and allows for more robust application architectures. Try the standard mode using the following CLI command:

$ heroku addons:create heroku-inference:standard -a $APPNAME

New frontier models and an expanded open-weight catalog

Claude 4.6 models

Heroku now supports the Claude 4.6 family, the most capable models in the Claude family, designed for high-complexity workloads.

  • Claude Opus 4.6: Designed for advanced software development, complex agentic workflows, and long-horizon planning.
  • Claude Sonnet 4.6: High-performing model that is ideal as a daily driver and sophisticated financial analysis.

Open-weight models

We have also expanded our catalog with five new open-weight models to provide more cost-effective options for diverse use cases.

  • DeepSeek v3.2: Advanced model built for high-efficiency agentic reasoning and long-context understanding.
  • Kimi K2.5: Optimized for massive context processing, advanced mathematical reasoning, and complex agent swarms.
  • MiniMax M2.1: Specialized for practical engineering and multi-language full-stack application building.
  • ZAI GLM 4.7: Industry-leading model for reliable tool-calling and vibe coding visually aesthetic front-ends.
  • ZAI GLM 4.7 Flash: A lightweight model optimized for speed, cost-efficiency, and agentic workflows where responsiveness is critical.

Embed models

We are enhancing our support for vector-based search and retrieval with a new Cohere Embed V4 model. The latest generation of Cohere’s embedding technology is built for higher accuracy and complex document analysis.

  • Cohere Embed V4: Specifically designed to understand conceptual relationships rather than just keyword matching.

Model deprecation notice

As we transition to these next-generation models, we are beginning the deprecation process for older versions, including Claude 3.5, Claude 3.7, and Claude 4. Users are encouraged to migrate to Claude 4.5 and 4.6 to ensure continued support and optimal performance.

Build better with Heroku AI

The shift to a standard plan and the addition of new frontier models like Claude Opus 4.6 represent Heroku’s commitment to providing access to a wide model catalogue. By improving developer experience and expanding model choice, we are making it easier than ever to build, scale, and optimize AI-powered applications.

To get started, visit the Heroku Dev Center or provision the new standard plan for Heroku Managed Inference and Agents today.

Ready to Get Started?

Stay focused on building great data-driven applications and let Heroku tackle the rest.

Talk to A Heroku Rep   Sign Up Now

More from the Authors
Heroku AI - PMTS at Heroku
Heroku Staff

Browse the archives for News or all blogs. Subscribe to the RSS feed for News or all blogs.