Engineering

Slow Database Queries? Are Database Types to Blame?

Last Updated: December 19, 2019
Ben Fritsch

This blog post is adapted from a lightning talk by Ben Fritsch at Ruby on Ice 2019.

There can be many reasons why your application performs poorly, but perhaps none are as challenging as issues stemming from your database. If your database’s response times tend to be high, they can strain both your network and your users’ patience. The usual culprit for a slow database is an inefficient query being executed somewhere in your application logic. You can often fix it in one of several common ways, by:

reducing the number of open locks (or …

The Curious Case of the Table-Locking UPDATE Query

Last Updated: June 03, 2024
Richard Schneeman

Update: On closer inspection, the lock type was not on the table, but on a tuple. For more information on this locking mechanism, see the internal PostgreSQL tuple locking documentation. Postgres does not have lock promotion as suggested in the debugging section of this post.

I maintain an internal-facing service at Heroku that does metadata processing. It’s not real-time, so there’s plenty of slack for when things go wrong. Recently I discovered a Postgres performance issue that bogged down the system to the point where no jobs were being executed at all. After hours of debugging, I found that …

Let It Crash: Best Practices for Handling Node.js Errors on Shutdown

Last Updated: December 18, 2019
Julián Duque

This blog post is adapted from a talk given by Julián Duque at NodeConf EU 2019 titled "Let it crash!"

Before coming to Heroku, I did some consulting work as a Node.js solutions architect. My job was to visit various companies and make sure that they were successful in designing production-ready Node applications. Unfortunately, I witnessed many different problems when it came to error handling, especially on process shutdown. When an error occurred, there was often not enough visibility into why it happened, a lack of logging details, and bouts of downtime as applications attempted to recover …

Static Typing in Ruby with a Side of Sorbet

Last Updated: October 31, 2019
Jason Draper

As an experiment to see how static typing could help improve our team’s Ruby experience, we introduced Sorbet into a greenfield codebase with a team of four developers. Our theory was that adding static type checking through Sorbet could help us catch bugs before they go into production, make refactoring easier, and improve the design of our code. The short answer is that yes, it did all of that! Read on to learn a little more about what it was like to build in a type-safe Ruby.

Static typing vs dynamic typing

Ruby is a …

Automated Continuous Deployment at Heroku

Last Updated: April 29, 2024
Bernerd Schaefer

Over the past four years, the Heroku Runtime team has transitioned from occasional, manual deployments to continuous, automated deployments. Changes are now rolled out globally within a few hours of being merged, without any human intervention. It's been an overwhelmingly positive experience for us. This post describes why we decided to make the change, how we did it, and what we learned along the way.

Where We Started

Heroku’s Runtime team builds and operates Heroku’s Private Space (single-tenant) and Common Runtime (multi-tenant) platforms, from container orchestration to routing and logging. When I joined the team in 2016, the process for …

Up to 75% Faster Maintenances with Heroku Postgres and Redis Premium Plans

Last Updated: August 28, 2019
Corey Purcell

As outlined in a previous blog post, Heroku Data services undergo routine maintenances for security and patching. In this post, we describe the process used to minimize downtime for Heroku Postgres and Heroku Redis premium ‘High Availability’ plans and how we optimized the process to perform up to 75% faster.

Data Services Architecture

High availability plans for Postgres and Redis are designed to have two database instances running at the same time. One is a writeable primary database server and the other is a read-only hidden standby. Since the standby is hidden, customers cannot access it during normal operations.

…

Designing for Accessibility: Contrast Ratio

Last Updated: May 16, 2024
Ariana Escobar, Jamie White

This is the second post in a two-part series about accessibility. The first post shares why designing for accessibility is important to us and why we encourage you to incorporate it into your software design process.

Heroku’s first accessibility initiative was to reach Level AA for luminance contrast ratio as defined by the internationally recognized best practices of the Web Content Accessibility Guidelines (WCAG) 2.0. This ratio guarantees the legibility of text against its background, in order to ensure all users can perceive Heroku’s user interfaces equally.
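As a rough illustration of what the Level AA check involves, the sketch below computes the WCAG 2.0 contrast ratio between two sRGB colors. It follows the relative luminance and contrast ratio definitions from the WCAG 2.0 specification, with the AA thresholds of 4.5:1 for normal text and 3:1 for large text. The function names are hypothetical and are not taken from Heroku's tooling.

```python
# Minimal sketch of the WCAG 2.0 contrast ratio; function names are illustrative.

def _linearize(channel):
    """Convert one 8-bit sRGB channel (0-255) to its linear-light value."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (r, g, b) tuple, as defined by WCAG 2.0."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg, bg):
    """Contrast ratio between two colors, from 1.0 (identical) up to 21.0."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_level_aa(fg, bg, large_text=False):
    """True if the pair meets WCAG 2.0 Level AA (4.5:1, or 3:1 for large text)."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(fg, bg) >= threshold


if __name__ == "__main__":
    white = (255, 255, 255)
    grey = (119, 119, 119)  # #777777, a common "almost passes" grey
    print(round(contrast_ratio(grey, white), 2))
    print(meets_level_aa(grey, white))
    print(meets_level_aa(grey, white, large_text=True))
```

A mid-grey such as `#777777` on white falls just short of 4.5:1, so it would only qualify at Level AA when used for large text.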

This benefits people with color-vision deficiencies (like deuteranopia or protanopia, which affect …

Dataclips Power Insights at Heroku

Last Updated: June 03, 2024
Becky Jaimes

Every organization needs to be data-driven in order to be successful. Whether you're tracking an application's performance, incoming support tickets, or revenue rates, different components of any company depend on metrics that inform the health of the business.

At Heroku, we're hackers to the core, but that doesn't mean we're all programmers. We build on top of our own platform for everything we do, and one of the products we often use is Heroku Dataclips. If you haven't heard of them before, Heroku Dataclips allow you to create SQL queries in a web GUI that run on your Heroku Postgres …

Puma 4: Hammering Out H13s—A Debugging Story

Last Updated: July 12, 2019
Richard Schneeman

For quite some time we've received reports from our larger customers about a mysterious H13 – Connection closed error showing up for Ruby applications. Curiously, it only ever happened around the time they were deploying or scaling their dynos. Even more peculiar, it only happened to relatively high-scale applications. We couldn't reproduce the behavior on an example app. This is a story about distributed coordination, the TCP API, and how we debugged and fixed a bug in Puma that only shows up at scale.

Connection closed

First of all, what even is an H13 error? From our error page …
