Ground Control to Major TOML: What Is a TOML File and Why Do Buildpacks Use TOML?
- Last Updated: June 03, 2024
YAML files dominate configuration in the cloud-native ecosystem. They’re used by Kubernetes, Helm, Tekton, and many other projects to define custom configurations and workflows. But YAML has its oddities, which is why the Cloud Native Buildpacks project chose TOML as its primary configuration format.
What is TOML?
TOML stands for Tom’s Obvious, Minimal Language. It’s a configuration file format created to be simple, readable, and predictable. Designed for humans first and machines second, TOML avoids syntactic ambiguity and is easy to understand even at a glance.
The format uses a key-value structure and supports sections, arrays, and nested tables. Its goal is to be as clear and minimal as possible while still expressive enough for most configuration tasks. Unlike JSON, TOML allows comments. Unlike YAML, it doesn’t rely on indentation for structure, which reduces the likelihood of formatting errors.
TOML has gained adoption in modern developer tools. The Rust package manager Cargo and the Python dependency tool Poetry both use TOML. In the cloud-native ecosystem, containerd and Cloud Native Buildpacks also rely on TOML for configuration.
What is a TOML file?
A TOML file is a plain text file with a .toml extension that stores structured configuration data using TOML syntax. It defines settings using simple key-value pairs, and organizes related settings using headers and tables. You can learn more about TOML from the official documentation, but a simple buildpack TOML file looks like this:
api = "0.2"
[buildpack]
id = "heroku/maven"
version = "1.0"
name = "Maven"
Key features of TOML files include:
- Easy-to-read syntax with no reliance on whitespace or indentation
- Support for comments using the # symbol
- Clear data typing with support for strings, integers, floats, booleans, arrays, and date-times
- Append-friendly design that allows tools to modify files without full parsing
Heroku Buildpacks use TOML because it offers a low-friction configuration experience. Most developers working with Buildpacks won’t need to write a TOML file from scratch, but when configuration is necessary, TOML offers clarity without complexity.
It’s also easy for machines to read and write; you can even append to a TOML file without reading it first, which makes it a great data interchange format. But data interchange and machine readability aren’t the main reasons the Buildpacks project uses TOML; human readability is.

TOML vs. YAML
TOML and YAML are both designed for configuration, but they take different approaches. TOML emphasizes simplicity and unambiguous syntax. YAML offers more complex features, which can be useful in some situations but come with trade-offs.
Reasons developers choose TOML over YAML:
- Easier to read: TOML uses a straightforward key-value format. YAML relies heavily on indentation, which can lead to formatting errors.
- Less error-prone: Mistakes in whitespace or formatting are less likely to cause parsing failures in TOML.
- Designed for simplicity: TOML avoids advanced features like anchors, aliases, and custom tags found in YAML.
- Clearer typing: TOML uses consistent syntax for strings, numbers, booleans, and arrays. YAML supports multiple representations for the same data types, which can lead to inconsistencies.
While YAML is widely used in infrastructure tools like Kubernetes and Helm, TOML is better suited for lightweight configuration, such as build tools or dependency managers.
TOML vs. JSON
JSON is the most widely used data format for APIs and machine-to-machine communication. However, it’s not ideal for configuration written by humans. TOML fills that gap by being easier to write, read, and maintain.
Here are some of the reasons TOML is a better choice for configuration files that may be read, written, and edited by developers:
- TOML supports comments, allowing developers to annotate configurations. JSON does not.
- TOML is less verbose. It avoids the need for extensive quoting and nested braces.
- TOML is more readable for multi-level configuration, thanks to its table and header syntax.
- TOML is append-friendly, making it easier for tools to update files without reformatting them.
JSON remains a strong choice for structured data interchange between systems, but TOML is better for human-authored configuration. This is why the Buildpacks project uses TOML for developer-facing config and reserves JSON for machine-readable formats like image metadata.
Put your helmet on
The first time you use Buildpacks, you probably won’t need to write a TOML file. Buildpacks are designed to get out of your way, and disappear into the details. That’s why there’s no need for large configuration files like a Helm values.yaml or a Kubernetes pod configuration.
Buildpacks favor convention over configuration, and therefore don’t require complex customizations to tweak the inner workings of their tooling. Instead, Buildpacks detect what to do based on the contents of an application, which means configuration is usually limited to simple properties that are defined by a human.
Buildpacks also favor infrastructure as imperative code (rather than declarative). Buildpacks themselves are functions that run against an application, and are best implemented in higher-level languages, which can use libraries and testing.
All of these properties lend themselves to a simple configuration format and schema that doesn’t define complex structures. But that doesn’t mean the decision to use TOML was simple.
Can you hear me, Major TOML?
There are many other formats the Buildpacks project could have used besides YAML or TOML, and the Buildpacks core team considered all of these in the early days of the project.
JSON has simple syntax and semantics that are great for data interchange, but it doesn’t make a great human-readable format, in part because it doesn’t allow for comments. Buildpacks use JSON for machine-readable config, like the OCI image metadata. But it shouldn’t be used for anything a human writes.
XML has incredibly powerful properties including schema validation, transformation tools, and rich semantics. It’s great for markup (like HTML) but it’s much too heavy a format for what Buildpacks require.
In the end, the Buildpacks project was comfortable choosing TOML because there was solid prior art (even though the format is somewhat obscure). In the cloud native ecosystem, the containerd project uses TOML. Additionally, many language ecosystem tools like Cargo (for Rust) and Poetry (for Python) use TOML to configure application dependencies.
Commencing countdown, engines on
The main disadvantage of TOML is its lack of ubiquity. Tools that parse and query TOML files (something comparable to jq) aren’t readily available, and the format can still be jarring to new users even though it’s fairly simple.
Every trend has to start somewhere, and the Cloud Native Buildpacks project is happy to be one of the projects stepping through the door.