31. Building Docker images with Cloud Native Buildpacks
Hosted by Joe Kutner, with guests Stephen Levine, Emily Casey, Ben Hale, and Terence Lee.
The adoption of containers as a technique to build and deploy applications has made container images the new executable standard of the cloud. But maintenance of one's Dockerfile is a serious shortcoming to this methodology. Adhering to best practices and patching security issues can be difficult to stay on top of. Cloud Native Buildpacks aim to resolve these issues by providing a simpler abstraction for building apps, often without any explicit configuration necessary. Developers from Heroku and Pivotal who have built and used CNBs discuss these benefits and more.
Joe Kutner, an engineer at Heroku, leads the discussion on Cloud Native Buildpacks with Stephen Levine (engineer at Pivotal), Emily Casey (engineer at Pivotal), Ben Hale (steward of the Java Buildpack), and Terence Lee (engineer at Heroku). All of them are involved in overseeing the CNB project, and have used the technology in production at their companies. At their core, CNBs are an OCI-compliant alternative to Dockerfiles, except the container image is built without very much developer interaction. By analyzing source code, a CNB is able to determine the base image to start from, as well as which steps to take to ensure that an application runs correctly. It's a similar logical process to the way Heroku's regular buildpacks operate: a single git push is all that Heroku needs to generate a slug of your application, including fetching any dependencies or managing assets.
A Buildpack provides you with consistency by keeping your dependencies and base image up to date with the latest standards. It also doesn't require the average application developer to be an expert on everything. Rather than needing to understand all of the Unix-y commands necessary to compose an application, or knowing when a particular component has a CVE, a developer's chief concern is to continue building their app as usual, and CNB will handle all of the other components necessary to make a deployable container.
The aim for CNBs was to make development easier for individual developers, as well as those at large enterprises. Terence and Ben in particular know that the value of buildpacks comes from five years of production experience. Regular applications have benefitted from buildpacks, and the ecosystem around them had grown to the point where the teams could take what they'd learned and apply it to container images as well.
The episode concludes with a look towards the future of CNBs. As all of the APIs and tooling are open source, groups can also design CNBs for any language and framework they wish to containerize. The project recently entered beta, and, despite the positive reaction, there are still some changes that need to be made with regard to how Buildpacks are distributed and described. There are processes for community members to be involved, and multiple forums for communicating with the CNB leadership team.
Joe: Welcome. My name is Joe Kutner. I'm an engineer at Heroku. Today, we're going to talk about Cloud Native Buildpacks. With me, we have some special guests. I have Terence Lee who's also in engineering at Heroku. We also have some folks from Pivotal who work with us on the Cloud Native Buildpacks core team. I'm going to turn it over to them to introduce themselves in a moment, and then we'll talk about what Buildpacks are and why we created this project and what it can do for you. Stephen?
Stephen: Hi, I'm Stephen Levine. I run the build program at Pivotal. I'm a software engineer and a product manager.
Emily: My name is Emily Casey. I'm an engineer at Pivotal and I lead the engineering effort on Cloud Native Buildpacks at Pivotal.
Ben: Hi, my name is Ben Hale. I've run Java Strategy for Cloud Foundry and for Pivotal. I'm in charge of the Java Buildpack in the Cloud Foundry ecosystem.
Terence: Hi, I'm Terence Lee. Like Joe mentioned, I work at Heroku and probably my only claim to fame is that I helped co-create the original Buildpacks API.
Stephen: You can think of Buildpacks like an alternative to Dockerfiles. It's sort of a container-native build process that generates an OCI image, but unlike Dockerfiles, it doesn't require very much developer interaction. As a developer, you have your source code and you use Cloud Native Buildpacks to turn that source code into an OCI image automatically, and you can then deploy that through different environments or do whatever you want with it.
Stephen: We're sort of not focused on the deployment aspect of building these images. Cloud Native Buildpacks also provides kind of a nice operator interface for creating build configurations. Because a Buildpack can be more modular with Cloud Native Buildpacks, you can sort of create a build configuration that matches your platform or your users' needs if you operate a platform that a lot of developers use. Compared to the old sort of Buildpack APIs that Cloud Foundry and Heroku have had, we're moving to container standards: we build OCI images, and, kind of uniquely, we don't require a Docker daemon to do builds at all, so we can build in the cloud in unprivileged containers, and we can make reproducible container images because we sort of build container images manually.
Joe: As a developer or engineer who wants to create a Docker image for my application, and I'll open this up to everybody, what are like the number one or number two reasons that I would use Buildpacks over a Dockerfile or some other alternative?
Ben: I think one of the big reasons you might want to use this instead of a Dockerfile is there's quite a lot of boilerplate in a Dockerfile, and not all applications really are serviced by needing to go through that boilerplate every time. Even in cases where you're using base images and stuff like that, simply having the file around, making sure that your application is installed and that all the operating system packages are kept up to date underneath it, can be a really large amount of overhead for an application that is simply sort of a 12-factor application.
Ben: A Buildpack gives you consistency, updateability, and sort of takes those responsibilities away from your average application developer, which is not to say that all applications would benefit from something like this. There are a number that can use Dockerfiles for things that Buildpacks are not particularly well suited for, but we think that there is a large enough number of those kinds of applications that we can improve developer velocity by bringing Buildpacks to the table.
Terence: We hear this internally at Salesforce a bunch and also when talking to kind of the broader communities that people don't actually want to be an expert on everything, and I feel like with the way a lot of things have been going you feel like you have to not only understand how to build your app, but now like you have to be an expert in doing stuff with Docker and containers and kind of all these things when at the end of the day, you just want to … for a lot of people, they just want to deliver features that are valuable to their customers with the end product.
Emily: To me, the biggest advantage is that Buildpacks will help you keep your dependencies and your base image up to date. Creating a Dockerfile accurately the first time is a hurdle most people can clear, but I think very few people regularly go back to their Dockerfile and update their dependencies as often as they should when CVEs come out so knowing that you have a build system that can be pulling in new dependencies for you is a major win for security.
Stephen: Yeah. I think a lot of the Buildpack model's architected around the idea that we want to create a mechanism that can deliver secure dependencies to applications. We have the ability to rebuild images, sort of using layers from images that were built before, in a really efficient way. But we also have the ability to swap out the base image for an image that already exists, that might be running in production, without changing the application layers, sort of just swapping out the ABI-compatible part.
Stephen: There's some sort of unique advantages to the Buildpack model around security that you don't get with the Dockerfile model just based on sort of how the layers are laid out and how the technology works underneath.
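The base-image swap Stephen describes can be pictured with a toy model. This is a minimal sketch under stated assumptions, not the project's implementation: the `Image` class, the `rebase` function, and the layer names are all hypothetical, and a real image stores content-addressed layers in a registry rather than strings in a tuple.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Image:
    """Toy model of an image (hypothetical): base layers first, then app layers."""
    base_layers: Tuple[str, ...]  # operating system layers from the base image
    app_layers: Tuple[str, ...]   # layers contributed by Buildpacks and the app


def rebase(image: Image, new_base_layers: Tuple[str, ...]) -> Image:
    """Swap the base layers and reuse the app layers untouched (no rebuild)."""
    return replace(image, base_layers=new_base_layers)


def all_layers(image: Image) -> Tuple[str, ...]:
    """Flatten to the layer order a runtime would see."""
    return image.base_layers + image.app_layers


old = Image(base_layers=("os-v1",), app_layers=("jdk", "app-jar"))
patched = rebase(old, ("os-v2-patched",))
print(all_layers(patched))  # ('os-v2-patched', 'jdk', 'app-jar')
```

The point of the sketch is that the application layers are carried over unchanged, which is only safe when the new base is compatible with the old one.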
Joe: I'm hearing things like security and auditability and things that I would lump into trust or something like that, which are usually concerns related to an enterprise or organizations that have very stringent concerns with compliance. But I'm also hearing about developer velocity and abstractions that allow developers to handle dependencies better. Is Cloud Native Buildpacks for big enterprises or is it for small individual developers or is it both?
Stephen: I think it's very much both. I think at Pivotal, the perspective we're taking with this sort of collaboration is that we want to use this for a lot of enterprise use cases, but at the same time, we want to make sure that developers at those enterprises are very happy with the sort of experience we provide too. I'm sure that you guys at Heroku have your own perspectives.
Emily: If you have a tool that makes it so developers can like easily satisfy some of these enterprise constraints just because they're using this tool that like handles auditability and security patching by default, then they can have like a more delightful developer experience because the tool itself satisfies the operator constraints.
Ben: Yeah. I think you run into a situation in enterprises especially, or in high-performing developer groups, where there are effectively two situations that need to be taken care of. One is I want a good developer experience, but oftentimes a good developer experience runs afoul of various enterprise requirements, and then vice versa. Enterprises want very strict restraints and constraints on things, and that runs afoul of the developer experience. The Cloud Native Buildpacks project, from its very inception, says that there are two players in this kind of interaction, and always takes into account the idea that you need to satisfy both of them for modern software development today.
Joe: How is a Docker image that's generated from a Buildpack different or the same as one that's generated from a Dockerfile?
Stephen: I think there are two aspects to that. One is the sort of layers that end up in the final image. There's more auditability about where the dependencies that live in those layers come from, because the Buildpack can know exactly what it's installing. It can write that metadata onto the image. But I think another aspect is that those layers are sort of contractually separate from each other and from the operating system layers. That means that we can swap sort of any of those parts out individually without rebuilding the whole image when it's safe.
Stephen: So, just the way the image is structured looks very different than how an image would be structured that's built using a Dockerfile. We get benefits from that sort of change in the structure.
Joe: Would you say it's more transparent compared to what's generated with the Dockerfile?
Stephen: Yeah, the final artifacts are easy to get metadata about what's in the final artifact, to look inside of it and see exactly what was installed. There are definitely benefits around that.
Terence: Yeah. I guess there's structure and standardization that we put on the image, which is what Stephen was talking about, whereas a Dockerfile is kind of a free-for-all where whatever you happen to write in it turns into an image, and that's what you get at the end of the day. We get to take advantage of that structure and standardization to give you these things in the project.
Stephen: Yeah. I think that transparency extends up into the Buildpacks also. So, with Cloud Native Buildpacks, we've really expanded on that multi-Buildpack functionality that Cloud Foundry and Heroku have right now. If you want to, you can make sort of really modular, transparent pieces. The granularity of what you provide in that build process is kind of up to you too, for the Buildpacks you select or create.
Joe: How do these Docker images that are generated from Buildpacks inter-operate or work with the rest of the container ecosystem? I guess specifically I'm thinking about things like Knative, how does it relate to things like Jib, those types of things?
Stephen: Cloud Native Buildpacks build an OCI image at the end. No matter where you use Cloud Native Buildpacks, you always get that same output, and you can deploy that image wherever you like. That could be k8s, that could be Knative, that could be Cloud Foundry or Heroku (I know you guys support Docker there too), wherever you want.
Stephen: Just to clarify. An OCI image is the same thing as a Docker image. It's just sort of the name of the new standard for Docker images.
Ben: I think one of the key takeaways is that the specification for Buildpacks, in addition to sort of defining how exactly applications are built in the environment that they're built in, also defines quite a lot of information about how they're run, and includes components in the end image to guarantee sort of a standardized running environment. That's what gives us this ability to say, "Okay, I'm going to build this image somewhere and it doesn't matter particularly where that is, and then I expect it to run in a very consistent way on all of these different platforms that support OCI images." We want this sort of uniformity, this ability to have portability of the end artifact to run in a bunch of different environments and to run the same way in each one of those environments.
Stephen: The Cloud Native Buildpacks project is really two major software components that we ship. One is the pack CLI, which is the first thing you'd see if you went to Buildpacks.io and started playing with the project. It's really just a CLI for local workstation use. We need Linux containers to do builds of OCI images, especially for languages that aren't Java or Go, where it's hard to cross-compile or even cross-package things.
Stephen: We need Linux containers, and so the pack CLI kind of requires a Docker daemon to run locally, even if you're on Linux, but this means that it can run on Linux or Mac or Windows. We really don't intend people to use that kind of platform implementation, the pack CLI, when they're building in the cloud. Instead, we have a separate component we publish called the lifecycle, which is just a set of binaries that run inside of the container and kind of run the whole Cloud Native Buildpacks process.
Stephen: You can use the lifecycle to do Cloud Native Buildpacks builds on Knative Build, or what's just now called Tekton, or on Concourse, or on any platform that can run containers. That's really the direction we encourage people to go for building stuff in the cloud. We have seen some folks use the pack CLI in CI systems because it's pretty convenient, and so we're thinking about things there right now too. We also have a formal language specification for what the lifecycle does. You could think of the project as consisting of three domain objects. There's the platform, which is like a Cloud Foundry or Heroku or k8s or Knative Build or Concourse, something that's going to do builds.
Stephen: There's the lifecycle, which is like a translation layer between the platform and the last thing, which is a Buildpack or set of Buildpacks that compile applications. We have formal language that defines how the lifecycle interacts with the platform and how it interacts with the Buildpack, so it can provide a consistent interface to Buildpack authors and create a big community around there, but also so that platforms can easily adopt the lifecycle and sort of the Cloud Native Buildpacks process for building applications.
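The three roles Stephen lists (platform, lifecycle, Buildpack) can be sketched as a toy model. This assumes a heavily simplified detect-then-build flow; the `Buildpack` class, `run_lifecycle`, and the layer labels are hypothetical names for illustration, and the real lifecycle is a set of binaries with more phases than shown here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Source = Dict[str, str]  # hypothetical stand-in for an app directory: filename -> contents


@dataclass
class Buildpack:
    """Hypothetical Buildpack: a name plus detect and build callables."""
    name: str
    detect: Callable[[Source], bool]  # does this Buildpack apply to the source?
    build: Callable[[Source], str]    # returns a label for the layer it contributes


def run_lifecycle(source: Source, buildpacks: List[Buildpack]) -> List[str]:
    """Toy lifecycle: ask each Buildpack to detect, then build with those that passed."""
    selected = [bp for bp in buildpacks if bp.detect(source)]
    return [bp.build(source) for bp in selected]


# Two illustrative Buildpacks that key off a well-known project file.
java = Buildpack("java", lambda src: "pom.xml" in src, lambda src: "layer:jdk+maven")
ruby = Buildpack("ruby", lambda src: "Gemfile" in src, lambda src: "layer:ruby+bundler")

# The "platform" is whatever calls the lifecycle with source code and Buildpacks.
layers = run_lifecycle({"pom.xml": "<project/>"}, [ruby, java])
print(layers)  # ['layer:jdk+maven']
```

Because the Buildpacks only talk to the lifecycle, the same `java` or `ruby` object could be handed to any caller that plays the platform role, which is the portability the specification is after.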
Joe: Those components are all part of the Buildpacks.io or Cloud Native Buildpacks project, where are the Buildpacks?
Ben: The Buildpacks themselves are actually owned by individual companies. Heroku has a set of Buildpacks, Pivotal has a set of Buildpacks, and you can create your own Buildpack. A number of these from both Pivotal and Heroku have been open-sourced and are available just through the pack CLI, or you can go download them from GitHub repositories and use them yourselves. We expect that commercially, certainly on the Pivotal side, we'll also have additional Buildpacks that go to Pivotal customers.
Ben: But, again, one of the key things about an open specification like this is that you can write your own internally for your enterprise's use. You can go to someplace on the internet and find one for a language that we haven’t supported or something like that and include them wherever your build lifecycle is.
Terence: Yeah. I think one of the goals of having a centralized project like Cloud Native Buildpacks was that originally the ecosystems of Buildpacks, though not explicitly separate, were not really a singular ecosystem. People that were running Buildpacks for Cloud Foundry generally would not run them on Heroku, and vice versa. One of the goals of this project is that if, say, a vendor company like New Relic writes an APM Buildpack, it could run on basically any platform that abides by the spec that Stephen was talking about, and so it performs the same across any platform.
Emily: One of the other goals of the product is to create a Buildpack API that encourages more modular Buildpacks. In the past, when users wanted to customize a Buildpack for their specific use case, they would generally take a large Buildpack and fork it to modify it, and then maintain that fork over time. But with the new Buildpack API, it should be a lot easier to replace the specific module you want to change with your customization.
Joe: It sounds like the project has some values like security, transparency, interoperating with the container ecosystem, improving developer velocity and having a good developer experience. Are those problems that you set out to solve at the beginning or I guess how did the Buildpack project start or what was its impetus?
Ben: Yeah. Certainly, and Terence can talk to this in a little bit more detail. We started out with sort of this Buildpack project historically, and both Heroku and the Cloud Foundry project used Buildpacks. We had our own separate ones that, as described earlier, were very much separate from one another. But as time went on, we definitely noticed themes about these: that there wasn't enough transparency, that we couldn't interoperate with one another, things like that.
Ben: I think that's the biggest input into how the Cloud Native Buildpacks project works today. When you see those values, almost all of them come out of five-plus years of actual real experience with customers. We know that forking Buildpacks to make changes, to make contributions wasn't enough. We know that there were a number of people who felt that even though there might be some log output in a build without much greater transparency, they were resistant to it, because even though it was giving them value, unless they could account for how that value got there, they were suspicious.
Ben: I think many of the themes that we've been talking about so far aren't just because we think these are great ideas, and certainly, they all are. It's because we actually had customers come to us and tell us, "This is what I want to see out of Buildpacks," and Cloud Native Buildpacks was that sort of inflection point where we could take a look and say, "Hey, this is critical. We want to include this as part of the specification." The buy-in to using Cloud Native Buildpacks guarantees a set of things for you regardless of the platform.
Terence: Yeah. I mean, we just wanted to have a specification, I guess, to start, since the original Buildpacks API was not very much of a specification and a lot of it was not formalized. Buildpacks came out of Heroku when we wanted to basically build a polyglot platform, and we basically extracted the Ruby-specific components away from the rest of the build service and turned that into an API. Like a lot of things when you're a startup and doing things really early, you kind of just do the minimal thing, and we ended up with a very simple API.
Terence: When the two companies came together to talk about a new spec, we realized that even the current specs that we have in production today were different, and we had kind of diverged on what we thought was even in the current Buildpack API.
Stephen: Yeah. I think when we first started talking about potentially collaborating on a solution here, we realized that we really had a lot of the same problems we were facing, right? That we want to provide developers with the experience that doesn't require a lot of expertise and that results in an application that's running in production in the end that has secure dependencies and they don't have to sort of manage themselves as those dependencies go.
Stephen: But, also, we had a lot of similar infrastructure concerns about updating operating system dependencies on our platforms. We had a lot of similar concerns about portability in the future for when you're using Buildpacks to build an image and you want to run that image somewhere that's not Heroku or Cloud Foundry, we … like having both of us, we all want to make sure that you have that option of bringing your app wherever it is that it needs to run.
Joe: Yeah. I think there's a lot of alignment between the interests of the two parties, but at the end of the day, these are two different parties with … I mean, actually, different companies with different business models. Has that been difficult to sort out or do you think that the values are ultimately the same? Are there differences that help?
Ben: Yeah. I don't think it's actually been that hard for us to work between the two teams. In fact, I think the sort of different target audiences and different customer bases that Heroku and Pivotal bring to the table have been just absolutely complementary. There are so many decisions that on the Cloud Foundry and Pivotal side we would have made that would have resulted in a worse outcome. While there is certainly a lot of overlap in the middle, the fact that we care about these two different use cases has led to a much, much stronger specification than either of us would have gotten to on our own.
Joe: The project had a beta release a few months ago in April, what was in that release and what does it signal to the community?
Emily: The beta release of the pack CLI and the lifecycle represents a period of maturity where we feel like the product is ready for Buildpack authors and potential platforms that would like to integrate with Cloud Native Buildpacks to start trying out our tools. Obviously, till we hit our first major release, there might still be breaking changes, but we're at a level of stability that we feel confident with people beginning to integrate and give us feedback on the project.
Terence: In addition to just Buildpack authors and platform developers, I also think it's open to basically general consumption from just normal people who might be interested, who maybe share some of the values that we talked about, like, "I don't want to write a Dockerfile because I want to have that developer velocity." I think it's worthwhile to check out for those people as well.
Terence: I mean, since April, we've also had a kind of second set of releases with a bunch more fixes, so it's definitely not just this release we had in April. It's an active project that we're continuing to work on, and part of announcing the beta is to capture feedback from the broader community and people that are actually using it outside of our two companies, and being able to incorporate that feedback and those changes into the project itself.
Joe: Okay. What's ahead for Buildpacks in the next month or two and maybe in the next year?
Stephen: I think we have a lot of breaking changes coming up. As the project has gone along, we've had cycles of feedback: releasing something out there with APIs that are informed by what our APIs have looked like in the past, but also by the problems we're trying to solve. It's gone through these rounds of feedback over time where we release a version of the API and the lifecycle and the pack CLI, and people test it out, maybe write some dummy Buildpacks against it, or try it out with the Buildpacks that we've been working on on the Cloud Foundry and Heroku sides. We collect feedback on that, collect feedback from the Buildpack authors we have internally too, and then bring in a whole bunch of stuff and release it again.
Stephen: I think we're coming up to one of those cycles pretty soon, where we're going to make some sort of large changes around how Buildpacks are distributed, the communication mechanism between Buildpacks, and also how apps are described with a sort of application project descriptor. After that, we may be approaching a period of longer-term stability. I think we're hitting a lot of the things that people have brought up in the last couple of months. But we'll just have to see where we end up.
Emily: I think it's worth noting that some of those breaking changes you mentioned like around Buildpack distribution and the build plan are mostly going to affect Buildpack authors so from the point of view of end-users building their application using the pack CLI, the pack CLI interface will still continue to work for them and they'll be able to sort of consume these changes using a builder image, which is like a set of Buildpacks coupled with the lifecycle in a way that should insulate them from the breaking changes to a large extent. A lot of the breakages are most relevant for Buildpack authors.
Stephen: That's a good point. We've put a lot of work into sort of insulating app developers from the changes in the project. If you're just looking to build apps using Cloud Native Buildpacks, it may not be a huge change for you. You've got a project descriptor you can use to provide more information, but it's also optional.
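From the app developer's side, the builder image mentioned above is just an argument to the pack CLI. As a rough sketch, the helper below only assembles a `pack build` command line and does not run it; `pack build`, `--path`, and `--builder` are real pack options, while the helper name, the image name `my-app`, and the builder name `example/builder` are placeholders invented for illustration.

```python
import shlex
from typing import List, Optional


def pack_build_command(image: str, source_dir: str = ".",
                       builder: Optional[str] = None) -> List[str]:
    """Assemble (but do not run) a `pack build` invocation as an argv list."""
    argv = ["pack", "build", image, "--path", source_dir]
    if builder is not None:
        # The builder image bundles a set of Buildpacks with the lifecycle.
        argv += ["--builder", builder]
    return argv


# "my-app" and "example/builder" are placeholder names, not real images.
cmd = pack_build_command("my-app", builder="example/builder")
print(shlex.join(cmd))  # pack build my-app --path . --builder example/builder
```

Running the printed command requires the pack CLI and a local Docker daemon, as described earlier in the conversation.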
Terence: Yeah. I think some of the distribution stuff is not a breaking change but allows probably more power, flexibility, and control around the buildpackage stuff, where distribution allows you to package a set of Buildpacks together. We've talked a lot about modularity being an important part of the new Buildpacks spec and how we can break Buildpacks down, but we've been without a way to group all those Buildpacks together.
Terence: I mean, it's a little cumbersome today to say, "I want this set of Buildpacks that used to be one Buildpack but is now broken up into a bunch of Buildpacks," and to collectively, as a user, say, "I need to use that thing." I think the distribution spec will help alleviate some of those pain points for customers today. It's stuff that you don't necessarily have to change now, but I think it will make things easier for app developers going forward.
Stephen: Yeah. It makes sense. The interface for Buildpack developers and the distribution spec does kind of break a little bit because the Buildpack descriptor ... that format gets different so if you have the Buildpack now, you will have to make some changes to it to get it to work.
Ben: I think one of the big takeaways should be to remember that if you have created a running application image then that is not going to change. You're not actually going to have any breaking changes in there. What you have will be stable over time and as we evolve the Buildpack specification, what we use to create that image is what ends up changing.
Emily: Some upcoming features that might be of interest to application developers in the pack CLI include more ways to inspect the image that's generated and get different types of metadata, so it's going to make it really easy to print out a bill of materials for your image, and also get finer-grained metadata about specific components that are being added.
Stephen: We also have scratch images coming up in the lifecycle pretty soon too. A Buildpack that supports this would be able to build a Go app or maybe a Java app that has very minimal dependencies, where it's just the application bits and no operating system bits around it. We essentially build on a sort of real Ubuntu Bionic (or whatever) based image, and then generate the OCI image in the end so that it doesn't have those operating system dependencies, for instance when your Go binary is totally statically linked. There's an active PR for that right now, but I think it's going to need some discussion before it can get merged in.
Joe: Speaking of PRs, pull requests, RFCs, for folks who are interested in following along with the progress of these changes and our releases and just our roadmap, in general, what avenues do people have for watching, getting involved, those kinds of things?
Stephen: The primary point of discussion that we really try to point people towards is the RFCs repo. I know you guys at Heroku were really big on RFCs when we started the project, and as a contribution model I think it has gone very well. If you're a contributor to the project, because it's changing rapidly, you might not want to just dive in and start writing code. You might want to propose something, discuss it with us, and figure out what the right sort of avenue forward is.
Stephen: We have this great RFCs repo where you can kind of propose whatever you want and start a discussion there and we have a process on getting changes approved that sort of follows the CNCF sort of neutrality rules and kind of makes sure that all contributions are treated equally.
Terence: Yeah. We sunset our original roadmap repo in light of the RFC stuff. We kind of moved a bunch of that work over.
Stephen: But you can also get in touch with us on Slack. We're very responsive. We really like engaging with the community so please, come say hi.
Terence: Yeah. It's Slack.Buildpacks.io. From there, you can join our Slack. We also have a mailing list provided through the CNCF.
Emily: Or, of course, interact with us on any of our GitHub repos.
Ben: For those that just want to try Buildpacks, I think Buildpacks.io has a documentation page where you can learn how to use the pack CLI and also how to author a Buildpack as well.
Joe: Cool. Well, I'd like to thank our friends from Pivotal for joining us here today and I'd like to thank you, the listeners, for listening to this podcast. Everything that we do in this project is driven by the feedback that we get from people who try the project, the users who build Buildpacks and author Buildpacks. We encourage you to take a look at Buildpacks.io and give Buildpacks a try. Thank you.
A podcast brought to you by the developer advocate team at Heroku, exploring code, technology, tools, tips, and the life of the developer.
Java Language Owner, Heroku
Joe is the Java Language Owner at Heroku. He is the author of the Healthy Programmer, and a co-founder of buildpacks.io.
Product Manager & Staff Software Engineer, Pivotal
Stephen Levine is a project owner for CNB & project lead for Cloud Foundry Buildpacks. He is an avid Gopher, & runs the Build Program @ Pivotal R&D.
Engineering Manager & Staff Software Engineer, Pivotal
Emily Casey is a maintainer of the Platform and Implementation Cloud Native Buildpacks project subteams. She leads a team of contributors at Pivotal.
Lead, Cloud Foundry Java Experience, Pivotal
Ben leads Java applications on Cloud Foundry. He's been a core Spring portfolio committer since 2006 and a core Cloud Foundry committer since 2013.
Ruby Task Force Member, Heroku
Terence co-created buildpacks in 2011. He's organized conferences such as Waza and Keep Ruby Weird. In OSS, you'll see his work in Bundler & Ruby.