3. Spreading the Database Love
Hosted by Jonan Scheffler, with guest Brendon Murphy.
Spring has come, and chocolate bunnies aren't the only delights worth cracking open. Your database is filled with all sorts of data that's useful to your organization and your customers. But what's the best way to get insight into that information? You could connect directly to production and run queries, but one wrong command and your database could lock up, or worse, result in data loss. And what if non-engineering teams want to run their own analytics, or if external customers are asking for results pertaining to their subscribers?
There are ways to lower the barriers of access to your database. Rather than write one-off queries for your Sales or Marketing teams--likely requiring test suites and redeploys--you can construct safer SQL queries through a web UI. That data can also be shared through an API, so you can hook it up to a Slack bot, or even generate a unique password-protected URL, to share with people outside your company.
Brendon Murphy, CTO of Kajabi, talks about his company's experience with Dataclips. Instead of requiring developers to connect to their database, everyone in the company is able to generate analytics on-the-fly, and they even democratize the information via a Slackbot. Their marketing team is able to get real-time feedback on their campaigns through Lita.io.
The advantage of using Dataclips dovetails with their preference for using Heroku in general. While they could build their own wrapper to communicate with Postgres, or even manage their own infrastructure, they've found that the financial and operational costs are simply not worth it. By offloading this vital work, they free themselves up to focus on building features for their users.
Brendon concludes by talking about Postgres, and why it's the right choice for his team. It can act as a NoSQL document store via its JSONB data type, serve as a key-value store obviating the need for Redis, and comes with strict type assurances reducing the need for checks in the software layer. He also mentions add-ons provided by Heroku, such as PgBouncer, along with his feature request for how Heroku can better serve Kajabi's large data needs.
Jonan Scheffler: Hello, and welcome back to Code[ish]. My name is Jonan Scheffler, and I'm a developer advocate here at Heroku, and I am joined today by Brendon Murphy from Kajabi. Tell us about yourself, Brendon.
Brendon Murphy: Yeah, first off, thanks for having us on. We're really excited for this opportunity. I am the CTO at Kajabi. I've been with the company about nine years since its inception in 2010. Kenny and I, Kenny's one of the original founders of Kajabi and he's now the CEO, go way back. We were actually roommates in college. We'd always wanted to work on something and then around 2010 he started calling me and saying, "Hey, I've got the opportunity we can work on now." For us being able to put together some stacks we really liked, those two primary ones being Heroku and Rails to start a company was really exciting.
Jonan Scheffler: That was pretty early on in Rails land, I think, still, right?
Brendon Murphy: It was, yeah. I think at the time, I think maybe Rails 3 was just first coming out. The Asset Pipeline didn't exist, and Heroku did not have an implementation for Bundler. You would deal with gems another way. We've gotten to see an awesome evolution on the Ruby platform side on Heroku as well.
Jonan Scheffler: You mentioned that your friend Kenny is the CEO, Kenny Reeder, is his name?
Brendon Murphy: Yeah.
Jonan Scheffler: What is it that you do there?
Brendon Murphy: I'm the CTO, so I'm sitting over the tech side of the house primarily, where we focus on the software itself that's running on Heroku, making sure that we've got good schemas, making sure the team's well staffed and equipped for what they need to do from a headcount perspective, as well as knowing that the tech we've got in the background, what we have on Heroku, what we have on Amazon S3, is doing what we need it to do. Just overseeing the general management of the tech teams.
Jonan Scheffler: How many teams are there now? How many technical people do you have?
Brendon Murphy: That's a great question, we're actually growing right now. We're still a pretty small team, but we're growing aggressively. Probably by the time this airs, I think the dev side of the house will be up to maybe like 17 people.
Jonan Scheffler: Wow, all right.
Brendon Murphy: Yeah, and it's been great. I think that's an aspect of us that tees off of Heroku really well. I think the way our team works and for our needs and the speed we like to work, it's one of the reasons we stuck with Heroku for sure.
Jonan Scheffler: So tell me about Kajabi, what does Kajabi do?
Brendon Murphy: I think the easiest way to describe it is it's ... Well, I think most of your customers are familiar with Shopify, so I always use that as an example. You can think of it as Shopify, but for digital products. If we were to give more of the company elevator pitch, we would say it's an all-in-one business platform for delivering digital products online. What the majority of our customers are doing is putting together video courses. We have a customer, for instance, that teaches tennis classes online. His video team might go out to the tennis course, or tennis court rather, and film a video of him talking about how to improve your serve, and maybe he breaks it down into beginner, intermediate, and advanced courses. Then he'll just use our platform to upload that online and make it real easy for his students to consume.
Brendon Murphy: We have native integrations with both Stripe and PayPal, so if you want to get off the ground running quick and start taking credit card payments, for instance, you've already got all your content, all you have to do is upload it and then connect with a Stripe Connect account and you're good to go and start charging money online for your courses.
Jonan Scheffler: I'm a developer advocate at Heroku, turns out I know rather a lot about this product. Let's say, hypothetically, I want to start selling courses on how to use Heroku? I'm sure my employer would be totally fine with this.
Brendon Murphy: Yeah.
Jonan Scheffler: Yeah, so I put together some courses, I make some videos and I decide what I'm going to charge for those things. Is there like pricing that is enforced by the platform or?
Brendon Murphy: No, and that's a great question. I think that's one direction we decided to go that's a bit distinguishing from our competitors. We actually experimented with that at one point with the platform we built mid-stage in the company. What we found was that our customers really like to retain their own branding and their own pricing strategies and things like that. While there might be some competitors out there who would say you're only allowed to charge between like $50 and $100 for your product or something like that, we're outside of that game. We let our customers charge what they want. It obviously has to be within reason, because both Stripe and PayPal, as well as other card providers, have minimum and maximum prices.
Jonan Scheffler: Oh, I'm curious actually, I've never heard about this. What is the maximum price? You think I could charge someone $10,000 in one go?
Brendon Murphy: That's an interesting pick because last I read the document a few years ago, I think 10k was the max on Stripe. That seems a bit high to me. We do have a lot of customers who are combining their training with some real-life coaching, and maybe it gets you a ticket to an event as well, like you go for the weekend with VIP customers or whatever and talk about the things you've learned in person. They might be charging around like $2,000, and that might be on the high side. But, yeah, I think in theory, I think Stripe and PayPal ... I'm not as positive about PayPal but-
Jonan Scheffler: That's interesting to me.
Brendon Murphy: Yeah.
Jonan Scheffler: Okay, these are [crosstalk 00:06:01]-
Brendon Murphy: I wouldn't advise that though.
Jonan Scheffler: Yeah, no, I will not set up my course and charge $10,000 for it. But this is the stuff you pick up when you're working in the space that you are that I wouldn't otherwise know.
Brendon Murphy: Absolutely, yeah.
Jonan Scheffler: Tell me a little bit about how you're using Heroku? You'd mentioned that y'all make extensive use of data clips, and I'm curious about that.
Brendon Murphy: Yeah, we do and I think it's something that's evolved a little over time, both because data clips has evolved and we've evolved as a company. I know when we first started using it, we were still a small team, but the instant appeal to us was business analytics queries that we wanted internally. We do have some of our data in external business intelligence systems if we want to query, but it's not like the full data center or anything like that. We haven't found that need yet. We've been able to do SQL queries for a lot of that. It really served to bridge the gap at that point when we wanted to give access to different team members. If we wanted the sales team at the time or like the marketing guys to have access to that, it was a really easy way to grant that and to write the query and to not have to build it in the app.
Brendon Murphy: Prior to us using data clips, I think what we did a lot was we would have to write code that would go in and do the SQL query for us. I love Active Record in Rails because I think it's good most of the time. But a lot of times when you're just wanting to do a business query to get over to your marketing and your sales team, you don't want to sit down and write like a robust suite of tests and necessarily have to go through Active Record and have to do a deploy for your internal customer to have access to that data. That'll make sense when you're developing software, you want that reliability there and a bit of those constraints and get in all of that behind it. But when the marketing team's like, "Hey, we just want to see a daily count of how many signups occurred yesterday," it didn't make sense to have to jump through all those hoops. It was much faster for us just to say, "Hey, we're going to drop a data clip and then we'll share the link with you so you can have access to it."
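To make the request concrete, a query for yesterday's signups might look roughly like the sketch below. The `users` table and `created_at` column are assumptions for illustration and are not taken from Kajabi's schema. The SQL is wrapped in a Ruby heredoc only so it can sit in a script; the SQL text itself is what would be pasted into the data clip editor.

```ruby
# Hypothetical schema: a `users` table with a `created_at` timestamp column.
# The SQL inside the heredoc is the part that would go into a data clip.
DAILY_SIGNUPS_SQL = <<~SQL
  SELECT date_trunc('day', created_at) AS signup_day,
         count(*)                      AS signups
  FROM users
  WHERE created_at >= date_trunc('day', now()) - interval '1 day'
    AND created_at <  date_trunc('day', now())
  GROUP BY signup_day
  ORDER BY signup_day;
SQL

puts DAILY_SIGNUPS_SQL
```

Changing the query when marketing comes back with "only active customers" would then mean editing the SQL in the clip rather than shipping application code.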
Brendon Murphy: I think the other advantage to using something like the data clip system in doing that, is I found that those queries tend to churn a lot. Like someone from marketing might say, "We want this query," and then we'll give it to him and they'll come back and say, "Hey, it doesn't look exactly how I wanted," and we might talk back and forth some more. It turns out, "Oh, they only wanted active customers. We didn't really communicate that up front." If that had to go through a full deploy cycle and test cycle and review cycle and all that to get out, it would be painful and unnecessary.
Jonan Scheffler: Data clips, I'm not sure if I actually explained this, data clips is actually a product that allows you to take a SQL query and you put it into this web UI and you get your result set back. It's very valuable to the CEO who is curious about how many users in New York purchase the falafel at our retail locations or whatever it is that you want to find out in any given moment. The value to a company is that a lot of times these requests end up being actual code in your application that you've got to deploy, or else you've got to have some developer working on like an admin interface that allows people access to this data and lets people dig through user data to identify trends and whatever market research they're trying to do. But when you are using data clips internally, you write a query and you send it off and then you're good.
Brendon Murphy: Yeah, and definitely these days. We started out consuming it as a small team, so it didn't really benefit us quite as much back then. But today I'd say we're probably giving queries to the marketing team or direct marketing team the majority of days per week; I think a data clip question will come up. The other area we use it a lot, because it does have this idea of privacy baked in, and there's something like a password that really makes the URL secret, is we do have some customers who either got in a bind before, where they really need access to data and the app just doesn't provide it yet. It's on the near horizon but they really could use it for a launch and we don't want to leave them in that bind, or some VIP customers who really want that deep insight into data. We'll go and write a data clip specifically for their product and share it with them.
Jonan Scheffler: Oh, nice.
Brendon Murphy: Yeah, and it's really cool because we're talking about a connection, we'll put it on a follower database. We control the performance params of the query, it's on our follower, it's read only. I think in the current version they can still see the properties of the query, but they're not going to be able to go in and change it and say, "Tell it to return this column that we left out of the original clip."
Jonan Scheffler: That value of being on the follower database, too, I mean, if you're putting it out there in your production application, I guess you'll be using followers naturally like a leader follower configuration. But it's nice to know that nothing that anyone is doing in data clips will ever impact your production application.
Brendon Murphy: Yeah, absolutely. That's a big tip to leave people with. I forget where we learned that originally, it was probably some blog post out there. But there's definitely been a few times since then, I think, where someone got a little daring with a data clip and tried to basically eat up memory unintentionally, and we'll get an email from Heroku that's like, "Oh, you're almost out of memory or whatever." But, hey, it's on our spare follower. I definitely recommend people do that and not run it against their primary. [crosstalk 00:11:45]-
Jonan Scheffler: I love those emails from Heroku that I get that say things like, "Hey, your database was horribly broken and your production would have gone offline, but we fixed it magically. I hope you slept well."
Brendon Murphy: Yeah.
Jonan Scheffler: It's one of my favorite features.
Brendon Murphy: Yeah, absolutely.
Jonan Scheffler: You were talking about data clips in the context of using it for business intelligence, but you also mentioned earlier to me that you were using it for your Slack Bot, which I thought was interesting because that's not a common use case. I don't hear it that much.
Brendon Murphy: No, and I think I'm clever that we came up with this somewhat hacky idea on our side, but it's actually working really well. For some background on this, we actually, I think, started toying with Slack Bots maybe three or four years ago. I think we were using the Hubot framework and got into it a little. But it didn't really stick, like we had a shopping list or something like that, but nothing that was really deep for the company. And then I think we wrote another one later in Ruby using Slack or some low-level Slack stuff. Then this latest iteration we're using the Lita.io framework, which if any of your listeners out there don't have a Slack Bot, but they do use Ruby, I'd highly recommend you check out Lita.io, it just-
Jonan Scheffler: Spell Lita for me.
Brendon Murphy: Yeah, it's L-I-T, as in Tom, A.
Jonan Scheffler: Okay.
Brendon Murphy: So, Lita.io. It's awesome and it makes it a little more abstract, at a higher level. Now, one of the things we got really excited about on our side, once we saw the excitement that people were having around the bot in general out there: the marketing team, when they started doing more campaigns, wanted some more real-time feedback into how it was doing. Now in the past, we would always use a data clip to do that, right, we'd give it to the marketing team. They could share it amongst themselves and then log into the web page and see it. What we started moving towards was providing some handlers in the Lita bot that could answer those questions for specific campaigns. We could give them the data before, but this was a little cooler because people were doing it in public Slack channels. There's definitely a difference between two users who maybe are sitting a few desks apart, and they pull down that marketing stat that says how much success you had yesterday, versus somebody doing it in a public Slack channel, because that really helps to build excitement around it.
Jonan Scheffler: Right, and you're disseminating information passively then, someone happens across it, they learn that the Slack Bot is capable of doing this and they're doing their own stuff.
Brendon Murphy: Yeah, so the cool part about really that marketing style data was at first when I started needing it, I thought, "Okay, well, this is going to be a little annoying because I'm going to have to build out an API layer somewhere, either on a third application and then expose our database to that application, or we can just build it in our primary application, and then we have to deploy." None of that really sounded exciting to me. Like if we're rolling quick and it's Friday and we're down to the wire launching a new marketing campaign, I don't want to have to be like, "Okay, cool, you can talk to the bot but it's going to come out in a deploy like two and a half hours from now or an hour from now," and have to go through that cycle. What we realized was since we write all this data in data clips, let's actually just keep doing that, and then what we do on the back end is we just have maybe 20 lines of Ruby wrapper script. I think we use a lightweight REST client to call data clips basically as our database.
Jonan Scheffler: Instead of going straight to your production database, you're going through data clips through the API?
Brendon Murphy: Yeah, what we do in the application is we have a map that says if we want to get data about partner sales for yesterday, like the summary layer, here's the Heroku data clip URL. Then the client on our side, we just tell it, "Hey, go fetch that URL." I think it's a little lower-level client, so it needs to fetch the CSV redirect, it's authenticated to S3. It grabs that and actually I said CSV, but actually we use JSON, it's a little easier.
Jonan Scheffler: Oh, that's right, yeah, of course.
Brendon Murphy: Yeah. We grab the JSON that you guys have stuffed for us, I think it's on S3 for that, and we just parse it out a little and wrap it in some lightweight data structures on our side. But it's awesome. There was one night, I think it was a Thursday, and during that day we started a new marketing campaign. I think we deployed it that day, and then that night I was remembering, "Oh, hey, this went live tonight and we don't have a way for the marketing team to see it and get excited about it." I think I just spent maybe 10 minutes, added an endpoint to the Slack Bot that can answer that question, then wrote the data clip on Heroku. So within 10 minutes, I was able to update our Slack Bot and deploy it because it's got a really small test suite. It's only for internal use, it's not for production costs, right.
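The wrapper Brendon describes could be sketched along the following lines. This is not Kajabi's code: the URL is a placeholder, the names `CLIP_URLS`, `fetch_body`, `rows_from_payload`, and `clip_rows` are made up for illustration, and the payload shape (a `fields` array of column names plus a `values` array of rows) is an assumption about what the data clip JSON looks like. It uses Ruby's standard `net/http` rather than a REST client gem so that it stays self-contained.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical map from a question name to a data clip JSON URL.
# The URL below is a placeholder, not a real clip.
CLIP_URLS = {
  partner_sales_yesterday: "https://example.com/dataclips/placeholder.json"
}.freeze

# Fetch a URL and follow a limited number of redirects, since the
# conversation mentions the client has to follow a redirect to S3.
def fetch_body(url, redirects_left = 3)
  response = Net::HTTP.get_response(URI(url))
  case response
  when Net::HTTPSuccess
    response.body
  when Net::HTTPRedirection
    raise "too many redirects" if redirects_left.zero?
    fetch_body(response["location"], redirects_left - 1)
  else
    raise "unexpected response: #{response.code}"
  end
end

# Assumed payload shape: {"fields": [...], "values": [[...], ...]}.
# Each row becomes a hash keyed by column name.
def rows_from_payload(json_text)
  payload = JSON.parse(json_text)
  fields = payload.fetch("fields")
  payload.fetch("values").map { |row| fields.zip(row).to_h }
end

# Look up a clip by name and return its rows as an array of hashes.
def clip_rows(name)
  rows_from_payload(fetch_body(CLIP_URLS.fetch(name)))
end

# Offline demonstration with a hand-written sample payload.
sample = '{"fields": ["partner", "sales"], "values": [["acme", 3], ["globex", 5]]}'
rows_from_payload(sample).each { |row| puts "#{row["partner"]}: #{row["sales"]}" }
```

A bot handler would then call something like `clip_rows(:partner_sales_yesterday)` and format the rows for the channel, so adding a new answer means adding one clip and one map entry.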
Jonan Scheffler: Right, yeah, so you just shipped to the bot and then it worked?
Brendon Murphy: Yeah, and it was super satisfying. I didn't have to go and update an API or make sure we could answer all the API questions that we needed for that endpoint. It was just like let's write a data clip, let's get this running, and that was really satisfying to realize that we had done in five to 10 minutes what might otherwise take us a little more time, which is really not necessary for that thing. You want to be much more careful and methodical when you're talking about working on the software that is powering your core platform, but when it comes to being fast and furious about a marketing campaign, you don't need those same levels of safety, even though I think we've got enough safety in data clips with the privacy and the version history and stuff like that.
Jonan Scheffler: Absolutely, and those are things that I think looking at data clips, it's a pretty simple product, right?
Brendon Murphy: Yeah.
Jonan Scheffler: We as developers, we look at that thing and we're like, "Oh, I could build that thing." But then shipping it is not necessarily the hard part, you're maintaining it and you're adding those privacy features and those security features. Now we want it to present JSON and those are all things that take time. My case to people has always been that like, "Yeah, you can build your own data clips, you can build your own Heroku from scratch, but you are uniquely qualified to be building features for your users. You understand your database, you understand your users, you can offload the construction of data clips and every other feature for Heroku to us, and then free yourself up to be doing ..." I actually think that it's irresponsible behavior as developers to be spending our time on those kinds of things. I love it, to be clear, it's super fun for me to build those tools. But if I have the option to buy someone else's tool and use theirs and then spend my time building actual features, I feel like I have an obligation.
Jonan Scheffler: I used to work on the tools team and every time I deployed an app I got an email from ... There's like a production checklist you go through before you're able to launch the app into production, and I'd get an email from the red team and they're like, "Hey, we just pulled apart your Swiss cheese app. Nice security, how about you fix all these holes and then we'll ship it."
Brendon Murphy: Yeah, and we could do that and we could maybe ... Maybe there are a few extra things we could get by going to something like Metabase or some other product, but I still don't feel like we're at that point in this specific area of need. It's not worth it, we're trading like some minute gains for lots of operational and, especially, security complexity around that.
Jonan Scheffler: Yeah, building it is not necessarily the expensive part but securing it, I think the expensive part might be when you leak your user accounts and get sued and you find-
Brendon Murphy: Yeah, well, and you know the question is when does securing it stop? The answer to that is never.
Jonan Scheffler: Never ever.
Brendon Murphy: As soon as you sign up for that, you're now perpetually signed up for that.
Jonan Scheffler: From that perspective then, let's talk about some other features of Heroku? You are using followers you talked about and how ... You mentioned to me that you had had some problems storing too much data in your database or running backups against primaries and locked tables. There were some difficulties around using Postgres, generally, or ... First of all, why are you using Postgres? Why did you choose a Postgres database?
Brendon Murphy: Well, honestly, part of it I think is history. At the time we went with Postgres on Heroku, it was the offering but I think the other thing is, and I think this is borne out, is they've really proved to be I think the best Open Source database team. I think at the time we started, there was maybe a little more argument around that between MySQL and Postgres.
Jonan Scheffler: That's fair.
Brendon Murphy: Yeah, even pre-2010, I think there was some rumblings around some of the MySQL licensing issues and things like that. But in 2010, it had the sheen of something that was also going to go places so that was more of a gut feeling on our part, but I think that certainly bore out.
Jonan Scheffler: It worked out pretty well for you, Postgres is a one true database in my opinion.
Brendon Murphy: Yes, how many articles have you seen in the past like four years that have the line somewhere in them, "Tell me why you need Mongo anymore?" Because Postgres has this. We actually early on started using some of the cool features, and I know MySQL and other databases have their own implementations of this, but Postgres' implementation of this has turned out rock solid. But I think in many of our applications we're using hstore, we're using different JSON data types.
Jonan Scheffler: If you're using those kinds of, if you need that document store, then why not use the Postgres JSONB columns? You could do your Map-Reduce queries against this binary JSON format that you're storing in there.
Brendon Murphy: Absolutely.
Jonan Scheffler: In my opinion, it entirely supplants the use of a thing like MongoDB.
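For readers who have not queried a JSONB column, the sketch below shows the general shape. The `products` table, its `settings` column, and the keys inside it are hypothetical. As before, the heredoc is only a carrier for the SQL; `@>` (containment) and `->>` (extract a field as text) are standard Postgres JSONB operators.

```ruby
# Hypothetical table `products` with a JSONB column named `settings`.
JSONB_QUERY_SQL = <<~SQL
  -- @> tests JSONB containment; ->> extracts a field as text.
  SELECT id,
         settings->>'theme' AS theme
  FROM products
  WHERE settings @> '{"published": true}';
SQL

puts JSONB_QUERY_SQL
```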
Brendon Murphy: Absolutely, and here's the other thing I think, and maybe this isn't so Postgres specific as it is relational database specific, but if I can get that ACID performance of the database, especially for the use of something like transactions to get us that reliability out of our database, and we can have performance and do that, I love doing that. Any time we have to break out of SQL transactions and manage that in an application layer or a distributed layer, it's possible but I've seen it. It just never does as well as if you can handle that in the transaction of Postgres.
Jonan Scheffler: I am a huge fan of Postgres generally, I think you made the right choice. But 2010, you're right, was a pretty lucky time to make that decision. Because I don't think it was clear at the time that Postgres was going to win. But if I have need for something like Redis, I can use Postgres for that too. I can just turn on hstore and it's actually a faster version of Redis. I just wish that I could convince more people to start using Postgres and quit mucking about with all these other data products, because of the number of times that I hear from people that "we are outgrowing our NoSQL solution, we've gotten to a point where things are too complicated and we need out." It's just enforcing, I think, at the software layer, your relationships. A lot of people say that these are like schemaless, which is not actually true. It's just enforced at the software level, they do it in the ORM, instead of in the database itself.
Jonan Scheffler: I like it when I try to write a nil value to an email column in my database and my database actually throws up. I want Postgres to say we do not put nils here. But that value is not offered by something like a large document, so you've got to do that with the software layer and things can get pretty complicated and spaghetti-like as you get bigger.
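The database-level check Jonan is describing is a NOT NULL constraint. A minimal sketch, assuming a hypothetical `users` table, is below; the heredoc again just carries the SQL.

```ruby
# Hypothetical `users` table whose email column may never be NULL.
USERS_DDL_SQL = <<~SQL
  CREATE TABLE users (
    id    bigserial PRIMARY KEY,
    email text NOT NULL
  );

  -- Postgres rejects this insert with a not-null constraint violation,
  -- so the application never gets to store a user without an email.
  INSERT INTO users (email) VALUES (NULL);
SQL

puts USERS_DDL_SQL
```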
Brendon Murphy: Absolutely, I think too to extend the answer beyond just the technical abilities of Postgres, you asked why we chose Postgres? It definitely wasn't chosen in a vacuum, it was chosen because we knew that Heroku would be operating it and have an operational team behind it and on call teams. What was really intriguing about the value add of Postgres was the operational capabilities that Heroku has behind it.
Jonan Scheffler: Those emails that I get in the middle of the night or I wake up in the morning and I've got an email that's like, "Hey, your database failed, but it was no problem. We promoted a follower and you had no downtime. You're welcome."
Brendon Murphy: That's a great point and I think it's something that sometimes gets missed when discussions about, let's call it total cost of ownership, of running a database on Heroku come up, and sometimes I wonder if maybe that's due to background and experience. Before I came to Kajabi, I was working on a Rails app and I was a programmer, but I'd say I was a bit more of a DevOps type, even though DevOps didn't exist really at the time as a term. Let's say I was a systems engineer, so I was maintaining hundreds of DNS and SMPP servers, MySQL servers for different internal and external needs at a large Tier 1 ISP. That was my first real technical job and it was the first time I carried an on-call, originally a pager and then a cell phone, for work.
Jonan Scheffler: Oh, that was so fun for the first day. Didn't you feel important?
Brendon Murphy: Yeah.
Jonan Scheffler: The first eight hours you're like, "Wow, this is great."
Brendon Murphy: Yeah, you thought you were so important, but then you realize that your team's also responsible for the power in your local data center, and the data center goes down and then the power goes down, and suddenly none of your servers work and the battery backup doesn't work. You have to go turn off like 200 servers-
Jonan Scheffler: It's your fault.
Brendon Murphy: ... when it's raining and it's lightning at 7:00 p.m. or something like that. I'd had a lot of experience before with the pain points of failing hardware and failing platforms and things like that. It's not what interested me, what interested me was working on the software and casting all that other stuff aside. Leaving that, I mean, yes, it's still our application, we still take responsibility and dive in on that but we have that peace of mind that we've got an ops team behind the problem that's bigger than ourselves.
Jonan Scheffler: Right, and you don't have to worry about having a team of your own. I think that total cost of ownership is the piece that's hardest for me to convey when I talk to people: that building it is not the expensive part, you could probably cobble something together. You're going to put together a deployment pipeline when you're starting your company. But that's going to take you a lot of time. If I think about like if you and I are sitting here today, we come up with a great startup idea. Let's say we ship our MVP in two weeks. Probably a week of that, under normal circumstances, is us setting up servers and IAM profiles and getting everything organized for our deployment pipeline. Being able to skip that is a huge advantage, I think, for companies [crosstalk 00:26:29]-
Brendon Murphy: Absolutely, and I think too, I mean, our Heroku bill, I'm assuming is probably one of the ... It's towards the high tier on Heroku, we're probably pushing it a little more than the average Heroku customer. But even at that high end for like a Heroku bill or something, it still wouldn't really be enough for us to go out and hire maybe two or three experienced DevOps people.
Jonan Scheffler: Which is what it would take to build and maintain your own infrastructure easily.
Brendon Murphy: Absolutely, you can't put two people on call for a real infrastructure, you're going to at least need to have three, and our bill right now doesn't even cover three people. It would probably barely cover two people.
Jonan Scheffler: You could put two people on call, you just wouldn't have them at the company very long.
Brendon Murphy: If you want to build churn into your organization.
Jonan Scheffler: Exactly, yeah. You're going for churn. Tell me about these problems that you had with Postgres?
Brendon Murphy: Yeah, so I think you had mentioned the one about backups. I can't remember if this is in the Heroku docs somewhere, and it's one of those things when you hear it, in hindsight, you think, "Well, gosh, that's obvious. That was silly on our part." But we've been doing backups on all our apps for a very long time against the primary data store. Yeah, and the primary data store for our newest and most successful application is very, very, very large by Heroku Postgres standards. It was taking hours to complete, so we still have full protection from those databases, it wasn't necessarily breaking the app or anything like that.
Jonan Scheffler: Sure, it just slows down your queries.
Brendon Murphy: Yeah, well until one morning when we did a deploy and there was, I think, a lock that the backup had taken in order to do its backup effectively. We had, I think, an index add, and we always add our indexes concurrently because that's safe. But adding this new index in conjunction with the lock that the backup had, in conjunction with us doing a deploy, created this perfect storm. It really slowed our app down for maybe 15 minutes.
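"Adding indexes concurrently" refers to Postgres's `CREATE INDEX CONCURRENTLY`. A small sketch with a hypothetical table and index name follows; the SQL is carried in a Ruby heredoc only for consistency with a Ruby codebase. The comments state documented Postgres behavior in general terms and are not a diagnosis of the incident described here.

```ruby
# Hypothetical index on a hypothetical `users.email` column.
CONCURRENT_INDEX_SQL = <<~SQL
  -- Builds the index without taking a lock that blocks writes to the table.
  -- It cannot run inside a transaction block, and it still has to wait for
  -- transactions that were already open before it can finish.
  CREATE INDEX CONCURRENTLY index_users_on_email ON users (email);
SQL

puts CONCURRENT_INDEX_SQL
```

That waiting behavior is one reason a long-running job on the same database, such as a multi-hour backup, can interact badly with a schema change during a deploy.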
Jonan Scheffler: Wow.
Brendon Murphy: But the solution was easy, we wrote Heroku and were like, "What's going on?" I think you all got back to us and said, "Well, you've got backups running on the primary and they went to war with each other a little." It's awesome that the solution is easy, we just don't take a backup off that, we take it off of a follower, which we already have in place. If we didn't have one in place, we would have gotten it in place that day and cut the backups over to run from that as well.
Jonan Scheffler: That was a pretty quick fix to find, you just emailed support and they got back to you. They just said-
Brendon Murphy: Yeah, I think we've got an SRE name brand out in Florida that handles a lot of this and that was a pretty fast return time on that one, yeah.
Jonan Scheffler: Nice, so why would you use something like Postgres on Heroku instead of RDS? I think we've covered a little bit of this here but I want to ask the question.
Brendon Murphy: Well, I think, one thing to keep in mind is I'm not intimately familiar with RDS these days, my knowledge is a bit antiquated, so there might be some ops people out there that say you're making this sound harder than it is. But from my perspective, there's a lot of tooling that Heroku ... When you say Postgres, I think you really have to say Heroku Postgres because it's all the backup tooling around it, it's the point-in-time backups.
Jonan Scheffler: You're right, it's not just the database, it's something more.
Brendon Murphy: Exactly, yeah. For us, the reason for us to use it is because we see it as a higher level abstraction that we're using. We do still need to think about it at the application layer, as a Postgres database. We have to go in and optimize our queries and things like that. But the ability to spin up a fork or a follower through a pretty high level command with the Heroku command line, which is just the same tool we use for maintaining other parts of our app like domains and stuff like that, makes it pretty welcoming to developers and feels pretty good for them to be able to do that. So does the ability to issue a command and then securely get a SQL console on one of the databases if we need to execute a SQL command for some diagnosis purpose.
Jonan Scheffler: I know that what you're doing is safe. You, of course, have the capacity to connect to your production database on Heroku. We give you infinite ability to shoot your own foot off, but you can do things in a safe way. I just like very much the way that the follower infrastructure is set up because I can know that I'm not going to damage things.
Brendon Murphy: Absolutely, and then I think there's another feature that when we learned about it, it ties in with the whole fork and follower thing, but Heroku has point-in-time restore, which I think was there for quite a long time before we ... We had this half hour call with the data team and they asked the question at one point and said, "Did you know you can restore to a point in time?" Which has been a huge lifesaver for us.
Jonan Scheffler: Awesome.
Brendon Murphy: I'm sure we could build that ourselves or hire ops people to build that, but the fact that it was there when we needed it and it was easy to use and that you guys even thought to build it was ... Let's put it this way, it really saved our bacon very well on quite a few occasions.
Jonan Scheffler: The time when you need point in time recovery for databases, you're certainly glad it's there. I feel the same way about the releases, the rollback.
Brendon Murphy: Yeah.
Jonan Scheffler: My first deploys, I started writing software maybe eight years ago, so it was early on. But I remember distinctly staring at a New Relic graph, while I pushed the deploy button and being like, "Please work, please work." Because if you didn't get to unroll that deploy, it took us a minute. If we deployed something bad to production, it took us a long time to fix it and we had real downtime, and now I have this instant go back Undo button that saved me so many times. When I've done something silly, I was up too late coding and shouldn't have been working on the app anyway. I really appreciate those features that protect me from myself.
Brendon Murphy: Absolutely, and some of the listeners might be wondering at an application layer some of the other areas that's useful. Where it's been really useful for us is if we had to do some maintenance cleanup maybe because of a bug, and we made a mistake. Probably one of the mistakes we made before, this was a couple years ago, was when we were going to clean out some spam form submissions for a customer and we just dropped the ball that day and over cleaned. Fortunately, it wasn't a really deep impact on the customer but we had ... I think this was about one month or maybe three weeks after we first heard about this capability and we just looked at each other and said, "Oh, this is no problem. This modeling is not that complicated." We just found in our log where we issued that delete and then spun up a fork database that was one second prior to when that delete started occurring.
Jonan Scheffler: Oh, nice, you just dumped the data back out.
Brendon Murphy: Exactly, and to tie it all back together, people probably think I sound like a Heroku ad at this point, I loved how well this worked. I'm trying to get this back up, I found out about this problem when I came into work that morning, so my brain's still foggy and everything. I'm thinking, "Okay, so now I've got this other one spun up, I need to reconnect an Active Record adapter to a secondary database." I was googling for a couple minutes how to do that because that's not something you typically do on a Rails app, right?
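The "one second prior" step Brendon describes comes down to a small piece of timestamp arithmetic. The sketch below is only an illustration: the logged time is a hypothetical value, not one from the episode, and the helper name `rollback_timestamp` is made up here. Heroku's own rollback documentation is the place to check the exact command and timestamp format for creating the restored database.

```ruby
require "time"

# Hypothetical timestamp copied from the log line that recorded the
# accidental DELETE (an assumed value, not one from the episode).
delete_logged_at = Time.iso8601("2019-04-02T16:45:10Z")

# Aim the restore at one second before the delete began.
restore_target = delete_logged_at - 1

# Format a Time as a UTC timestamp with an explicit +00:00 offset.
def rollback_timestamp(time)
  time.utc.strftime("%Y-%m-%d %H:%M:%S+00:00")
end

puts rollback_timestamp(restore_target)
```

Working in UTC throughout avoids an off-by-hours restore when the log and the person reading it are in different time zones.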
Jonan Scheffler: Right, yeah.
Brendon Murphy: You forget how to do it and have to Google and remember. And then I realized like, "No, wait a second, I'm only going for about 200 records here." Again, I went back to Dataclips, and I just halted my Google search because I realized I can figure this out and it would probably be nice to know, but I'm not going to figure it out as fast as I want. I just made a Dataclip to go and generate me a CSV of the 200 rows or whatever we'd accidentally deleted for this customer. Then exported it as CSV, and when we did the fix up on the production application, it was easy because now we just had CSV, which is very parsable. We just ran the Ruby CSV parser over it and had a lot of confidence that we're good to go.
Jonan Scheffler: Depending on the CSV, it is very parsable. I have so many problems with, like, the open data set CSVs and things. But that's awesome to sidestep this. I was trying to think of how to do that, I can't remember how to connect to this data, but I would definitely have just gone with the Dataclips thing had I found it.
Brendon Murphy: When in doubt, Dataclips.
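The CSV fix-up Brendon describes can be sketched with Ruby's standard `csv` library. Everything specific below is an assumption for illustration: the column names, the sample rows, and the `rows_to_restore` helper are hypothetical, and the inline string stands in for the file a Dataclip export would produce.

```ruby
require "csv"

# Hypothetical export standing in for the CSV downloaded from a Dataclip
# run against the fork database. Column names are illustrative only.
EXPORTED_CSV = <<~EXPORT
  id,form_id,email,submitted_at
  101,7,ana@example.com,2019-04-01 09:15:00
  102,7,"Lee, Sam <sam@example.com>",2019-04-01 09:20:00
EXPORT

# Parse the export into plain hashes, one per accidentally deleted row.
def rows_to_restore(csv_text)
  CSV.parse(csv_text, headers: true).map do |row|
    attrs = row.to_h
    # Convert numeric columns so they match the original column types.
    attrs["id"] = Integer(attrs["id"])
    attrs["form_id"] = Integer(attrs["form_id"])
    attrs
  end
end

rows = rows_to_restore(EXPORTED_CSV)
puts "#{rows.size} rows ready to re-insert"
rows.each { |attrs| puts attrs.inspect }
```

In a Rails app each hash could then be handed to the relevant model to re-create the record. The model is left out here because the episode does not name it.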
Jonan Scheffler: You had these issues with the Postgres thing, which were learning experiences I think that were resolved quickly for the organization.
Brendon Murphy: Absolutely.
Jonan Scheffler: Now you are happy with the way things are on Heroku. If you were in your dream world, if we add a feature in the next year for you around data, specifically, what would we add?
Brendon Murphy: Potentially larger data stores. I think for us it really comes down to the size of data on disk could potentially become an issue. But there's other ways to tackle that too, one is we know that on our side the way we're storing some particular data, we could do more efficiently. On the one hand, I want you to come out with more space if we need it, but on the other I like having that lower limit as a motivator for us to look at ways to store that in a more cost efficient way.
Jonan Scheffler: This is the way I feel about a lot of features on Heroku, actually, because Heroku adheres pretty strictly to the Twelve-Factor, the tenets of software development described in Twelve-Factor. They really are a pretty opinionated platform, occasionally, I find myself in a situation where I'm thinking, "Man, I really wish I could do this thing." But I know better, I know every time I'm like, "Ah, you shouldn't do this bad hacky thing that you want to do." That's why Heroku is preventing you from doing it. In this case, I want to put a password into my code base or something. This is an obvious example and nothing that I want to do anymore, but we force you to keep those in environment variables, and so if you have that problem, it can be something that you have to overcome. But I agree with you that the upper bound for the size of a Postgres database, I'm sometimes shocked how many customers were able to hit that.
Jonan Scheffler: It's usually because they've just jammed everything into a single database. They don't have much of a service-oriented architecture split out yet. When you break things down into microservices, it makes things very easy. You've got a service that handles that users table which is going to be quite large or whatever the big one is, for each company it's different, right?
Brendon Murphy: Yeah. It's an interesting question too because retroactively I might have answered a little differently. One of the other problems we ran into recently was just saturating the database server with connections. I think on the tier we're on, I think we cap out at maybe 500 connections and it's a hard cap. I think two of those get reserved by Heroku for administrative purposes. I think you've got maybe 498-
Jonan Scheffler: We did short you two of your 500, that's true.
Brendon Murphy: Yeah, it's like getting a computer with a hard drive and the OS.
Jonan Scheffler: Oh, the worst.
Brendon Murphy: It just happens and you learn to live with it. But so that was actually turning out to be an issue for us because we have a Procfile, it's not huge, but I think it's about like nine or 10 lines long at this point. A lot of it is just different Sidekiq workers. We found that we organize our stuff both by priority but also some of the work we isolate from each other entirely apart from the database by just running them as different workers. Because we found, "Hey, if this worker gets locked up, that's fine." It's like a half step to microservices with background workers on Heroku.
Jonan Scheffler: Right, it's again like transition.
Brendon Murphy: [crosstalk 00:38:33] exactly, it took to making that context. The downside was because we're using that in conjunction with auto-scaling, we just hire to scale our web and our background worker dynos. We were hitting that limit sometimes. The retroactive answer is I would have told you connections, but now that Heroku is I think it's out of beta or it's nearly out of beta, the Heroku PgBouncer being built and maintained by Heroku, that's the obvious answer there. That's what solved the problem for us. Initially, when we were looking at it, we're like, "Okay, how can we make a buildpack and set this up ourselves on Heroku?" We didn't really want to do that and that's when we found out that Heroku had the capability to really handle it on our behalf. We went from having deploy issues around database connections and scaling and stuff like that, which would have gotten even more problematic for us now because at that point we were not using Heroku Preboot and now we are.
Brendon Murphy: When we do a deploy now we really can see database connections, but we don't really anymore because PgBouncer is really cutting that down for us. If you've got other customers out there, if there's other people listening, they're like, "Man, we're really high on these connections." I'm sure they would have heard about it, but they definitely want to look into PgBouncer if they haven't yet.
Jonan Scheffler: I love PgBouncer especially because the Postgres docs for it describe its brutality level, you can set a brutality level. How crushing do you want this hand that destroys your connection pools? I like it.
Brendon Murphy: That's funny.
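The connection ceiling Brendon mentions is easy to reason about with some rough arithmetic. The sketch below assumes every figure except the 500 cap and the two reserved connections from the conversation: the process names, dyno counts, and pool sizes are hypothetical, and it assumes that Preboot briefly runs old and new web dynos side by side during a deploy.

```ruby
# Figures mentioned in the conversation above.
CONNECTION_CAP = 500
RESERVED = 2

# Hypothetical formation entry: dyno count, processes per dyno, and the
# database connection pool size held by each process.
Formation = Struct.new(:name, :dynos, :processes_per_dyno, :pool_size) do
  def connections
    dynos * processes_per_dyno * pool_size
  end
end

# Sum connections across process types. Types listed in preboot_for are
# counted twice, on the assumption that old and new dynos overlap
# during a deploy.
def peak_connections(formations, preboot_for: [])
  formations.sum do |formation|
    overlap = preboot_for.include?(formation.name) ? 2 : 1
    formation.connections * overlap
  end
end

formations = [
  Formation.new("web", 20, 3, 5),
  Formation.new("worker_default", 10, 1, 10),
  Formation.new("worker_low", 5, 1, 10)
]

available = CONNECTION_CAP - RESERVED
steady = peak_connections(formations)
deploy = peak_connections(formations, preboot_for: ["web"])

puts "available: #{available}"
puts "steady state: #{steady} (over limit: #{steady > available})"
puts "during deploy: #{deploy} (over limit: #{deploy > available})"
```

With these made-up numbers the app fits under the cap in steady state but exceeds it during a deploy. A connection pooler such as PgBouncer eases exactly that pressure, because many client connections share a smaller set of server connections.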
Jonan Scheffler: Well, Brendon, I think we have covered most of the things. If there's anything that you wanted to add, though, did I miss anything that you wanted to talk about?
Brendon Murphy: No, I think that's great. I think the one thing I'd add in a more general level, I think that's interesting about our experience with Heroku and this, like I said, this is general, it really goes beyond just Postgres. Is we like having that abstraction layer for our engineering team of what the platform is. I think the benefit it's provided to us, you alluded to it earlier, one of them is obviously being able to focus on the application and not the operation of the application. It's for the scale we're at now and I think the scale we're going to grow into aggressively over the coming years. We're still going to be able to do that on Heroku. It's great for that, but one of the other really cool things, you've mentioned the Twelve-Factor app early on. I think Adam Wiggins, that name I'm sure rings a bell for a lot of first employees and customers. I think was one of the original co-founders-
Jonan Scheffler: Co-founder of Heroku, yeah.
Brendon Murphy: ... from the original team. I remember him giving a talk in like 2012 and one of the things he was talking about is in line with the Twelve-Factor app. He was saying you want to live close to production and the message I took away was twofold at the time. One was he said you want to be able to have your environments similar to each other. This was even before I think people were really starting to get into Docker and containerization and everything. Heroku, I think, did that part really, really well, especially, at the time. But I think the other area we saw a lot more in our team was they got to live close to production and that they would take ownership of bug issues. There wasn't really a barrier to entry on that, they didn't have to go to another team, it wasn't another team's responsibility to handle. I think that level of ownership in the end is a lot better for your company and its customers.
Jonan Scheffler: I agree.
Brendon Murphy: I remember at my last job, I started there, and this was like my first week working there and they were talking about ... They were dropping insider information at some meeting, one of the guys said, "Hey, if you're ever on a system and it's spiraling out of control," and this was coming from [inaudible 00:42:21] Assistance Team, "see if this one programmer is logged in because they've been known to cause problems." It was interesting because we had this dichotomy at the time of like there were developers and then there were the engineers who ran the servers. We used that quite effectively to build angry silos with each other at the time. It's cool that we don't really have that now. The people that are operating the app are also the developers, they have direct ownership over that bug. They're able to log in and check out performance metrics on the application and things like that.
Brendon Murphy: I think the platform's really enabled us to have that level of interaction and ownership still that's really been beneficial for us as a team and culturally as well.
Jonan Scheffler: This is one of my favorite parts about when I was at Heroku working on an engineering team, I've been doing developer advocacy work for a couple years now. But one of my favorite parts about it was that you owned everything. I talked about in the Ruby community it's very common for development teams to own their test infrastructure, for example, whereas like older shops, maybe people who operate a little bit more waterfall will have like a QA team. There's a QA phase even where the developers basically throw their code over the fence and they're like, "All right, y'all test it, make sure it works."
Brendon Murphy: Yeah.
Jonan Scheffler: Owning the code, I think it's really important for a developer to own the entire stack because they understand the entire stack and they're also responsible for owning their own bugs that they're writing into there. If you don't have your unit tests that would have caught your bug, then you suffer the pain of having that page come later. I feel the same way about the Heroku platform in that when we internally deploy an application, if our team owns an application, we own it forever. There's no team that maintains that ... There are plenty of, I talked a little bit about the red team, there are plenty of people who specialize and they come around and they help us out but, ultimately, those security fixes, they're on us as a team to ship and to own. I think giving people that level of ownership on your development teams is a really valuable tool for preventing those kinds of silos. Because I've seen it happen over and over again.
Brendon Murphy: Absolutely. Yeah, if you look as companies grow, I think one of the common problems every single company out there has is they say, "How are we going to figure out how to communicate better? How are we going to figure out how to get transparency of the information in the organization?" I feel like since day one, at least, with the interchange between the developers themselves and what's going on in production, there's no chance for that visibility not to be present. It's really baked in because they have that level of access even though it's still safe and secure.
Jonan Scheffler: In an emergency, they know how to get in there and fix things and redeploy an application or restart an application. I remember at one company where I worked it took us like seven steps to get the operation out the door, and then rolling it back was not necessarily something that someone could do who had just started at the company. It may have taken me three months to get around to learning how to roll back a failed production deploy. Those are things that are trivial on Heroku and really important to spread control and ownership.
Brendon Murphy: Absolutely, yeah.
Jonan Scheffler: Yeah. Well, again, thank you so much for taking the time Brendon. This has been great. I really appreciate your insight. It's funny when we're working on this side of the house at Heroku, it's hard to see our product through our customer's eyes sometimes, so I really appreciate your insights.
Brendon Murphy: Well, it's been an honor being on, I'm really glad you asked us to jump on the show and really love the product. Please continue making awesome products.
Jonan Scheffler: We will keep building.
Brendon Murphy: Yeah, we really appreciate it.
Jonan Scheffler: We will try, maybe we may not ... I can't promise you larger databases, we may have to keep ours small to restrict bad behavior, but we'll see.
Brendon Murphy: Yeah.
Jonan Scheffler: Whatever we could do, I'm sure we'll find a reason to talk again soon. Thank you for joining me.
Brendon Murphy: Absolutely, thank you.
Jonan Scheffler: Take care.
A podcast brought to you by the developer advocate team at Heroku, exploring code, technology, tools, tips, and the life of the developer.
Developer Advocate, Heroku
Jonan is a developer at Heroku and an aspiring astronaut. He believes in you and your potential and wants to help you build beautiful things.