1. Running Grails in Production

Heroku in the Wild
March 28th, 2019
Episode 1
27:35


Hosted by Chris Castle and Joe Kutner

Andrew Garcia is the co-founder of Goodshuffle, and as one of the first Grails users on Heroku, he has worked closely over the years with Joe Kutner, Heroku’s Java Platform Owner. They chat with Chris Castle, a developer advocate at Heroku, about Goodshuffle’s experience with building a startup on top of the JVM.

When building an application, it’s often tempting to reach for the latest and greatest technologies to build your app. Andrew Garcia argues for something different: by using “boring” technology–that is, languages and frameworks that have been around for years, not months–you can iterate much more quickly on features. He’s chosen JDK8 (released in 2014) to run Goodshuffle, a startup founded in 2013 to help event companies manage their business operations.

Goodshuffle uses frameworks like Grails and Angular because of their stance on convention over configuration, which is another opportunity for being more productive. The more reliable the tools you use are, the more you can focus on your users' needs.


Show Notes

Chris Castle: Hello and welcome to the Code[ish] Podcast, where we talk with developers and engineering leaders about a variety of topics, ranging from languages and frameworks, to data, and event driven architectures, to individual and team productivity. I'm Heroku Developer Advocate Chris Castle, and I'm joined here by some stellar folks. We've got Heroku's JVM platform engineer Joe Kutner, and our special guest today is Andrew Garcia, the CTO of Goodshuffle.

Chris Castle: We're going to discuss Java, Grails, Gradle, and Goodshuffle's tech stack, but first let's do some intros. Joe, can you tell us a little bit about yourself?

Joe Kutner: Sure, I'm an architect on the languages team at Heroku, and I own the Java experience on the platform. So what that means is, if you've ever tried deploying a Java app on Heroku and it didn't work, it's probably my fault.

Joe Kutner: So I'm an engineer. I do work on the components that make the JVM run, and I also do some support and work with our customers to make sure they're having a good experience.

Chris Castle: And then Andrew, can you tell us a bit about yourself, and a bit about Goodshuffle too? And then I'll let Joe jump in with some of his questions after that.

Andrew Garcia: Yeah sure, so I have been a developer since I was 12, but I suppose more professionally since 2005. I was in government contracting for a while, and previously was dealing with a lot of early big data and data visualization issues, kinda when those were first coming about. And in 2015, my co-founder and I went full time on Goodshuffle. Before that we'd been working in coffee shops for a couple years, and we just discovered a really awesome opportunity in the events space, and decided to focus our efforts on building technology for that.

Andrew Garcia: So now we're a two product company. We have Goodshuffle Pro, which helps event, rental, production design companies etcetera create quotes, contracts, invoices – just effectively manage their entire business operations. And then we have goodshuffle.com, which is a marketplace B to C, kind of like an Amazon for event rentals. So if you need tents, tables, linens for a backyard party, or a fundraiser, you find the vendors in your city and book it online.

Andrew Garcia: So it's fairly complex, where one product is an ERP/CRM, and the other one is a marketplace-style product, and we're building and deploying both of those on Heroku, and they share the same backend stack as well.

Joe Kutner: Why did you choose Java? Why did you choose the JVM for your infrastructure? I mean, obviously, I love the JVM, it's got a long history, but there's a lot of other choices out there.

Andrew Garcia: I think it's a combination. I mean, there's kind of a holistic answer to that, I suppose. It would be hard to deny that, you know, having an enterprise background myself, building software for large customers, Java was kind of the prevalent language at the time. Back in the day, Rails and those other frameworks were still kind of early. There weren't a lot of gems and support. So for me at the time, Java just had a lot more library support to tie into payment networks, etc. Also, it was something that I already knew. I mean, I was very familiar with Spring and Hibernate, but a coworker of mine had turned me onto Grails at the time, and I appreciated how much that framework gave you the benefits of Spring and Hibernate but isolated you from the myriad XML files to configure every single bean, or one-to-many relationships in Hibernate, etc.

Andrew Garcia: So that would probably be one undeniable answer to the question: just previous experience. Also, at this current stage, I think about it when I'm talking to developers that are just getting started. I mean, honestly, to a certain extent I'm just so busy building that I'm not really tuned into following the latest trends from a framework perspective. But I hear things sometimes that are a little concerning. When people are like, "Oh, I really love JavaScript," I don't know, that just sounds odd to me. Like a carpenter saying, "I really love nails." I guess it kinda depends… when somebody says that to me, I'm like, what kind of problem are you solving? Or what's your team structure? Or your management structure? How are you running your project?

Andrew Garcia: So anyway, all that to say that it made sense at the time, I guess, which should probably be on the tombstone of every developer. It made sense at the time, I've lived with it, it's been able to give me what I need, and I've been able to get the response times and performance to grow the platform. And thankfully the decisions of Grails, the JVM, deployments to Heroku, etcetera have all put me in a position where I can build features incredibly fast and not worry about those technology choices. So far they haven't created any significant tech debt that I've had to deal with.

Joe Kutner: Yeah, that makes a lot of sense. I think Grails and Heroku have something in common. And that is a tilt towards being opinionated or convention over configuration. And that probably comes from both having an origin with Ruby on Rails. Do you find that that's what suits your development style to take what a particular framework or platform chooses as a happy path? Or do you like having more control?

Andrew Garcia: Yeah. I would say particularly when you're dealing with a framework that's in its later versions. I mean, I would certainly be wary of something that's in its 0.9 version or 1.2. But at the time I was getting involved, I think it was already in the 2.x range. I was recently talking with a developer that we'd brought on that had come from a really heavy React-Redux background. Our frontend is in Angular, and so I think there's probably a little bit of a difference. It's kind of the same theology there. I mean, Angular is a little more prescriptive with how it wants you to develop the front end. Whereas, from what I understand, the React posture is like, well, there's a general framework but you can get any number of implementations from any number of different places. So even though you can tell someone, "Hey, I built it in React," you'd have to have a series of follow-on questions that ask, "Okay, what plugin did you use for this, this, this and this?"

Andrew Garcia: I've certainly never felt restricted playing around in the Grails sandbox.

Joe Kutner: Is Grails more like, in that analogy is Grails more like Angular there? In that it's, there's a happy path for everything. It's not like React where you have to make a decision about how you store state. And how you do a bunch of other things.

Andrew Garcia: Yeah. I mean, to me it does exactly what I need it to do. If I need something that just behaves inherently differently, then... I guess a good example would be that Grails had built out an eventing architecture inside it, like a publish-subscribe mechanism. I dabbled in it a little bit and I was like, this feels odd. Why am I not using something like a JMS implementation, like RabbitMQ, or something that is actually meant for event-driven architectures? It ended up that I was right. I ran into some particular issues with Grails where, if you're running it in a multi-node environment where servers are coming up and down, you can have a message being published and you don't necessarily have guaranteed consumption of that message.

Andrew Garcia: Or guaranteed deliverability, which is a huge problem from an event-driven architecture perspective. So I effectively killed that feature as I was using it in Grails and just migrated to a pure RabbitMQ solution to manage offloading certain jobs to worker dynos. So every time I've come across something, I'm like, you know, I'm using Grails for exactly what I need it to do: host the controller and service logic, invoke data sources. But if I need distributed locking, I use Redis. If I need indices, I use Elasticsearch. I mean, everything has its purpose. I think when you adhere to a separation of responsibilities, everything is better for it from an architectural perspective.
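
The delivery problem Andrew describes can be illustrated with a small in-memory sketch. It is not the Grails events API or a RabbitMQ client; `FireAndForgetBus`, `AckQueue`, and the `resize-image:42` message are hypothetical names invented for the illustration. The first class drops a message when no listener is attached at publish time, while the second keeps a message queued until a worker acknowledges it, which is roughly the guarantee a broker with acknowledgements is meant to provide.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Illustrative only: in-memory stand-ins, not Grails events or a RabbitMQ client. */
final class DeliverySketch {

    /** Fire-and-forget pub/sub: a message published with no listener attached is lost. */
    static final class FireAndForgetBus {
        private final List<Consumer<String>> listeners = new ArrayList<>();

        void subscribe(Consumer<String> listener) {
            listeners.add(listener);
        }

        /** Returns how many listeners received the message (0 means it was dropped). */
        int publish(String message) {
            for (Consumer<String> listener : listeners) {
                listener.accept(message);
            }
            return listeners.size();
        }
    }

    /** Queue with acknowledgements: a message stays queued until a worker confirms it. */
    static final class AckQueue {
        private final Deque<String> pending = new ArrayDeque<>();

        void publish(String message) {
            pending.addLast(message);
        }

        /**
         * Hands the oldest message to the worker. The message is removed only if the
         * worker returns true (ack); otherwise it stays at the head for redelivery.
         * Returns true if a message was acknowledged.
         */
        boolean deliverNext(Predicate<String> worker) {
            String message = pending.peekFirst();
            if (message == null) {
                return false;
            }
            boolean acked;
            try {
                acked = worker.test(message);
            } catch (RuntimeException e) {
                acked = false; // a crashing worker counts as "not acknowledged"
            }
            if (acked) {
                pending.removeFirst();
            }
            return acked;
        }

        int size() {
            return pending.size();
        }
    }

    public static void main(String[] args) {
        FireAndForgetBus bus = new FireAndForgetBus();
        System.out.println("listeners reached: " + bus.publish("resize-image:42"));

        AckQueue queue = new AckQueue();
        queue.publish("resize-image:42");
        queue.deliverNext(msg -> false); // simulated worker that went away mid-job
        System.out.println("still queued: " + queue.size());
        queue.deliverNext(msg -> true);  // a healthy worker picks the job up
        System.out.println("still queued: " + queue.size());
    }
}
```

A real broker adds persistence and network delivery on top of this idea; the sketch only shows why "publish and hope a node is listening" and "hold until acknowledged" behave differently when servers come and go.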

Chris Castle: That's cool. That's interesting, it's like you're kind of in a happy medium it sounds like. In some ways, Grails gives you a lot but, then you also don't say I must get everything from Grails. I can use these other things which are the right tools for the job.

Andrew Garcia: Yeah, yeah. So far so good.

Joe Kutner: Is there anything you feel you're missing?

Andrew Garcia: In Grails?

Joe Kutner: Yeah. Or I guess in your infrastructure as a whole.

Andrew Garcia: You know, I think that answer always depends on the current fire that I'm putting out. I'm always keen to have increased visibility in monitoring. Something that I do miss is that, at one point, I was using Splunk to really get nice, crisp visualizations of my logs, and I could do some pretty easy click, drag and drop mining of that. If you log out key-value pairs, like runtime equals 100 milliseconds, you can chart that across your entire system, or multiple systems. And to Heroku's credit, I would have to assume that there's some plugin there that I haven't wired into yet that will give me that. But really, monitoring for capacity planning and system performance is kind of the main issue that I have right now. I don't really have any barriers to building features or technology other than just our hands here building it.
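
The key-value logging Andrew mentions might look something like the sketch below. The field names `action` and `runtime_ms` and the `KeyValueLog` class are assumptions made for illustration, not anything Goodshuffle has described, and values are assumed to contain no spaces so that a log tool can split each line on whitespace and `=`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that emits log lines as space-separated key=value pairs. */
final class KeyValueLog {

    /** Renders the fields as key=value pairs in insertion order. */
    static String format(Map<String, ?> fields) {
        StringJoiner line = new StringJoiner(" ");
        for (Map.Entry<String, ?> field : fields.entrySet()) {
            line.add(field.getKey() + "=" + field.getValue());
        }
        return line.toString();
    }

    /** Runs the task and returns a log line recording how long it took. */
    static String timed(String action, Runnable task) {
        long start = System.nanoTime();
        task.run();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("action", action);
        fields.put("runtime_ms", elapsedMs);
        return format(fields);
    }

    public static void main(String[] args) {
        // "load_dashboard" is a made-up action name for the example.
        System.out.println(timed("load_dashboard", () -> { /* work goes here */ }));
    }
}
```

Because every line carries the same keys, a log aggregator can chart `runtime_ms` grouped by `action` without any custom parsing rules.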

Joe Kutner: So are you saying that you wish you had better insight into the running processes and getting that sort of telemetry out of them?

Andrew Garcia: Yeah. I have a fairly highly concurrent system, especially with Goodshuffle Pro. You have everyone at a particular business using it simultaneously, touching the same contacts, projects, etc. So, being able to isolate... you know, there can be any number of causes of latency in a particular request. It could be DOM rendering, their browser speed, their machine speed. I mean, we recently came across an issue where we realized a large percentage of our users' speed issues had to do with their machines being completely maxed out before they'd even opened our tab. There's internet memes everywhere of just Chrome tabs crunching memory, and so we'd do some screen shares, and we'd see they were already at 3.9 gig allocated on a 4 gig machine. And they're like, "Well, why is your application slow?" And I'm like, it just took you seven seconds to load Google's speed test from a search, so I think we have some other problems to work through here.

Joe Kutner: Yeah.

Andrew Garcia: So, you know, we talked them through it, got them to download The Great Suspender, that sort of thing. But anyway, back to the… our response times. You know, there can be issues with a third-party service, or just garbage collection on the JVM. And so being able to efficiently identify who the offenders are for those garbage collection spikes would be handy. It's something that becomes harder and harder to do as the volume and activity on your servers increase. A really convenient way to do that would be with a tool like Splunk, which is able to help you really analyze spikes by activity type.

Chris Castle: Yeah, it's pretty interesting. You're running at a pretty interesting scale, with really interesting concerns around latency and response time. You're not using the new hotness. How is that possible? You've got this solid, mature tech and it seems to do quite well for you.

Andrew Garcia: Yeah, I don't know. I really hope that day never comes where I'm, you know, the old man shaking his fist. You see, all of us as builders have worked for those product managers that just chase shiny balls. You just end up with Frankenstein systems. Just because it's new doesn't mean it's great or that it applies to your problem. And also, just being a lean team, you have to be very selective about the tech that you choose to buy down. As a result, we're very prudent with the features that we try to build. Something that's really fortunate for us is that my co-founder is a UI/UX guy. And so we do quite a bit of iteration before we even get to building it, and while we're doing that, I'm thinking through the technical implications of that solution, so that when it is created we're not immediately creating a vulnerability in the system from a performance perspective. But sometimes you do.

Andrew Garcia: When we were originally building our dashboards, there were a lot of calculations happening on the server. That's gonna be a problem at scale, and then of course that day comes, and it comes time to rebuild those in a data store that's meant to run those calculations in near real time.

Joe Kutner: So speaking of the new hotness, let's talk about JDK11 for a second. So you're on JDK8, correct?

Andrew Garcia: I believe so, yep.

Joe Kutner: And I think some of that is dictated by Grails. I don't think it supports JDK11 yet, although I could be wrong.

Andrew Garcia: Correct.

Joe Kutner: Well I guess I know recently added support for nine and up. So yeah, if that's recent, I would imagine Grails is still a bit behind.

Andrew Garcia: Yeah, it's pluggable. Much like building your own computer back, like, I could overclock my CPU, but then I've got to deal with a bunch of other icky issues. Like-

Andrew Garcia: Yeah. I think there's a new version of GORM that I can try to hack into the current version of Grails that's out there. But I did that, or I attempted it, 'cause I was actually trying to get the webapp runner to run locally, 'cause I was having some issues with Java 6. But anyway, anytime I've attempted to push up, sometimes you'll be pleasantly surprised. But, you know, I have never really been dissatisfied with the speed of the community making those improvements on the Grails platform. So I've typically just waited for those to happen, given that I'm so busy with everything else.

Joe Kutner: But there's nothing you're dying to get at in JDK11?

Andrew Garcia: An honest answer would be that I just haven't looked into it. I mean, if you were to tell me that there are some amazing new garbage collectors coming out, then I would be like, "Okay, yeah, we need to make that happen." We don't really have a garbage collection issue at the moment. But who wouldn't benefit from less time with system halts?

Joe Kutner: Yeah. It sounds like you're more interested in adding value above that line where it really impacts your business, right?

Andrew Garcia: Mm-hmm (affirmative). Correct.

Joe Kutner: Not dealing with some of those-

Andrew Garcia: Is there something amazing that I'm unaware of and I need to go ahead and make this happen?

Chris Castle: It's 10x faster.

Joe Kutner: It fixes a lot of problems.

Joe Kutner: The other thing is, I think JDK8 is gonna be here a while. It's the new JDK6. We're gonna be supporting it for a while. Frameworks will be supporting it, and people will be running on it for a long time. I don't feel a huge sense of urgency, although there is a lot of interest in JDK11 broadly in the community. JDK8's definitely here to stay for at least a little longer.

Andrew Garcia: I mean, it's usually when I run into an issue that I become intimately familiar with the minor release versions of a particular package. Through coincidence, I haven't run into that yet. It wasn't too long ago that you and I were dealing with some really low-level system memory allocation issues. I believe there was a Grails memory leak issue back in the day, and we were talking about it. Then I recall getting that survey from Heroku, like, which of these features would you guys love to see? I could only click one time on how awesome low-level memory allocation would be for me, and so then that actually came out.

Joe Kutner: I think with some of that, there were definitely actual memory leaks in the framework. But some of it was that the JVM, being an older technology that predates containers and Docker, has always struggled to run well inside of a container. It tries to make intelligent decisions about how much CPU and how much memory it has, and it basically gets those wrong inside of Docker. That's definitely been improved in JDK11, though I wouldn't say it's been fixed. And JDK8 does have some capability to do that better. But neither runtime has found a real solution, so you'll still have those things whatever version you're running on.
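
One way to see the guesses Joe is talking about is to ask the JVM what it thinks it has been given. The sketch below uses only the standard `Runtime` API; the class name `JvmResourceView` is made up for the example. Comparing its output with the limits actually configured for the container shows whether the runtime's view matches reality.

```java
/** Prints the CPU count and maximum heap size the running JVM has settled on. */
final class JvmResourceView {

    /** Number of processors the JVM believes it can use. */
    static int cpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Maximum heap the JVM will attempt to use, in mebibytes. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("cpus=" + cpus() + " max_heap_mib=" + maxHeapMiB());
    }
}
```

If the reported heap ceiling is larger than the container's memory limit, setting an explicit maximum heap with `-Xmx` is one common way to stop relying on the JVM's own estimate.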

Chris Castle: Andrew, were you referring to being able to hookup JMX to get metrics from a process running on Heroku?

Andrew Garcia: Yeah. I'm sure that way still exists, and that was the way it had to be done previously. Now, through the metrics panel inside Heroku, you can get quicker insight into non-heap memory usage, which was particularly my issue.
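
For readers who have not looked at these numbers before, the heap and non-heap figures that JMX exposes can also be read from inside the process through the standard `java.lang.management` API. This is a minimal sketch of that idea, not a description of how Heroku's metrics panel collects its data; the `NonHeapReport` class and the output field names are invented for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

/** Reads heap and non-heap usage from the platform MemoryMXBean. */
final class NonHeapReport {

    /** Returns a one-line summary of current heap and non-heap usage in bytes. */
    static String report() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();
        return "heap_used_bytes=" + heap.getUsed()
                + " non_heap_used_bytes=" + nonHeap.getUsed()
                + " non_heap_committed_bytes=" + nonHeap.getCommitted();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Non-heap usage covers areas such as class metadata and compiled code, which is why a process can exceed its memory quota even when the heap itself looks healthy.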

Chris Castle: Yeah. Java specific metrics in there as opposed to kind of process generic.

Andrew Garcia: Right.

Joe Kutner: Tell us how you're using RabbitMQ, and for what capabilities. You mentioned Redis for distributed locking?

Andrew Garcia: Correct.

Joe Kutner: And then Elastic for indexing?

Andrew Garcia: Right, right, right. Given the marketplace product, you have to support things like full-text search, geographic search, that sort of thing. It's just built for that. Not to mention aggregations and its ability to generate those types of analytics in the index, versus pulling that data into your server and computing it there. And RabbitMQ is just a pretty classic JMS event-driven solution. Guaranteed deliverability, execution, all those good things. I guess fortunately, due to my experience or things that I've come across, they seemed like obvious solutions that would solve those problems, you know, handle the handshake between web and worker dynos.

Joe Kutner: And so those all have pretty good integration with Spring or Grails?

Andrew Garcia: Oh yeah. I mean, that's a super well-traveled path. The nice thing about Grails is that it just wraps around the whole Spring and Hibernate community. It's extremely trivial to find the correct jar, load it through Gradle, and then you're just off to the races.

Chris Castle: It's pretty interesting that you are, the role you're in. You're a founder, you're a pretty sharp tech guy. What kind of job don't you do?

Andrew Garcia: I don't make anything I build look good. Fortunately we have Eric for that. I really don't enjoy any type of CSS pixel manipulation. I'm more than happy to hand all that SASS work over to him. My background is as a consultant, so part of it was being technical but also being able to talk to the stakeholders. That's been really handy when we're getting out in the field and talking to the event rental and production companies to understand how their business operates. To really listen to them and hear what they're really looking for, and then being able to map that simultaneously to what the solution will be. I don't know, it's just been quite an adventure. But I think, as most builders out there can appreciate, it's been really exciting to be in control of how we build it and the technology we use, and charting the path along with our customers is just really fulfilling.

Andrew Garcia: I'm sure a lot of us remember being in projects where our managers and etcetera isolated us away from the client. That's not always fun. It's always great to hear the excitement from your customers when they're like, "Oh man, this is amazing!" You're like sweet.

Chris Castle: I guess that means when you have an incident or a problem in production, it hits home pretty hard, right? You really feel that need to get things working again? Can it be stressful at all?

Andrew Garcia: Oh yeah. We have a fairly sophisticated alert system where, to my detriment, it goes straight to my phone. I think a lot of us have probably worn pagers in the past. I can tell by the rhythm of the vibration in my phone whether or not it's a fatal issue. If it's like buzz…buzz..buzz.. I'm like, oh, damn. I gotta pull my laptop out at whatever bar I'm at, or my house, and just troubleshoot. Slide down that fire pole and get it resolved.

Chris Castle: Well, hopefully Heroku helps in that regard. I mean, I guess the advantage is there are a certain number of things that you aren't worrying about by running on a PaaS.

Andrew Garcia: Yeah, exactly. We have enough problems. It's a pretty simple equation when we're dealing with things that are inherent in our operations: if the technology's creating more problems than it's solving, then it's gotta go immediately. Not having to deal with anything platform related, even proxy config, SSL handshake stuff, I'm just more than happy to not have to deal with.

Chris Castle: Oh we love TLS handshakes. They're like our favorite.

Joe Kutner: Andrew, one thing, someone shared a quote from you, I believe that was, "Sometimes a sign of the best technology is that you don't feel it, or that you're not aware of it."

Andrew Garcia: Mm-hmm (affirmative).

Joe Kutner: Tell me a bit more about what that means to you? Or why that's important to you?

Andrew Garcia: It kind of reminds me of, we can see it in all the decisions we're making. Even if you're creating an alerting system, right? We have conflicts in our system for inventory management to tell a vendor when they've overbooked themselves, right? If you were to constantly fire alarms to a user that there was something wrong, they'll really quickly ignore it, right? It has to be a meaningful issue that's popped up, one that they can take action on and resolve. It's that signal-to-noise ratio, I guess, is what I'm getting at, at an overall level. So if Heroku was constantly pinging me with some sort of issue that was happening, you'd just quickly ignore it. Moving down in that story, it's the same whether it's our users in our SaaS platform or me as a consumer of the Heroku platform. Much like our users, I have so many other problems to solve that not receiving any sort of alerts through the error channels that I have set up to monitor issues is a good day.

Andrew Garcia: 'Cause it means I can spend my time building features and paying down tech debt. Same thing for our users: they're busy trying to build their business and sell to clients. They don't want to hear that we had a temporary outage because one of our index providers had their U.S. east region go down. Similarly, I've been on Heroku for years now and I don't think I've had an outage that affected me. That's fantastic. Meanwhile, I won't name names, but I recently migrated Elasticsearch providers and weeks into it there was a five hour outage. On Sunday night at midnight. Technically it was only two hours that it actually affected us, but still. Having to immediately pull out the laptop as you're getting ready to go to bed to deploy a mitigation strategy is not a good time. And I'm acutely aware that that happened, 'cause I remember the exact hour that it happened. They, from a sales perspective, should not enjoy that I am noticing them.

Chris Castle: Right.

Andrew Garcia: 'Cause I'm like oh, okay maybe I need- if this keeps happening, I might need to migrate providers.

Joe Kutner: Yeah, I guess a good service provider is kind of like a referee. If they're doing their job, you just don't even notice that they're there. Things just work.

Andrew Garcia: Yeah.

Andrew Garcia: Exactly.

Chris Castle: So you did mention a bit about your tech stack: Elasticsearch, Redis, and you use some other things? I don't know if that was the complete overview. If you could give us like a "This is what my tech stack looks like, these are the services I use and this is why I use them" kind of thing for all the main pieces, at least what you'd be comfortable sharing, that'd be great.

Andrew Garcia: Yeah. I certainly haven't disclosed anything that we wouldn't include on a job description. Which obviously we would presume our competitors would have if they cared to check.

Chris Castle: Yup.

Andrew Garcia: I wouldn't say we're doing anything exotic with respect to the third party services that we're connecting to.

Chris Castle: Mm-hmm (affirmative).

Andrew Garcia: A lot of the trickery that I'm doing for speed is more something that we're running in house, from my previous big data days, so I won't get into those particulars. Stripe for payment processing. You know, we use Amazon for cloud storage.

Chris Castle: Yup.

Andrew Garcia: The Blitline plugin that you guys offer through Heroku for image manipulation.

Chris Castle: Mm-hmm (affirmative).

Andrew Garcia: It's just, as we've been moving along, it's fairly obvious which jobs don't belong running on your web dynos. And image manipulation is certainly one of those. Papertrail, pretty straightforward things.

Chris Castle: What about data persistence? Like Postgres or MySQL?

Andrew Garcia: Yeah, Postgres. We're running Heroku Postgres there too. That's been good, no outages or any issues there. No performance issues, that's great. And then the Redis plugin through Heroku as well.

Chris Castle: I don't mean boring in a bad way. But, like, boring tech, or just keeping your tech stack boring or simple, is better than going after the new hotness all the time.

Andrew Garcia: Yeah, yeah. I think you mentioned memcache, which I am still using for the distributed session management, but I really should move that over to Redis just so I can solidify the- it's a little redundant to be running both of those.

Chris Castle: Well, thanks for joining us for this podcast. Thank you Andrew and Joe for an interesting conversation. And we'll talk to you guys all later.

Joe Kutner: Thanks Chris.

Andrew Garcia: Sounds great. Great to meet you guys. Talk soon.

About Code[ish]

A podcast brought to you by the developer advocate team at Heroku, exploring code, technology, tools, tips, and the life of the developer.


Hosted By:

Chris Castle

Director, Developer Advocacy, Heroku
@crc

Joe Kutner

Software Architect, Heroku
@codefinger

with Guest:

Andrew Garcia

Co-Founder, Goodshuffle