51. Best Practices in Error Handling
Hosted by Julián Duque, with guest Ruben Bridgewater.
Julián Duque is a senior developer advocate here at Heroku. He attended the NodeConf EU conference in Ireland, and met up with Ruben Bridgewater, a software architect and core Node.js contributor. Julián and Ruben go over the history of Node.js (now in its tenth year), as well as how Ruben became involved with the Node.js project.
Ruben has several suggestions. First, he advises everyone to switch to the async/await pattern of asynchronous code execution. From Node 12 on, this allows errors to provide an async stack trace, which is more helpful in diagnosing errors than the obfuscated errors that come from promises. Second, he advises teams not to simply try-catch and rethrow errors. It's far more beneficial to abstract errors into individual classes (NotPermittedError, etc.), because it allows the programmer to identify immediately from the error name what went wrong.
From this abstraction, Julián and Ruben discuss its role in logging strategies. By having your errors defined as distinct classes, you can place all sorts of "generic" information in them as properties, such as the status code. With no additional programming labor, this data can be exposed in the logs for additional analysis.
Mostly, the best thing you can do is to think about errors while writing the code. For example, add tests for all the edge cases you may encounter. By investing more time upfront in the development process, you will save yourself from worries later on when the code hits production.
Julián: Welcome to Code[ish]. My name is Julián Duque. I'm a senior developer advocate here at Heroku, and today we are recording from NodeConf EU, the main Node.js conference that happens in Europe. So we are here in the beautiful city of Kilkenny in Ireland, and I have the pleasure of being here with Ruben Bridgewater. Ruben is a software architect and works as a Node.js consultant, and he's also part of the Node.js project. So Ruben, tell us a little bit more about what you are doing here at the conference, what you are talking about or teaching here at NodeConf EU.
Ruben: So in this particular case I gave a workshop about error handling best practices and what patterns you can use to make your life easier when debugging your application later.
Julián: We are going to be talking a little bit more about your workshop and that content later. I'm just curious about what got you into Node core. What brought you to being involved, pretty much almost full time, working for the Node.js project?
Julián: Okay. Yes, the Node.js project has been evolving for a while, and pretty much we are celebrating the 10 years of Node.js. It's a project that started in 2009, and the community has seen a great transformation around the project. When did you start contributing to core? Was it before the io.js fork or after it?
Ruben: It was after.
Julián: Oh yeah. Nice. Yeah, I started contributing to the Node project also after io.js because I found that it was more open and more welcoming to external contributors. So that was one of the good things that happened to the project. It was kind of a little bit stale and not getting a lot of innovation and evolution. But after the fork and being able to join the Node.js Foundation, a lot of other people took a lot of interest, and I started contributing to Node. So did you find it easy to get into the community, or did you find some sort of barriers when you tried to start contributing? How was that experience for you?
Ruben: For me it felt very open. I got to know a couple of Node.js collaborators and also people from the Technical Steering Committee of Node.js at a conference, and it was very good to talk to them, and they tried to get me involved in the project as well. When they noticed that I was already contributing to other projects and so on, it was very straightforward to just open a pull request. You get reviews, and there was pretty much no problem with interaction. So for me it was very open.
Julián: Right now, what are you working on in the Node.js project? What are your main areas of work right now?
Ruben: There is no easy answer to that question because I work on multiple areas of Node core, and sometimes it's just whatever pops up that I believe should be worked on. I can give a couple of examples that I mainly maintain. For example, the Node core internal error system is one that I mainly wrote, and pretty much everyone who uses console.log is using my code because I am the main maintainer of util.inspect, and util.inspect is used internally for console.log. I'm also the main maintainer of the assert module. The reason for that is that not a lot of people wanted to maintain those modules originally, but it felt not so ideal that their functionality felt broken, and no functionality in Node core should be broken. So I started contributing to parts that not so many people wanted to work on.
Julián: Now everything makes sense. So that's why you are very interested in error handling best practices, because you are involved in the error handling and error reporting part of Node. Is that correct?
Julián: And how has your experience as a software architect and consultant given you insights and information about how other companies are running Node? Has it given you more tools to be able to contribute back to the project and improve those areas? How has that experience of working in the field been?
Ruben: One thing that you will definitely realize is that companies run into a couple of problems more frequently, and when you see that, that's an area that I would like to improve. For example, it is a reason why I give talks about error handling. I've seen error handling being spread out all over projects without a concrete pattern, and often you'd lose information. It did not work as expected, and it's really simple to improve a lot of that by sticking to a couple of best practices.
Julián: For me, one of the most difficult parts while I was also doing consulting as a Node.js developer is that you have different ways of doing asynchronous programming. So you have callbacks, you have promises, and now you have async/await, and every one of those patterns has a different way of doing error handling, and it's not going to be easy for a team to manage all of the different edge cases. So what type of best practices or recommendations can you give to the people that are listening to you and are having the same challenges of dealing with errors in Node.js?
Ruben: So with the current version of Node core, there is one very straightforward recommendation that everyone should follow: use async/await as much as possible, because there you will get async stack traces, and that's the only way to get async stack traces for no extra cost. So it's really, really good. It makes your code more readable, and it's so simple to use. That would be the number one rule.
Julián: So async/await, so we can get async stack traces. What version of Node.js is starting to support async stack traces?
Ruben: That's coming from Node.js version 12 on.
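A minimal sketch of what an async stack trace looks like, assuming Node.js 12 or later. The function names (loadUser, handleRequest) are hypothetical and only serve the illustration. The error is thrown after an asynchronous hop, yet the printed trace should still list the awaiting caller as an "async" frame.

```javascript
// Hypothetical function names, used only for illustration.
async function loadUser() {
  // Wait for a timer first so the error is thrown after an asynchronous hop.
  await new Promise((resolve) => setTimeout(resolve, 10));
  throw new Error('user lookup failed');
}

async function handleRequest() {
  // Awaiting the call keeps handleRequest visible in the async stack trace.
  await loadUser();
}

// Resolves with the stack trace text so it can be printed or inspected.
const stackPromise = handleRequest().then(
  () => null,
  (err) => err.stack
);

stackPromise.then((stack) => console.error(stack));
```

If handleRequest returned the promise from a plain .then() chain instead of awaiting it, the caller frame would typically be missing from the trace.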
Julián: Oh, Node.js version 12, which is an LTS version right now. So if you upgrade your Node.js projects from 10 to 12, you are going to get the benefit of having more visibility around the stack traces because of async stack traces. What other recommendation or best practice do you have to give?
Ruben: Normally people are wrapping a lot of code in a try-catch and rethrowing errors, and this is normally not the best way to do it. Instead, just think about what your application is built of. You normally have different layers in your application. One layer is, for example, incoming requests, such as a REST API or maybe a GraphQL API, and you want to validate all the incoming data: whether it's valid data, whether the person is allowed to access some data. You should throw the errors wherever something went wrong or where you validated, and you should then handle those errors in one single function for that layer. That's the incoming requests. You can do something similar with outgoing requests, or when you are making a remote procedure call or calling a different API. You want to make sure that you handle the errors only in that one spot and not spread out in different parts of the application. You want to handle all database-related errors only in one spot for the database or for the caching layer, and that reduces the surface where something can go wrong.
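The layering Ruben describes could be sketched roughly as follows for the incoming request layer. This is an illustration under assumptions, not code from the workshop: the validators, the request shape, and the response shape are all hypothetical. The point is that the validators only throw, and a single function per layer catches.

```javascript
// Hypothetical validators: they throw where the problem is detected
// and never catch anything themselves.
function validateBody(body) {
  if (typeof body.name !== 'string' || body.name === '') {
    throw new TypeError('"name" must be a non-empty string');
  }
}

function checkAccess(user) {
  if (!user.isAdmin) {
    throw new Error('user is not allowed to create projects');
  }
}

// Hypothetical business logic for one endpoint; no try-catch in here.
function createProject(request) {
  validateBody(request.body);
  checkAccess(request.user);
  return { id: 1, name: request.body.name };
}

// The single error handling function for the incoming request layer.
function handleIncomingRequest(request) {
  try {
    return { ok: true, data: createProject(request) };
  } catch (err) {
    // One spot to log, map, and report every error from this layer.
    return { ok: false, error: err.message };
  }
}

console.log(handleIncomingRequest({ body: { name: 'demo' }, user: { isAdmin: true } }));
console.log(handleIncomingRequest({ body: { name: '' }, user: { isAdmin: true } }));
```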
Julián: And what to do, for example, with HTTP errors on a regular API? If we are writing Node.js servers and we need to properly return HTTP errors and status codes, what can we do to improve that situation? Are there patterns that we can implement here?
Ruben: When you throw such an error, it's very expressive. Everyone who reads a new NotFoundError would understand what is going on in your code, and you would not have to worry about adding the property for the status code anymore, because that's part of the error that you are already throwing; it's part of the class. So when you later on check in the error handler, the abstraction where you now want to send back the information that something went wrong to the user, you will just check if it is an instance of a user-facing error, because they all inherit from it and we only need a single check. So we know, okay, those are all of the right type, and we access the correct property, which would be the status code, and that's what we send back automatically. It's all there. That's a nice abstraction. It makes the code very simple, very small, and understandable.
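A sketch of that class hierarchy might look like the following. The class names follow the ones mentioned in the episode (a user-facing base error, NotFoundError, NotPermittedError); the concrete status codes, the toResponse helper, and the response shape are assumptions made for the example.

```javascript
// Base class for every error that may be reported back to the user.
class UserFacingError extends Error {
  constructor(message, statusCode) {
    super(message);
    // new.target is the class that was actually instantiated,
    // so subclasses get their own name without extra code.
    this.name = new.target.name;
    this.statusCode = statusCode;
  }
}

// The status codes below are assumptions chosen for this example.
class NotFoundError extends UserFacingError {
  constructor(message = 'Not found') {
    super(message, 404);
  }
}

class NotPermittedError extends UserFacingError {
  constructor(message = 'Not permitted') {
    super(message, 403);
  }
}

// Hypothetical error handler: one instanceof check covers all subclasses.
function toResponse(err) {
  if (err instanceof UserFacingError) {
    return { status: err.statusCode, body: { error: err.name, message: err.message } };
  }
  // Anything else is unexpected; do not leak its details to the user.
  return { status: 500, body: { error: 'InternalError', message: 'Something went wrong' } };
}

console.log(toResponse(new NotFoundError('user 42 does not exist')));
console.log(toResponse(new Error('database connection lost')));
```

Adding a new kind of user-facing error then only takes a new subclass; the handler does not change.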
Julián: And also, if you have a very good logging strategy in place, you are going to get specifically the type of error and exception that happened in your application instead of the regular Error object, which a lot of people just use. They say throw new Error because it's the easiest way, but it is not giving you a lot of context and information just by giving an error message. Maybe it's not the best thing to do. Is there any other recommendation that you are giving in your workshop, or do you think that we covered pretty much the most important ones?
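To illustrate the logging point, here is a small sketch of how properties attached to an error class can flow into a structured log line without extra work at each call site. The toLogEntry helper, the log entry shape, and the 403 status code are hypothetical.

```javascript
// A standalone example class; the status code is an assumption.
class NotPermittedError extends Error {
  constructor(message) {
    super(message);
    this.name = 'NotPermittedError';
    this.statusCode = 403;
  }
}

// Hypothetical helper that turns any error into a plain object for logging.
function toLogEntry(err) {
  return {
    level: 'error',
    // Own enumerable properties such as statusCode come along automatically.
    ...err,
    // message and stack are not enumerable, so copy them explicitly.
    name: err.name,
    message: err.message,
    stack: err.stack,
  };
}

const entry = toLogEntry(new NotPermittedError('user 7 may not delete project 3'));
console.log(JSON.stringify(entry));
```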
Ruben: Of course there's more, for example using utility functions like util.promisify. In general, working with promises is much, much more difficult than a lot of people think, also when using async/await. There are a couple of pitfalls that are not obvious, and those are partially coming from the spec itself. It's problematic to work around them. It is possible in multiple cases though. So what do we not want to do? We do not want to promisify a callback-based API on our own. Instead we should use something like util.promisify, a core Node.js functionality that you can use to do that, because building a promise by hand, like using new Promise, is very difficult to do right, and sometimes you might have some faulty code that ends up going unnoticed because there is a dead zone for the code execution in the promise constructor. It's a little bit difficult to explain it in detail without showing some code.
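As a small illustration of the util.promisify recommendation, the sketch below wraps a callback-based function that follows the Node.js (err, value) convention. The fetchGreeting function is hypothetical and stands in for any such API. The example only shows the recommended usage; it does not try to reproduce the constructor pitfalls Ruben alludes to.

```javascript
const util = require('util');

// Hypothetical callback-based API following the (err, value) convention.
function fetchGreeting(name, callback) {
  setImmediate(() => {
    if (typeof name !== 'string') {
      callback(new TypeError('name must be a string'));
      return;
    }
    callback(null, `Hello, ${name}!`);
  });
}

// Let Node.js core build the promise-returning version.
const fetchGreetingAsync = util.promisify(fetchGreeting);

async function main() {
  console.log(await fetchGreetingAsync('Ruben'));
  try {
    await fetchGreetingAsync(42);
  } catch (err) {
    // The callback error arrives as a normal rejection.
    console.error(`${err.name}: ${err.message}`);
  }
}

main();
```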
Julián: Yeah, but definitely we are going to be sharing some code, and we're going to be sharing Ruben's workshop in the resources we always put at the end of the episode, so I would recommend you go and take a look and play with all of the different examples that are in the workshop. And we will also be sharing the previous presentations that Ruben has given at other conferences about this specific topic, because it's very important, especially if you are running Node.js applications. Error handling is one of the most challenging parts in Node, and if you do it right you are going to have more peace when you are running your application in production.
Ruben: It's not the only utility function. So another downside with promises is that they will keep your code or your application running even in case of an error. Let's imagine you had a regular callback-based API before and you refactored it to async/await. If there had been a programmer error before, it would have thrown an uncaught exception, but due to using async/await, it would now end up as an unhandled rejection, and an unhandled rejection is not going to crash the application by default. It is also only there on the next tick to be detected. So it's an asynchronous operation in general, and we do not want to continue running the application in those cases either, at least when you're on the server side; the front end is a little bit different. The reason we do not want to continue running the application is that you might have a memory leak or a broken state, and this could end up in a really bad situation later on.
Ruben: So mostly we are having a cluster where you, for example, use Kubernetes, and your service, when it's crashing, is going to be replaced. So everything's going to be reset to the defaults, and you don't have to worry about broken state. You don't have to worry about memory leaks anymore. So what can you do to actually solve that? It's very simple. Node core supports a flag for this; in the Node 10 line it is only supported in Node 10.17, which is the latest version at the moment. The flag is called --unhandled-rejections, and there are three different modes you can choose. I really recommend using strict mode, which will then end up crashing the application even in case of an unhandled rejection.
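On the command line, the flag is passed as, for example, node --unhandled-rejections=strict app.js (app.js being a hypothetical entry point). The sketch below makes the effect observable by starting two child processes that run the same one-line program, once in strict mode and once in warn mode, and comparing their exit codes. It assumes a Node.js version that supports the flag. Note that from Node.js 15 on, the default behavior without the flag also changed to throwing on unhandled rejections.

```javascript
const { spawnSync } = require('child_process');

// One-line program whose promise rejection nobody handles.
const script = 'Promise.reject(new Error("boom"))';

// Runs the script in a child Node.js process with the given mode
// and returns that process's exit code.
function exitCodeWithMode(mode) {
  const result = spawnSync(
    process.execPath,
    [`--unhandled-rejections=${mode}`, '-e', script],
    { encoding: 'utf8' }
  );
  return result.status;
}

const strictExit = exitCodeWithMode('strict');
const warnExit = exitCodeWithMode('warn');

// strict turns the rejection into an uncaught exception (non-zero exit);
// warn only prints a warning and lets the process finish normally.
console.log({ strictExit, warnExit });
```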
Julián: And that is if you don't have an unhandledRejection handler in the process module checking that event. I'm also going to share in the description of the podcast the presentation that I gave here at NodeConf EU, which was pretty much talking about that specific part of error handling: when the process is shutting down, when there is an uncaught exception, what to do. And definitely we agree, and that's a recommendation in the community: it's better to restart the process and start something fresh than to try to recover from a programmer error, because it can leave the application in a very bad state. Do you have anything to add, or any invitation or recommendation to the people that are listening to us?
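A minimal sketch of the "log, then exit and let the supervisor restart you" approach, using the real process events 'uncaughtException' and 'unhandledRejection'. The installFatalHandlers helper and its parameters are hypothetical. The emitter and the exit function are injectable here purely so the example can be run without terminating the current process; in a real service they would default to process and process.exit.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: log the fatal error once, then exit with a failure
// code so a supervisor (for example Kubernetes) can start a fresh process.
function installFatalHandlers(proc = process, exit = (code) => process.exit(code)) {
  const onFatal = (kind) => (err) => {
    console.error(`${kind}:`, err);
    exit(1);
  };
  proc.on('uncaughtException', onFatal('uncaughtException'));
  proc.on('unhandledRejection', onFatal('unhandledRejection'));
}

// Demonstration with a stand-in emitter instead of the real process object.
const fakeProcess = new EventEmitter();
const exitCodes = [];
installFatalHandlers(fakeProcess, (code) => exitCodes.push(code));
fakeProcess.emit('unhandledRejection', new Error('boom'));
console.log(exitCodes);
```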
Ruben: Mostly, try to think about errors while building the code or while writing the API in the first place. Also always make sure to test all edge cases of the API, and when writing your tests, check for error cases. That's often not done, and it's one of the reasons why a lot of code in production receives errors that are otherwise not there, errors that could have been caught during development already. So just try to ease your life by investing a tiny bit more time upfront.
Julián: Oh yeah, that's a very, very good recommendation. Well Ruben, thank you very much for your time and for your knowledge. It's very, very good, and I hope a lot of our listeners that are working with Node.js are going to get a lot of information from this talk. And even if they're not working with Node, pretty much most of the recommendations around error handling are going to apply to other platforms, like having a good testing strategy, a logging strategy, and being able to work on an error-oriented API to have proper errors. So it's something that is good to apply to other technologies. So well, this is the error handling episode with Ruben Bridgewater, and let's see you on the next one. Thank you very much.
Ruben: Thank you as well.
A podcast brought to you by the developer advocate team at Heroku, exploring code, technology, tools, tips, and the life of the developer.
Developer Advocate, Heroku
Senior Software Architect, Freelancer