97. The Challenges of Bespoke Solutions in a Regulated World
Hosted by Greg Nokes, with guests James Maidment and Ammar Akhtar.
Not every tech company gets to move fast and break things. For companies operating in heavily regulated spaces, like banking, efforts to modernize legacy systems must be made carefully. Yobota explains how they're able to deliver custom APIs and solutions to financial institutions with guaranteed uptime and functionality.
Greg Nokes, a Master Technical Architect with Heroku, interviews two members of Yobota, a banking systems provider: Ammar Akhtar, its CEO and co-founder, and James Maidment, the head of Technical Operations. The financial industry is heavily regulated. It was not until about 2016 that regulators in the UK (where Yobota is based) gave favorable guidance for vendors to operate in the cloud. The banks that use Yobota as a service provider are audited by the Financial Conduct Authority. As part of that audit, every single deployment performed over a year is examined. Regulators select a random set of them, and Yobota has to demonstrate that they know who was involved in the release, and precisely which services were affected. Thus, their entire shipping process revolves around meeting these regulatory goals. They're an integral part of the company, just as data security and uptime availability are.
The platform is designed both to evolve quickly and to perform safe, observable deployments quickly. Unlike other startups, Yobota has decided to invest in a sysadmin team, in order to split the organization between people who develop features and people who manage their compliance. For example, as the company grows, they've found that active hands-on management of permissions has been a valuable investment. Different groups need access to staging environments versus production environments; and, with over 300 apps on multiple dynos, access to resources needs to be carefully configured.
This seemingly slow shipping process is advantageous for two reasons. First, meeting compliance is the law, and skirting it has tremendous consequences. But second, and more importantly, Yobota also provides non-production environments for their engineers to develop in. They're able to give developers the ability to experiment with their platform in a safe way; should they choose to advance a feature into a production environment, a different team is able to address what needs to be done to meet the needs of that regulated environment. James advises other companies working in these sorts of industries to consider compliance integral to the way their systems operate, and to think about concerns upfront, in advance of working on any feature.
Links from this episode
- Yobota is a core banking platform that allows financial institutions to launch innovative products in a fast and reliable way
Greg: Hi. This is Greg Nokes, Master Technical Architect with Heroku. Today I'm going to be talking with Ammar and James and we're going to be talking about some of the difficulties of delivering bespoke solutions in a regulated world. So Ammar and James if you could give a little bit about yourselves.
Ammar: I'm Ammar Akhtar, the CEO and one of the founders at Yobota, and we are a core banking systems vendor. We've been going for around four and a half years, based out of London, and have in that time built a completely new core banking engine, which lets clients of ours set up financial products, distribute them through online channels, ultimately onboard customers and run pricing calculations and life cycling routines on them.
James: My name's James Maidment. I'm Head of Technical Operations at Yobota. I joined about three years ago, just a week before we launched live with our first client. And I'm focused basically on building the teams and processes that mean that we're live in production without issues or downtime.
Greg: As you're running a banking platform, are you doing complete end-to-end banking? From customer acquisition through running the person's checking account, letting them log in and look at it?
Greg: In the US, that would be a highly regulated industry. I'm sure it is worldwide as well.
Greg: How does that impact your design decisions? How does that impact your technology decisions?
Ammar: The timing of how we've done this has been pretty fortunate in certain senses particularly in the UK and then Europe ultimately from 2015, 2016 onwards where the regulators have given some pretty favorable guidance on the use of cloud technologies and the use of outsource vendors in that space. And because of that, when we set up and we were a very small team for the first couple of years including when James joined us, we were looking at how we can safely build a platform that's got very decent control of the underlying infrastructure that the code itself runs on, but also know what changed when it's changed.
Ammar: Have good integrations into other controlling technologies that can help us automate various aspects of our overall release processes and build processes and so on and then Heroku was an obvious candidate for that just based on, it gave us a lot of flexibility. Where that then leads us is kind of being able to illustrate to any of our clients at any point in time that yeah, this stuff is still fit for purpose and we're still using it and essentially delivering our service in the way that we said we would despite all the changes in the underlying software that get developed and shipped every week.
James: The critical part of that is around the audit side of being in a regulated industry. The banks that use us as a service are audited by the Financial Conduct Authority in the UK, and as part of the audit that they have to go through, we go into that audit as a service provider. We need to do things like, for instance, look at every single release we've done over the last year. They will pick a random selection of those and we then have to demonstrate that we know who performed that release, who approved that release, whether the release worked or not. And so having the platform features to be able to look back across the estate and go, "I can see what happened when," without having to have a large amount of internal paperwork, is very valuable. Particularly as an SME, we don't have a team of 30 people sitting there generating those outputs. We're a very small, quite agile team, and were much smaller when we launched. So that's a critical part for the sort of platform side.
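The audit James describes (a random selection of a year's releases, each needing a known performer, approver, and outcome) can be illustrated with a minimal Python sketch. This is not Yobota's tooling: the `ReleaseRecord` fields and the helpers `sample_for_audit` and `missing_evidence` are hypothetical names chosen for illustration.

```python
import random
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class ReleaseRecord:
    """One deployment, with the facts an auditor asks about (hypothetical fields)."""
    release_id: str
    released_on: date
    performed_by: str
    approved_by: str
    succeeded: bool
    services: Tuple[str, ...]


def sample_for_audit(releases, year, sample_size, seed=None):
    """Pick a random selection of one year's releases for review."""
    in_year = [r for r in releases if r.released_on.year == year]
    rng = random.Random(seed)
    return rng.sample(in_year, min(sample_size, len(in_year)))


def missing_evidence(record):
    """Return the names of audit fields that are empty for this release."""
    gaps = []
    if not record.performed_by:
        gaps.append("performed_by")
    if not record.approved_by:
        gaps.append("approved_by")
    if not record.services:
        gaps.append("services")
    return gaps
```

Running `missing_evidence` over every release as it ships, rather than only over the audited sample, is one way to find gaps before an auditor does.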
Ammar: A lot of that also links into kind of availability and uptime as well, because you have to guarantee that the system will be available. Accounts will get life-cycled in a timely fashion. You're not going to have sort of payments being registered sort of days later or anything like that. So the whole set of choices that we made were cognizant of those sorts of factors as well, that this has to be a modern, always-up, real-time sort of banking engine. That's also actually going to work as a managed service that sort of gets provided to many, many, many regulated clients in diverse regions. It's a huge project in that sense.
Greg: Yeah, I imagine it is. And does having the Heroku platform's attestations, its security, help with that? Is it something that you can then bring in as you're brought in as a service provider? Do you bring in your service provider, and so on and so forth down the chain?
Ammar: Yeah. I mean, there's a few different dimensions in that. Because there are things like GDPR, data regulations and data privacy regulations that have come in recently, where you have to sort of give a level of transparency into the supply chain that you provide to your clients, and everyone is connected that way. You also have to essentially be able to demonstrate that the vendors you're choosing are the right vendors and are going to be able to essentially ensure that everyone remains compliant and everyone remains on the right side of the regulations.
Ammar: And then with financial services being what it is, there's kind of the other element of it, which is, there are so many different providers for specific little things in the overall value chain, like payment processing or ID checking or credit decisioning and all the sort of good stuff that you have to do when you're opening an account or maintaining an account and life cycling it over the time it's open. So you have to do sort of similar due diligence as you're going through there. And the infrastructure choices that those companies make and those teams make are not necessarily going to be the same as yours. So being able to have something in the middle that's stable and that's reliable is really important as well.
Greg: I'm sure that the platform gives you some level of agility, some level to evolve your product in response to customer and customer of customers' requests as well.
James: That's sort of one of the critical things we found about using the platform: don't treat it as effectively a data center in the cloud. Using the facilities like the APIs that are available to be able to do sort of fast, reliable deployments is critical to gaining value out of it. One of the advantages we had was, we spent a long time as a company without having a dedicated central administration team, because we could use sort of the admin sites and the APIs to manage the platform just between the sort of development and support teams. It avoided us having people who were just sitting there doing things like upgrading VMs, which is a huge time sink for a small company.
Greg: As somebody who used to upgrade VMs and used to upgrade servers. I appreciate that comment because that was drudgery, but important drudgery back then. So the platform gives you the ability to evolve quickly and the platform gives you the agility to do safe deployments that are observable and are discoverable. Do you talk to your customers or do you get all your requirements from direct banks that you work with? How do you put features into your pipeline?
Ammar: We're B2B, so we have very limited interaction with our clients' customers. A lot of what we're doing is trying to understand what it is our clients are doing as businesses and making sure that there are clean APIs that allow them to do that, or have a set of features that will essentially match up with their future roadmaps, or kind of give them guidance on when they would be able to do certain things. And that's sort of very much kind of the outward-looking side of it. You've got a business that comes along and they want to do a particular kind of savings account or they want to run a particular kind of mortgage. And we either already support something like that, or we don't, and there's configuration or build that has to be done.
Ammar: And we look at it and work with them as clients. Then there's kind of the other side of it, which James is talking about in terms of the actual sort of controlled change, the spirit of the regulation in the UK, certainly. And that's pretty consistent everywhere, right? And if you can sort of, or it's pretty consistent for all businesses rather, and if you can satisfy for one, you just need to be able to replicate what you're doing for other clients.
Ammar: And there's a lot of custom technology that we've built in order to be able to support that as our business grows and as the granularity of information that we think is sensible to store and share grows. So it's sort of those two things, but there's definitely kind of an element of, "If it ain't broke, don't fix it." Just know what you've got to do and do that well as a sort of bare minimum requirement, not do the bare minimum.
Greg: To change a little bit, just to go back. You mentioned something about how in the early days, you didn't have to have sysadmins that kind of alludes to perhaps now you do have some, and can you talk about if you do, what are they doing? Are they supporting the Heroku platform? Do you have kind of a multi-cloud build out now? What's going on with that?
James: So really one of the major reasons we brought sysadmins in was around more of a sense of separation of concerns. One of the critical bits about the compliance side is being able to split the people who develop from the people who manage. And as we grow as a company, there's a lot more management of things like permissions. You want to make sure that everyone in dev has access to everything up to the UAT set of environments, maybe has something to get access to a pre-production environment, but doesn't have access to a prod environment. If people move within the business, that needs to be managed. And again, it's not something you really want to be having to do by hand. So really we've been sort of taking the systems management side of things and giving that to a specific team, because we're certainly large enough at this point that that's becoming a full-time job.
Ammar: We're also sort of doing certain things directly in AWS rather than at the Heroku layer, which kind of proxies down onto AWS as well. As we essentially have gone into having multiple clients running and a great many more test environments and so forth, that just becomes its own beast that needs managing, particularly as we're trying to do cleverer things with stuff like Terraform and sort of infrastructure-as-code technologies like that.
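The tiered access James describes (developers up to UAT, some with pre-production, none with production) could be expressed as a small policy table rather than managed by hand. The sketch below is only an illustration: the role names and the `MAX_ENV_BY_ROLE` mapping are assumptions, not Yobota's actual roles or any Heroku API.

```python
from enum import IntEnum


class Env(IntEnum):
    """Environments ordered from least to most controlled."""
    DEV = 1
    UAT = 2
    PREPROD = 3
    PROD = 4


# Highest environment each role may reach (hypothetical roles).
MAX_ENV_BY_ROLE = {
    "developer": Env.UAT,
    "senior_developer": Env.PREPROD,
    "operations": Env.PROD,
}


def can_access(role, env):
    """True if the role's ceiling is at or above the requested environment."""
    ceiling = MAX_ENV_BY_ROLE.get(role)
    return ceiling is not None and env <= ceiling
```

Unknown roles are denied by default, so someone who moves teams loses access until the table is updated deliberately.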
James: And we're going to use quite a large number of environments at the moment. We're probably looking right now at over 300 apps with multiple dynos within those. And so that's just quite a lot of stuff to manage at this point.
Greg: Yeah. That's a pretty big deployment. So do you use Terraform to manage your Heroku environments? So you can kind of take a step back from that or pipelines or how do you manage 300 apps?
Ammar: Well, they're sort of split across lots of different things. We have distinct environments, and any one environment is probably no more than a dozen or a couple of dozen different Heroku applications. So those are sort of managed pipelines, and the rest of it is then lots of different developer environments, test environments, a bit of dev infrastructure as well that people use. And that's where Terraform comes into it, and increasingly so as we sort of pop a few things up and down. It's quite varied. I think one of the things is we're not particularly stuck to one way of doing things as we've gone, particularly the further away you get from that sort of controlled production estate. There's a lot more experimentation and openness in terms of what's the best way of solving a particular problem going forward.
James: And that's something that we give development sort of direct access to do. So if they want to spin up an app and try something out, that as long as they're outside of the regulated environment side of things, we're perfectly happy for them to go in and to go do that. It doesn't have to come to the central DevOps team.
Greg: That's really cool. So you're giving the developers the ability to experiment in safe and sane ways with guide rails, but then also leveraging those same sort of guide rails in your production environments with a different team to go ahead and manage the production environments because of the separation of concerns required by the regulated environment you're working in.
Greg: So how do you think about resilience and how do you think about building an app on top of Heroku that is highly resilient, highly available? The Heroku platform itself is pretty darn good for that with all the self-healing stuff built into it, but have you built an additional layer on top of that to go ahead and help with that? Or what's your thought process around that?
James: So we have some tools in place to do things like load management. So the volumes coming through on the platform and sort of bringing them up, there's certainly been a situation where we're looking at basically no single points of failure. So everything's running at least sort of two systems, so we can do agreeing. One of the big areas we've been doing on top of that is the sort of monitoring side of things, so that we know when we're either not seeing processes running as expected, or we are seeing more work than expected. Getting that out to the support team has been hugely important. So we have a lot of metrics coming out of the system and a lot of alerting on top of that in order to jump on things.
James: And part of that has been pulling stuff from things like the Heroku APIs to look at sort of base metrics, but quite a lot of it has been metrics that come out of our platform itself. So how many payments are we processing a second, or are batch processes running? As mentioned earlier, we're 24/7. So if it goes wrong at 1:00 AM and we're dealing with this sort of an end-of-day process for a banking system, that's actually a critical time, between sort of midnight and 4:00 AM. It's not a quiet time in any way whatsoever. So you need to know if something's gone wrong at that point.
Greg: Yeah. It reminds me of my early days building out AWS infrastructure. So many people ignored metrics, ignored feeding information back out into some sort of observability system.
Ammar: Yeah, and I think there's kind of also discovering what those metrics need to be. We've been on a real journey on that. And from an engineering standpoint, you think about things in a certain way, and that's why you sort of push that and the system runs in a particular way, and then as it comes under sort of more strain or torque, just because of the real world hitting it, you suddenly start realizing you don't have any sort of leading indicators, and you've got to start figuring out how to get those out and how to analyze them in a way that they can be useful.
Ammar: And that's definitely been an iterative process that we've gone through. I think one of the other things we're finding is, as we have different clients who have actually different businesses and different usage patterns and different load patterns on our estate, we don't always need the same alerting and we don't always need the same sort of view of the metrics. So kind of again, understanding what the superset is, what can be cherry-picked, it's a scientific art in many ways, I guess.
James: There's also an element there of, as Ammar mentioned earlier, because we're integrating with a large number of other third parties for things like payments, being able to know when the people downstream of us have problems is very important. In some cases, it'll be a third party that's actually managed by the client, but we will be talking to them and saying, "This other third party is having problems," rather than them coming to us. So being proactive on that, and just having a large number of them where we're monitoring a lot of stuff, not necessarily with an ability to fix it.
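As a rough illustration of the alerting James describes (too little work, or more work than expected, with the midnight to 4:00 AM window treated as critical), here is a minimal Python sketch. The thresholds, the severity labels, and the function names are assumptions made for the example; they are not taken from Yobota's monitoring stack.

```python
from datetime import datetime, time

# Assumed end-of-day window; the conversation mentions roughly midnight to 4:00 AM.
WINDOW_START = time(0, 0)
WINDOW_END = time(4, 0)


def in_critical_window(ts):
    """True if the timestamp falls inside the end-of-day processing window."""
    return WINDOW_START <= ts.time() < WINDOW_END


def check_throughput(payments_per_second, expected_low, expected_high, ts):
    """Return an alert dict when throughput leaves the expected band, else None."""
    if payments_per_second < expected_low:
        reason = "less work than expected"
    elif payments_per_second > expected_high:
        reason = "more work than expected"
    else:
        return None
    # Hypothetical escalation rule: wake someone up only inside the critical window.
    severity = "page" if in_critical_window(ts) else "notify"
    return {"reason": reason, "severity": severity, "at": ts.isoformat()}
```

The expected band would differ per client, which matches Ammar's later point that different usage patterns call for different alerting.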
Greg: Going from metrics, it seems fairly easy to move into performance. How do you measure performance and how do you tune for performance? How do you look at growing the performance of your application? Do you look at that as more requests per second or faster requests per second or both or what?
James: So I guess there's two very distinct sides to that. And particularly with the banking application side of things, there's the performance for an individual customer: they're coming to the site, they're clicking through various forms, they're making their requests, executing credit checks, and each of those individual steps you want to be as fast as possible so that the customers get a good experience. You're looking then at things like response times, and that is quite sort of individual. But then on the other hand, we have things like payment processing, which tends to be very spiky. For instance, end of month is often a big time for that. And so you might be looking at 10 to 50X your normal load for those specific areas. So it's sort of in two areas there. One of which is pretty much sort of customer experience testing. And the other one is more bulk batch load testing.
Ammar: Yeah. There's sort of other angles on that on batch jobs as well, where you're running a portfolio-wide calculation and you're trying to reduce the actual domain of information that you're pushing so that a calculation can be performed in a kind of reductive way. And you haven't quite figured out where the limits are for that, but you hit those things from time to time as well. And you've got to look at how you actually make sure that that job can be done as quickly as possible, not because it is time sensitive, but actually because there's a queue of other things that are sitting behind it that will just never happen because this thing can't run right now.
Ammar: And that's particularly when you build parts of the system that have been running fine for like years, right? And then you scale and scale and scale and suddenly, like, realize that actually there's something now that needs to be revisited, because one of our main sort of operating assumptions has changed. It's one of those quite challenging aspects of everything, particularly on large distributed systems like this.
Greg: Yeah. I've worked building backend jobs and I've worked building frontend applications. Sounds like you guys have the best of both worlds.
Ammar: We have the best of most worlds. We've also had some sort of custom patterns that we've had to come up with as well. We have our sort of main operational ledger database, which very early on we had to sort of take the decision of actually managing directly in AWS rather than running through sort of Heroku Postgres, just because there were some conversations we had with the architects, and there was some limitation on the write-ahead logs, which meant that, in a once-in-a-blue-moon sort of thing, there might be a chance that we could have a window of, like, a loss of transactional integrity in the way that we were needing to use the database. And so we kind of had to work on a number of things where we had an application tier within the Heroku estate and a resilient data tier sitting elsewhere, in AWS direct, and figure out what the right buildpacks were so that we could get dynos running correctly with the appropriate tunneling in place and all that sort of stuff.
Ammar: That sort of all adds to the whole sort of question on resilience. Because you've got to be able to make sure that you never sort of lose the connection to your database either, right? As you're going through that whole pattern. And we ended up having to do something that was just a little non-standard, and then that took us down the path that we've since been on. And from that operational ledger, then we sort of publish information that sort of becomes work for other services to do. And there are some pretty neat resilient patterns that we built, which mean that it's very difficult, effectively impossible, for work to get sort of repeated or done out of sequence in how that stuff's architected, which is pretty neat given that we're essentially running on two separate clouds within what we do as well, but actually there's that sort of full integrity there. And it was really a sort of vast design exercise to make sure that that was done correctly and safely.
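Ammar does not say how Yobota prevents work from being repeated or done out of sequence. One common way to get that property is to number each published item and have the consumer drop duplicates and buffer anything that arrives early. The sketch below shows that general technique only; `SequencedConsumer` and its sequence-number scheme are assumptions for illustration, not a description of Yobota's architecture.

```python
class SequencedConsumer:
    """Handles each numbered work item exactly once, in sequence order."""

    def __init__(self, handler, first_seq=1):
        self._handler = handler
        self._next_seq = first_seq  # next sequence number we are allowed to run
        self._pending = {}          # items that arrived ahead of their turn

    def receive(self, seq, payload):
        """Accept a delivery; duplicates are ignored, early items are buffered."""
        if seq < self._next_seq or seq in self._pending:
            return  # already handled or already buffered: never run twice
        self._pending[seq] = payload
        # Run every item that is now contiguous with what has been handled.
        while self._next_seq in self._pending:
            self._handler(self._pending.pop(self._next_seq))
            self._next_seq += 1
```

For example, delivering items 2, 1, 2, 3 runs the handler for 1, 2, 3 in that order and ignores the repeated 2. A production version would also need to persist `_next_seq` so the guarantee survives a restart.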
Greg: So in the current times to shift gears just a little bit with everything that's going on in the world, how has something like COVID-19 impacted your business?
Ammar: You know, we're sort of a cloud-based business in that sense. We don't really own anything. People have got laptops and monitors and stuff, but that's sort of it. So in a sense, most of our engineering team were really, really happy working from home when COVID sort of kicked in, and as this whole lockdown scenario kind of progressed, it just sort of became a normal way of working. So in that sense, we've been okay. And I think lots of technical product companies like ourselves are probably in a similar boat. I think where we see a difference is our production volumes are obviously different, because demands in consumer finance are different now. Lenders' attitudes to risk are different as well. And that all sort of, clearly, adds up. There's also just a difference in essentially what use cases become higher demand in this time, which is simply that there's a lot more around sort of treating customers fairly and credit forbearance and things like that.
Ammar: It's been quite an exciting year in that sense and, but I think a lot of our sort of control processes and the general policies that we have, they're things that we've been able to sort of continue to demonstrate adherence to, which is really, really important from a banking controls and banking regulation perspective because of the way those things are set up, despite people being at home rather than being in an office. And that I think has been one of the highlights of the year in the sense that the majority of what we've created, not just in terms of software, but the overall sort of technical architecture, everything we've done.
Greg: So yeah, it sounds like, while no one could prepare for this, the platform you built and the systems you built were resilient enough and were agile enough that you could land on your feet and running, even to go ahead and make the changes you need to make to respond to this sort of thing.
Ammar: Yeah, yeah. Completely. And it's just being able to sort of deal with the unkind things the real world throws at you. We're essentially an infrastructure provider in many ways. Core banking is infrastructure for banking businesses and lending businesses, and the sort of transaction processing we do and the position keeping that we do is not something you can kind of just turn on and off, right, because it's no longer convenient or it's no longer sort of the direction that a business wants to head. Once you're sort of in, you're in. You have to be able to sort of maintain everything that's out there whilst adding onto it in a way that's just very sort of aware of the reality that you're in, in 2020 or beyond, right? And last year was very different compared to this year in basically every possible way. So I don't think anyone really got settled or comfortable into just a nice sort of working pattern.
Ammar: So things changed.
Greg: So do you have any advice for anyone, looking to build a regulated application or custom solutions with custom requirements, anybody sort of undertaking a similar journey to yours or just building anything in general?
Ammar: I think if you are in a B2B space, like we are in, an enterprisey B2B space, it's really important to make sure you're clear on the problem that you're solving. And as early as you can, have some sort of early-stage client or marquee client so that you can say, "Look, these guys are using it. We've built this to solve this problem, and these guys are using it, or these guys are going to use it." And that's not always easy to do, because companies, particularly regulated companies, are risk averse when it comes to new products and new technologies, but that's kind of the first step. Everything else is sort of, if you're good at solving the sorts of problems that you're setting out to solve, then you will have, I think, a really solid chance of succeeding at it. But I don't think there's like a magic recipe for it. I think you are just going to get knocked around quite a lot in the process. We certainly have, right James?
James: Very much so. I mean, one of the critical things that's come up for me is, particularly on the regulated side where you are looking at things like compliance being a critical part of the system, don't try and bolt it on afterwards. Don't sort of build your, here's a lovely way of like taking out a loan or opening an account, and then go, "Oh, and now we need to work out how to prove that that account was a valid account." You need to have it right in your data model, right at the start, so that it all bubbles up through the system rather than layering on top. In much the same way as security needs to be a mindset integral to the system, compliance works in the same way. You will find it very hard to do it after the fact if you haven't thought about it upfront.
Greg: That's a great point. That's a fantastic point. And I guess that goes back to that truism of know what problem you're solving. And it sounds like you might be solving for multiple problems. One would be, how to provide this product to your customers, but the other one is how to do so in a secure and compliant manner. And having that in your head from the beginning while you're designing will help you deliver that product more efficiently over time.
Ammar: Yeah, absolutely. I mean, your ultimate responsibility is around making sure your clients are successful, right? So if you're not able to provide them with something that sort of meets the full set of their base minimum needs, then you're not actually going to set them up for success, and your success as a vendor or as any company is entirely linked to your clients' success and your customers' success. So you've got to have that whole sort of picture in mind. I remember, in our kind of early days, when there was the other James, who's my co-founder, James King, a lot of it was sort of software prototyping, right. And thinking, right, okay, we can functionally do this, we can functionally do that. We can write an interest rate calculator that'll allow us to reshape accounts lots of different ways, or whatever it was.
Ammar: And we often had to just say, "Okay, and now let's just stop, eyes on the prize. What are we actually trying to solve for? Let's just set it down." And it's a hard discipline to have. And it's doubly hard when actually you see lots of different market opportunities as well, and say, "Okay, as a product, as a platform, we could very easily solve for this particular sort of payments pattern that people are starting to use more and more. We could easily provide an interface that lets companies lend to that market segment in a lot easier way." And you just can't do all of that all at the same time, because you might not be at that point of being good enough on everything else with the clients you're trying to serve. And I think that's the sort of key discipline that you need to have.
Greg: Well, thank you so much for your time today, Ammar and James, and it was wonderful talking to you and I hope we get to chat again.
Ammar: Great. Thanks very much, Greg.
James: It's great to be here. Thanks Greg.
A podcast brought to you by the developer advocate team at Heroku, exploring code, technology, tools, tips, and the life of the developer.
Master Technical Architect, Heroku
Greg is a lifelong technologist, learner and geek. He has worked at Heroku for over 8 years.
Head of Technical Operations, Yobota
James has been creating, managing and advising teams in regulated industries for 20 years. He likes the outside.
Founder & CEO, Yobota
Ammar has worked on tech & business teams in tier one banks for 10+ years. His experience spans development, product creation, leadership & mentoring.