47. Working with an Event-Driven Architecture
Hosted by Robert Blumen, with guest Alexey Syomichev.
One factor to consider when designing an application is how to represent information about events that occur. One traditional way is to use a database: as new events occur, a row in a table is updated. Another approach is to use an event log: every event that occurs is retained in a serialized format, so that the app's state at any point in time is preserved.
On this episode Alexey Syomichev talks with host Robert Blumen about enterprise architecture organized around an immutable event log. The discussion covers the notion of events - what are they, what do they contain, event data, event schema, event formats; producers and consumers of events; the event log; time stamps versus event log time; immutability; durability of event logs; organizing enterprise architecture around the event log; the log-centric approach to integration compared to point-to-point; in place updates versus append only; extracting event logs from an RDBMS; reconstructing RDBMS change logs as business events; the event log as the system of record; making the event log highly available; scaling the event log.
Robert Blumen is a DevOps engineer at Salesforce. He's interviewing Alexey Syomichev, a software engineering principal architect at Salesforce, with over 25 years of experience. Their conversation begins with what constitutes an event (an immutable record encoding something that happened to an app). They then move on to describe consumers of those events, whether they're internal to an app or external to other services and business departments. Finally, they discuss ways in which you can store that event, either in a database or a running event log.
Databases and event logs each have their strengths and weaknesses, which Alexey enumerates. Event logs, for example, are easier to write to; but databases are easier to query for information. In general, Alexey believes that it can be important to organize your larger systems around an event log.
The conversation concludes with a discussion on how to use an event log in practical ways. First, you'll need to decide the schema to use when representing an event. State transitions are another reality to consider. If multiple events representing the same action come in, you need to make sure that the event log is atomic. Finally, there's a question of precisely how many events to retain, how far back you store your application's state.
Links from this episode
- OK Event Log (the architecture files episode 3) by Ian Varley
- Apache Kafka
- CQRS Pattern by Martin Fowler
- Event-driven architecture
- How Apache Kafka Inspired Our Platform Events Architecture by Alexey Syomichev
- About event-driven architecture
- Protobufs and Apache Avro are two ways in which you can serialize an event.
Robert: Welcome to the Code[ish] podcast. I'm Robert Blumen, a DevOps engineer at Salesforce. I have with me today Alexey Syomichev, a software engineering principal architect at Salesforce with a focus on integration and platform connectivity. Alexey, welcome to Code[ish].
Alexey: Thank you, Robert. Glad to be here.
Robert: Would you like to tell the listeners anything else about yourself?
Alexey: Oh, well, you've introduced me pretty well. My name is Alexey. I've been with Salesforce for about seven years. My overall industry experience is about 25. I'm really passionate about the integration capabilities of the Salesforce platform. I'm a big fan of the platform overall and I think making it easier to integrate with enterprise software is a big goal that I have pursued for the last seven years.
Alexey: An event is really a little piece of data that describes something happening in the real world. The data could encode some actions. It could encode state changes. It could encode measurements. Really it could be anything. The event itself is usually immutable, so when something has happened, it's recorded. There's an event and then an event system will place it into some kind of a transport where the event will be available for consumption.
Robert: In a system like Salesforce, what are some of the sources of events?
Alexey: Oh, for state changes, obviously, we have a database. Salesforce stores a lot of data for our customers. When that data changes, we do generate events. Another source of events is all kinds of telemetry that we collect about the functioning of our system. Those measurements are events that are available for analytic systems and monitoring systems.
Alexey: Also, there are events that are emitted in the form of application logs and they're mostly consumed by people or other kinds of analysis systems, but application logs are kind of like a degenerate case of events in general.
Robert: So you partially answered this. Let's focus on business events, the first category of events you talked about, things going into a database. Who are the consumers of those events?
Alexey: This is an interesting and a deep question. Normally we think about a system that has a database in the middle. Any change to the database produces an event that may be interesting to a system that integrates with that state. The other system usually reasons about the state of the overall integration in the terms of that same database. So for example, we may have a bunch of accounts recorded in the Salesforce database and an external system that runs in the customer's network that needs to replicate those changes and maintain a local replica of the state. So in this case we have a synchronized set of data sets that span a distributed system.
Alexey: Another way to think about this is when we consider events the primary source of information about what's happening in the system, what's happening with the business. And then each system or each microservice that participates in this complex system will maintain its own persistent footprint and will lay out the data the way it wants.
Robert: If I understand, then you'd have different systems that are performing different services around the same business transaction. And they all need to know when something happened, so you're only going to record it at the point of origin, but other entities, accounting, security, different business systems, need to know that the customer, say, upgraded their account status or a relationship was terminated. Am I getting that right?
Alexey: Yeah, that's correct. For example, an audit system may want to know when the change in the account happened, but it will not have to record the entire state of the account. The fact that the account has been upgraded is important for the auditing system and it will store data consistent with the task at hand.
Robert: We're going to be moving on to the role of an event log in architecture. The next stepping stone would be the concept of an event log. Can you explain that?
Alexey: An event log is simply a way of putting events into a persistent store in a time-ordered fashion. An event log organizes events in a strictly append-only way. This is a very powerful storage abstraction that allows systems to reason about shared state in a very consistent way.
Robert: What would a system look like that's not append only? What does that mean for it to be append only?
Alexey: Well, the alternative is if you think about the database. The entire table... Let's assume the database is very simple and consists of one table, but the table consists of rows and each row has its own state. A write can happen at any part of the table. It's usually identified by some kind of a key, and the write can go to any place in that state.
Alexey: Append only means that the data can be written only at one side of the data structure. The data structure itself is conceptually much simpler. A system that produces data writes at one end, and consumers go from the other end and try to catch up with the producing system.
Robert: You're choosing between, I could have a database table and write all the events there, or have an append-only log. What are the advantages? You said simplicity. What does it buy you to have an append-only data structure?
Alexey: It buys a couple of things. In terms of implementation, appending data to one side of a dataset is significantly easier for computer systems to implement in a highly performing fashion. If you think about disk drives, for example, random access to data requires... repositioning the heads of the disk drive. Writing data sequentially at the end of a data set means that the head doesn't have to move far. It can stay where it is and continue writing data and just adjust its position little by little. Well, disk drives are not necessarily used widely today, but the analogy still holds for modern storage systems, caching systems and processing systems. So there is a performance advantage in writing at one side of the data set rather than at a random place.
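To make the append-only idea concrete, here is a minimal in-memory sketch in Python. The `EventLog` class, its method names, and the event fields are hypothetical, chosen only for illustration; they are not taken from the episode or from any Salesforce system, and a real log would persist to durable storage.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class EventLog:
    """A toy append-only log: writes go to the tail, reads go forward."""
    _events: List[Any] = field(default_factory=list)

    def append(self, event: Any) -> int:
        """Write at the tail only and return the offset given to the event."""
        self._events.append(event)
        return len(self._events) - 1

    def read(self, offset: int, max_events: int = 100) -> List[Any]:
        """Return a copy of up to max_events entries starting at offset."""
        return list(self._events[offset:offset + max_events])

    def end_offset(self) -> int:
        """Offset that the next appended event will receive."""
        return len(self._events)


log = EventLog()
log.append({"type": "AccountCreated", "account_id": "A1"})
log.append({"type": "AccountUpgraded", "account_id": "A1", "tier": "gold"})

# A consumer starts at offset 0 and catches up with the producer.
position = 0
batch = log.read(position)
position += len(batch)
```

There is deliberately no update or delete method: the only way to change what the log says is to append another event.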
Alexey: Another advantage to this is that by producing events and appending them to the log, a distributed system can establish a strong order of events, strong order of facts that it tells about the state of the overall distributed system. It introduces a notion of a global clock into the state of the system, so by reading the same log, multiple participants of a distributed system can arrive at the same state very deterministically. It does not depend on the local time. It does not depend on any other conditions. If we think about this as a state machine, the logs present all the transitions of the state machine and if we apply the same set of rules, we arrive at the same results.
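The state machine point can be sketched in a few lines of Python. This is an illustrative example only: the event tuple layout and the `apply_event` and `replay` functions are assumptions, not something described in the episode. Two replicas that read the same log and apply the same rules end up with the same state, without consulting any local clock.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical event shape: (kind, account_id, amount)
Event = Tuple[str, str, int]


def apply_event(state: Dict[str, int], event: Event) -> Dict[str, int]:
    """One state machine transition: return the state after the event."""
    kind, account_id, amount = event
    new_state = dict(state)
    balance = new_state.get(account_id, 0)
    if kind == "deposit":
        new_state[account_id] = balance + amount
    elif kind == "withdraw":
        new_state[account_id] = balance - amount
    return new_state


def replay(events: Iterable[Event]) -> Dict[str, int]:
    """Apply every event in log order, starting from an empty state."""
    state: Dict[str, int] = {}
    for event in events:
        state = apply_event(state, event)
    return state


log = [("deposit", "A1", 100), ("withdraw", "A1", 30), ("deposit", "B2", 5)]

# Two independent participants read the same log.
replica_one = replay(log)
replica_two = replay(log)
```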
Robert: You're describing how the performance characteristics of these logs are superior in terms of the way storage works. What are the volumes of events we would be talking about in a large enterprise or a large SaaS provider?
Alexey: It really depends on the use case, but I've heard of cases of using event-driven systems and event-log-oriented systems at scales of petabytes a day. So we can go pretty high. That scale may not be achievable with a single, strongly ordered log, but there are a bunch of interesting optimization tricks that allow us to scale the systems up. We can talk about them as well.
Robert: Okay. We may get to that. I don't know if you could do an apples to apples comparison, but how would that compare to the volume of updates or inserts you could do on a large relational database running on a pretty big server?
Alexey: That's a very good question. I can probably relate this to how the Salesforce platform itself produces change events related to updates and creates applied to the Salesforce database. So we store a ton of data. We have many, many tenants running on the Salesforce platform and we also produce change events that are available for consumption by other systems. We see that we produce tens of thousands of events each second coming out of those systems. It means that the database gets updated very, very frequently. So relative to the footprint of the table itself, the volume of the logs could go many, many times higher and probably many orders of magnitude higher than the footprint of the table itself. The data keeps changing all the time.
Robert: We've got some good foundations now telling about events and event logs. Why would you want to make the event log the center of your architecture and what does that architecture look like that is organized around an event log?
Alexey: The architecture that is organized around the event log has full consensus across all participants of a distributed system. We can write the log once and then everyone can consume that log and establish facts about the state in a very consistent form.
Alexey: Another important property of the log is that there could be multiple subscribers to the log, multiple consumers of the log. And the system can grow in a very scalable way without producers of the log even being aware that there's a new consumer coming in and starting to read the log. There is no dependency between the systems consuming the log. All they need to do is remember their position in the log. So an enterprise system can grow rapidly without any kind of contention between different projects.
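A small Python sketch of that property, under the assumption that the log is just a list and each consumer keeps its own integer offset (the `Consumer` class and its names are hypothetical). A new consumer can join later without the producer or the other consumers changing anything.

```python
class Consumer:
    """A reader that remembers only its own position in the log."""

    def __init__(self, name):
        self.name = name
        self.offset = 0   # next log position this consumer will read
        self.seen = []    # events processed so far

    def poll(self, log, max_events=10):
        """Read the next batch from the log and advance the offset."""
        batch = log[self.offset:self.offset + max_events]
        self.seen.extend(batch)
        self.offset += len(batch)
        return batch


log = ["e0", "e1", "e2"]

audit = Consumer("audit")
audit.poll(log)                  # audit has now read e0..e2

log.append("e3")                 # the producer keeps appending

billing = Consumer("billing")    # a consumer added later starts at offset 0
billing.poll(log, max_events=2)  # billing reads at its own pace
audit.poll(log)                  # audit catches up with e3
```

The two consumers never coordinate with each other; the only shared thing is the log itself.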
Robert: You have multiple systems, they're producing events, writing them to the log and consuming them. To understand why putting the log in the middle of the system might be a good thing to do, how else would you solve that problem if you didn't have an event log? What are some alternative architectures?
Alexey: The alternative architecture would be a point to point architecture where each source of events talks to a consumer of those events or vice versa. And we have a bunch of point to point integrations that deal with data transfer on one off basis. So this kind of architecture is very complex. There are a lot of one off cases and each time we need to add a new direction of data replication. It's a separate project that involves both the source of the data and the consumer of the data.
Robert: Well, I had Ian Varley on the last show I did. We didn't talk about logs, but this is something Ian said, I'm going to read you a quote. He said, "The log is the real deal. Everything else is just useful and efficient views on the log." Can you explain what Ian meant? That may be a little unfair to ask you to explain what Ian meant, but I'm going to ask you to give it a try.
Alexey: I can take a stab, definitely. So one way to think about the state of the system is like a database, right? So we have a table, we have rows and each row represents a state that changes when the system processes business transactions. It gives us immediate access to the state as of now, but it doesn't give us any kind of history. If we instead store all the transitions of the state in the form of an event log, we can arrive at the state of the system at any point in time. In fact, we can replicate that state easily because there could be multiple readers of that log and that state could exist in multiple physical locations in a very consistent form.
Alexey: It's like when developers work with code. When you clone a repo, it's the final state of that project, but there are a lot of changes that went into that code and each commit is a change in the state. We can reconstruct any branch in the Git repository by following every change that was committed, and different branches are different paths in that journey. But the state is represented by changes, and if we keep all the changes, we can go back to any point in that state.
Robert: We have then this view of the world with different systems. You have many different kinds of business systems, auditing, accounting, producing events that are going into the log. Downstream, other systems can pull out the events they want. I am going to guess in a typical world these systems are heterogeneous, so is there some kind of common schema or representation that the organization needs to agree on for what the events look like?
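The "state at any point in time" idea can be sketched as replaying only a prefix of the log, much like checking out an older commit. The event types and the `state_at` function below are hypothetical and only illustrate the principle.

```python
from itertools import islice


def state_at(log, upto):
    """Rebuild account state from the first `upto` events of the log."""
    state = {}
    for event in islice(log, upto):
        if event["type"] == "AccountCreated":
            state[event["id"]] = {"tier": "basic"}
        elif event["type"] == "AccountUpgraded":
            state[event["id"]]["tier"] = event["tier"]
        elif event["type"] == "AccountClosed":
            state.pop(event["id"], None)
    return state


log = [
    {"type": "AccountCreated", "id": "A1"},
    {"type": "AccountUpgraded", "id": "A1", "tier": "gold"},
    {"type": "AccountClosed", "id": "A1"},
]

after_first = state_at(log, 1)    # the account exists with the default tier
after_second = state_at(log, 2)   # the account has been upgraded
latest = state_at(log, len(log))  # the account is gone from current state
```

A table that is updated in place would only ever show `latest`; the earlier states are recoverable here because every transition was kept.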
Alexey: Definitely. For a true decoupling of systems, the log itself as a data structure gives us a kind of runtime, temporal decoupling. The systems don't have to be up at the same time in order for the entire system to work. If someone needs to restart, or if a network or a computer is down and comes back online, that system just needs to restart from the position in the log where it left off. But in order for this to be truly decoupled in a logical sense, so that new systems can come online and reason about the facts stated in the log, there has to be a common notion of what those events mean. Like a top-level schema or a global information model that is superimposed on all the participants in the distributed system.
Robert: Now, are there schema languages, something like Protobuf or Swagger, or some way that people define, in an interoperable way, what the fields and structures of these events are?
Alexey: Yes, definitely. There are formats to describe the layout of events in terms of fields and types, and Protobuf or Swagger or Apache Avro are popular ways of describing those schemas. Those schemas can be used for serializing events. So from a structured representation you can go to a sequence of bytes and in reverse easily. It's a very useful property because the underlying log management system typically deals with payloads in the form of a byte array, right? So we need to write it to disk. Using Protobufs or Apache Avro is a very convenient way of converting an object in memory to the on-disk serialized representation.
Alexey: Frequently those serialization schemas do not give enough information about the business meaning of events. So sometimes we need to resort to higher-level descriptions that also add business context to fields. For example, in Avro, a number is a number. But if you look at higher-level descriptions, such as the Salesforce object schema in the Salesforce platform, a number could be currency, it could be quantity, it could have constraints applied to it. And that gives some business meaning and the ability for systems to reason about data at a higher level.
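The round trip from a structured event to a byte array and back can be sketched with the Python standard library. Note the substitution: this uses JSON as a stand-in so the example runs without extra packages; it is not Protobuf or Avro, and the `AccountUpgraded` type and its fields are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AccountUpgraded:
    """A hypothetical event type; frozen, since events are immutable."""
    account_id: str
    new_tier: str
    occurred_at_ms: int


def serialize(event: AccountUpgraded) -> bytes:
    """Structured representation -> byte array that a log could store."""
    return json.dumps(asdict(event), sort_keys=True).encode("utf-8")


def deserialize(payload: bytes) -> AccountUpgraded:
    """Byte array -> structured representation."""
    return AccountUpgraded(**json.loads(payload.decode("utf-8")))


event = AccountUpgraded("A1", "gold", 1_700_000_000_000)
payload = serialize(event)
restored = deserialize(payload)
```

As in the Avro example above, the schema here says only that `occurred_at_ms` is an integer. Nothing in it says the value is a timestamp in milliseconds; that business meaning has to come from a higher-level description.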
Robert: Many of these systems may predate the thinking about events; they were writing to a database. You want to get all the changes that were made to that system out of the database and into the log without dropping anything. What are some of the challenges and solutions for doing that?
Alexey: There are a couple of ways to go about that. The system itself, as it applies the change to the underlying database, can also produce an event that denotes a change in the data and put it on the event log.
Alexey: Another approach is when the database itself is the source of events. So all the application cares about is writing to the database and then we have a system that sits within or near the database that reads changes off the database, interprets them and turns them into change events that go on the event log.
Robert: If I understand, there are two approaches: you can do it within the application, or you can do it from the internal structures of the database. Is that correct?
Alexey: That's correct, yes. In the application or within the database. Yeah.
Robert: What are some of the pros and cons of those two approaches?
Alexey: When you do it in the application, it's an extra load on the application tier. There's also a bit of a risk associated with this because the change is committed into the database independently of its write to the log. So some failure scenarios could interrupt the system in the middle and you may end up with the state committed to the database, but not propagated to the log, or vice versa depending on how you do it.
Alexey: When you do it through the database, you're reading changes that have occurred already. The database recorded them, and many databases use a log internally to ensure transactionality and recoverability. So there are ways to read that log. The problem there is that the log is expressed in the very physical sense of the database table structure. Sometimes that needs to be interpreted so that the log can be represented in a more abstract way, in terms of the information model that is applied across multiple systems, so that other systems can read the log and understand it.
Robert: I've looked at some other trends in architecture, Alexey, which turn this thing on its head and say, let's have the application write into some kind of an event stream, and then if we want it to be in SQL, which often you do, we'll capture that a little bit further downstream. How does that model compare to the direct database application? What do you think of pros and cons?
Alexey: The pro is that it's definitely more scalable because we don't have the database and we don't have a transaction to wait for. Individual applications participating in a distributed system tell the facts about what's happening with the business in the form of events, and those events are written to the log. And any participant of the system can read the log and reconstruct its relational view of the state if it's necessary for processing.
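The interpretation step can be sketched as a function that maps a physical row change to a business-level event. Everything here is an assumption made for illustration: the shape of the change record (`table`, `op`, `before`, `after`), the `account` table, and the event names. Real database change logs have their own formats.

```python
def to_business_event(change):
    """Translate a physical row change into a business event, or None.

    `change` is a hypothetical record with the table name, the operation,
    and the row contents before and after the change.
    """
    if change["table"] != "account":
        return None  # not part of the shared information model
    before, after = change.get("before"), change.get("after")
    if change["op"] == "insert":
        return {"type": "AccountCreated", "account_id": after["id"]}
    if change["op"] == "update" and before["tier"] != after["tier"]:
        return {
            "type": "AccountTierChanged",
            "account_id": after["id"],
            "from": before["tier"],
            "to": after["tier"],
        }
    return None  # a physical change with no business meaning


raw_changes = [
    {"table": "account", "op": "insert", "before": None,
     "after": {"id": "A1", "tier": "basic"}},
    {"table": "account", "op": "update",
     "before": {"id": "A1", "tier": "basic"},
     "after": {"id": "A1", "tier": "gold"}},
    {"table": "session_cache", "op": "insert", "before": None,
     "after": {"id": "S9"}},
]

business_events = []
for change in raw_changes:
    event = to_business_event(change)
    if event is not None:
        business_events.append(event)
```

Consumers of `business_events` never need to know which table or column held the tier.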
Alexey: The challenge is that now we have a slightly different level of consistency. If we have a database in the middle, that database is atomic, consistent, and durable. And we always, within the transaction, can establish all the invariants of the system and preserve them. The database itself guarantees integrity of the data.
Alexey: If we have a distributed log, then all the facts that are expressed on the log are laid out in time order and we can reconstruct a consistent state, and that's very deterministic, but the consistency is eventual. Different systems may read events from the log at a different pace. Different systems may need to make decisions about processing based on the state of another system, and that state may be slightly behind. So we need to admit a level of eventual consistency in the design, and if some temporary inconsistencies happen, we need to be able to counteract them by issuing corrective actions.
Robert: Meaning in the case of the database, if you write it, you can read it and it's there. If you put it on the log, you may have to wait a little while for it to show up in something that you can efficiently query?
Alexey: That's correct. There's also a stronger implication of that in the database. Let's use a specific example. We have a customer who calls and wants to place an order to purchase a pair of shoes. In a database-centric design, we look at the inventory table and the pair of shoes is there. We open the transaction, and at the conclusion of that transaction we know that the order object has been inserted as a row into the order table. The inventory of the shoes has been decremented and the overall transaction has been committed. Our system is in a consistent state.
Alexey: If we use event logs to communicate between the ordering system and the inventory system, the ordering system may have a vague notion of how many pairs of shoes are left there, but there are multiple orders placed concurrently and it may not know the exact amount of the inventory at this present time. So if we use a log to design such a system, the ordering system needs to place an order event on the log. The inventory system needs to deduct from the remaining inventory, and if the inventory goes negative, that order has to be declined, but that cannot happen in real time. It cannot happen as a single transaction. These state transitions are distributed across multiple parts of a system, and if a customer placed an order and we cannot fulfill the order, we need to go back and issue an apology.
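The shoe example can be sketched as an inventory consumer that processes order events in log order and emits follow-up events. The function name, event types, and fields are hypothetical. One small variation from the description above: this sketch checks the stock before deducting instead of letting it go negative and then declining, which has the same outcome.

```python
def run_inventory(order_log, stock):
    """Consume OrderPlaced events in log order and emit follow-up events."""
    followups = []
    remaining = dict(stock)
    for event in order_log:
        if event["type"] != "OrderPlaced":
            continue
        sku, qty = event["sku"], event["qty"]
        if remaining.get(sku, 0) >= qty:
            remaining[sku] -= qty
            followups.append(
                {"type": "OrderConfirmed", "order_id": event["order_id"]})
        else:
            # The corrective action: the order was already accepted by the
            # ordering system, so the customer gets an apology later.
            followups.append(
                {"type": "OrderDeclined", "order_id": event["order_id"],
                 "reason": "out_of_stock"})
    return followups, remaining


# Two customers order the last pair concurrently; the log decides the order.
order_log = [
    {"type": "OrderPlaced", "order_id": "O1", "sku": "shoes", "qty": 1},
    {"type": "OrderPlaced", "order_id": "O2", "sku": "shoes", "qty": 1},
]
followups, remaining = run_inventory(order_log, {"shoes": 1})
```

The ordering system learns the outcome only when it later reads the follow-up events, which is the eventual consistency being described.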
Robert: Last main point I want to cover is the idea of system of record. If we're putting the log at the center of the architecture, what kind of requirements does that create in terms of how durable it needs to be in the management?
Alexey: Yeah. For the business to remain functioning, we need absolute durability of the events. We need to make sure that events written to the log will not disappear if a single machine fails or even if a single data center gets isolated during a natural disaster or something. So the log system needs to support various kinds of disaster recovery scenarios and replicate data multiple times so that the business can continue.
Robert: If you treat these logs as equal importance to the database in a database centric architecture, do you need to do cross data center replication?
Alexey: Yeah, absolutely. If you want to continue your business even if one of the data centers is down, you need to have a replica of the log elsewhere, in a different data center and potentially in multiple places. Fortunately, when we have the log, replication like that is relatively easy because we don't have to do it in real time. The replication itself could be a consumer of the log that can recover and restart after a network hiccup.
Robert: How far back do you keep all the events?
Alexey: It depends on the use case. If we model our system where the entire state is represented primarily by the logs, we want to think about it as an infinitely deep log, but of course it's not practical, right? We cannot keep all the events from the beginning of time. So the typical way of solving this is by rolling up the state from the beginning of time into a stable snapshot.
Alexey: Usually it's very business driven, so there is no single recipe, but usually those roll-ups happen up to a point in time far enough back in the past that the business doesn't really care about individual events. So we can start with the initial state and then see individual events from that point on.
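A rough sketch of rolling old events up into a snapshot, with hypothetical names (`fold`, `compact`) and a deliberately simple event shape of account and balance delta. The check at the end of the usage is the important property: the snapshot plus the retained tail must give the same state as the full history.

```python
def fold(state, events):
    """Apply (account_id, delta) events on top of a starting state."""
    new_state = dict(state)
    for account_id, delta in events:
        new_state[account_id] = new_state.get(account_id, 0) + delta
    return new_state


def compact(snapshot, log, keep_last):
    """Fold all but the newest `keep_last` events into the snapshot.

    Returns the new snapshot and the retained tail of the log.
    """
    cut = max(len(log) - keep_last, 0)
    return fold(snapshot, log[:cut]), log[cut:]


log = [("A1", 100), ("A1", -30), ("B2", 5), ("A1", 10)]

# Roll everything except the most recent event into a snapshot.
snapshot, tail = compact({}, log, keep_last=1)

# Current state from the snapshot plus the retained events...
current = fold(snapshot, tail)
# ...matches the state rebuilt from the beginning of time.
full_replay = fold({}, log)
```

How many events to retain (`keep_last` here) is the business-driven decision mentioned above; a time-based cutoff would work the same way.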
Robert: It sounds like, Alexey, you're putting the entire enterprise data into this log. Many different systems, everybody's going to be writing to it. How do you make that scale?
Alexey: Yeah, that's an interesting question. A single time-ordered log is a very useful logical construct, but in practice, to make it scale, we definitely have to take some measures. One of the most popular measures is partitioning, where a single logical log is split into many independent substreams that do not interact with each other. They can be written independently, they can be stored independently, and different subscribers can process them in exactly the same way but out of a whole pool of workers. So this way we remove contention for the storage and we also parallelize processing at the same time.
Robert: Is this different than saying you have multiple logs?
Alexey: Multiple logs would mean a different meaning for each log. So for example, in Apache Kafka, there is a notion of a topic and there's a notion of a partition of that topic. The difference is that within the same topic, you have multiple partitions, but the meaning of events and the processing logic that a system would apply to those events is exactly the same. It's just that they go in parallel and they're stored independently. Different topics, at the same time, would mean different things, and processing of different topics will be fundamentally different. So I think it's useful to separate these two notions.
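The topic and partition distinction can be sketched with a toy `Topic` class. This is not the Apache Kafka client API and the hash used here is not meant to match any particular system's partitioner; the class, the CRC32-based `partition_for` function, and the keys are all assumptions made for illustration.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition; the same key always lands in the same one."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class Topic:
    """One logical log (one meaning of events) split into independent
    append-only partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, event):
        """Append to the partition chosen by key; return (partition, offset)."""
        p = partition_for(key, len(self.partitions))
        self.partitions[p].append((key, event))
        return p, len(self.partitions[p]) - 1


topic = Topic("account-events", num_partitions=4)
for i in range(3):
    topic.append("A1", f"a{i}")
topic.append("B2", "b0")

# Events for one key stay in one partition, in the order they were written.
a1_partition = partition_for("A1", 4)
a1_events = [e for k, e in topic.partitions[a1_partition] if k == "A1"]
```

Ordering is preserved within a partition but not across partitions, which is the trade-off made in exchange for independent storage and parallel processing. A second `Topic` instance would correspond to a log with a different meaning and different processing logic.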
Robert: Well, Alexey, thank you very much for appearing on Code[ish].
Alexey: Oh, thank you, Robert.
A podcast brought to you by the developer advocate team at Heroku, exploring code, technology, tools, tips, and the life of the developer.
Lead DevOps Engineer, Salesforce
Robert Blumen is a dev ops engineer at Salesforce and podcast host for Code[ish] and for Software Engineering Radio.