My previous post detailed what we mean by event streaming and differentiated this from other similar technologies. I also made the point that providing a business with the ability to reason on top of and act on all events happening across all lines of business in real time can solve a multitude of problems.
This concept opens up a broad range of potential use cases and benefits, some unique to certain industries and some broadly applicable.
In this post, I am going to focus on the use of event streaming specifically in a microservices context. This is a paradigm shift in how many businesses are dealing with data and inter-microservice communications.
What problems are we looking to solve in a microservices context?
One of the key aspects of microservices is the concept of autonomy. This is the idea that each microservice has its own code base and can be deployed mostly independently from the other microservices which make up an application.
Microservices can be run by different teams, and fewer dependencies between services give those teams increased autonomy. This ultimately leads to a number of benefits, such as faster development cycles and the ability to scale more easily.
In truth, however, there is never complete independence. Microservices need to communicate and share data with one another. This often has a cascading effect: the common request/response style of synchronous interaction between microservices grows in complexity and, in turn, makes services increasingly dependent on each other.
On the one hand, microservices need to hide their internal state and be as loosely coupled as possible, but on the other, they also need to have the freedom to slice and dice shared data.
This tension unfolds in a number of ways over time. Typically, a variety of APIs are gradually added to a service's interface to expose data. Eventually, it is common to see centralised data services formed, into which all the data is pushed. Dealing with this can rapidly become tricky as the complexity of the system increases: a client service may need to join data from several other services, and may end up writing to a shared database or building a dedicated process to perform the join. Testing also becomes difficult, because all the services need to be aligned from the start, which usually means repeatedly dropping and recreating tables with test data.
Another way this tension may evolve is that as services scale, the dependencies on data from other services inevitably slow down development cycles. This leads to teams moving whole data sets into their own service to give them more autonomy. This can lead to data divergence over time, especially when dealing with multiple mutable copies of data across the landscape.
What is the solution to these problems?
In truth, there is no perfect solution to this, however many businesses are turning to event streaming as a middle ground between shared databases, messaging and service interfaces.
The high-level concept in this adoption is the separation of reads and writes using immutable streams.
There are many patterns available in this adoption and I will cover only the high-level principles involved in this post. There is a great series of talks done by Confluent that addresses this in a lot more detail, which I will link to at the bottom of my post.
The core principle of event-driven microservices is that they interact through events rather than calling other services directly.
In a request/response model, an order service may request that a payment service process a payment, which in turn may call a stock service to update stock.
In an event-driven model, orders would be published by the order service to a topic in the event streaming platform. The payments service would consume events from this topic and respond by processing the payment, likely through an external provider. Payment events would then be published to a payments topic, which is consumed by the stock service to update stock.
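To make this flow concrete, here is a minimal sketch using an in-memory stand-in for the streaming platform. The topic and service names are illustrative, and a real system would use a platform such as Apache Kafka with durable, partitioned topics rather than this toy bus:

```python
from collections import defaultdict

class EventBus:
    """Minimal in-memory stand-in for an event streaming platform."""
    def __init__(self):
        self.topics = defaultdict(list)       # topic -> append-only event log
        self.subscribers = defaultdict(list)  # topic -> consumer callbacks

    def publish(self, topic, event):
        self.topics[topic].append(event)      # events are immutable facts
        for consumer in self.subscribers[topic]:
            consumer(event)

    def subscribe(self, topic, consumer):
        self.subscribers[topic].append(consumer)

bus = EventBus()

# Order service: publishes orders, knows nothing about downstream services.
def place_order(order_id, amount):
    bus.publish("orders", {"order_id": order_id, "amount": amount})

# Payments service: consumes order events, publishes payment events.
def handle_order(event):
    bus.publish("payments", {"order_id": event["order_id"], "status": "paid"})

# Stock service: consumes payment events and updates stock.
stock = {"widgets": 10}
def handle_payment(event):
    stock["widgets"] -= 1

bus.subscribe("orders", handle_order)
bus.subscribe("payments", handle_payment)

place_order("o-1", 49.99)
```

Note that none of the three services calls another directly: each only publishes to or consumes from a topic, which is what makes adding a fourth consumer trivial.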
This simple example illustrates how these services have effectively been decoupled, reducing their dependencies on one another. They interact asynchronously and do not know about each other.
This model also enables new services to be added far more easily as there is no need for other services to change. The new service simply taps into the existing stream of events.
Reconstituting the state of a service is also straightforward: the events can simply be replayed from the stream, which persists them. This applies both to new services and to existing ones that need to rebuild their state.
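A minimal sketch of this replay idea, with a hypothetical payment log and a made-up "total revenue" aggregate standing in for whatever state the service keeps:

```python
# Append-only log of payment events, as retained by the streaming platform.
payment_log = [
    {"order_id": "o-1", "amount": 20.0},
    {"order_id": "o-2", "amount": 35.0},
    {"order_id": "o-3", "amount": 15.0},
]

def reconstitute(log, from_offset=0):
    """Rebuild a service's local state by replaying events from an offset."""
    total_revenue = 0.0
    for event in log[from_offset:]:
        total_revenue += event["amount"]
    return total_revenue

# A new (or recovering) service replays the full log to rebuild its state;
# a service that already processed some events can resume from its offset.
state = reconstitute(payment_log)
```

The key property is that the log, not the service, is the source of truth, so local state is always disposable and recoverable.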
Furthermore, scaling concerns are transitioned from services to the brokers. Event streaming platforms are inherently scalable, fault-tolerant and multi-tenant in nature.
When considering the tensions between independent services and shared data, this model serves as an efficient and scalable middle-ground, where services can share data in an asynchronous manner which removes dependencies and improves overall autonomy.
Dimensional Data
There is however something which has not yet been addressed.
There are two main forms data can take in these architectures. 'Facts' comprise the bulk of the data and constitute the continuous streams of events which occur as a service performs its function; this is what I have described so far.
‘Dimensional data’ is typically data which needs to be looked up by a service such as customer information which may be relevant to a shipping service, for example.
When considering dimensional data, the natural approach would be a REST lookup on the customer service. This, however, takes us full circle back to services being dependent on one another.
This can however be done in an event-driven way where views or tables are materialised in a service off the back of an event stream. The series of Confluent talks referenced at the bottom of this post break down all the patterns which can be used to achieve this in detail, but essentially this allows services to materialise local tables, based on their own domain models, to do lookups without the need to call external resources.
These tables effectively serve as a cache which can be regenerated at any time from the event streaming platform.
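A sketch of this materialisation, assuming a hypothetical stream of customer change events consumed by a shipping service. Later events for the same key overwrite earlier ones, in the spirit of a compacted topic or a Kafka Streams KTable:

```python
# Stream of customer change events, e.g. produced off the back of the
# customer service (names and fields here are purely illustrative).
customer_events = [
    {"customer_id": "c-1", "name": "Alice", "address": "1 Main St"},
    {"customer_id": "c-2", "name": "Bob",   "address": "2 High St"},
    {"customer_id": "c-1", "name": "Alice", "address": "9 New Rd"},  # update
]

# The shipping service materialises a local table keyed on customer_id:
# the latest event per key wins, yielding a current-state view.
local_customer_table = {}
for event in customer_events:
    local_customer_table[event["customer_id"]] = event

# Lookups are now local; no synchronous call to the customer service.
address = local_customer_table["c-1"]["address"]
```

Because the table is derived entirely from the stream, it can be dropped and rebuilt at any time, which is what makes it a cache rather than a second source of truth.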
Legacy systems
Apart from building new applications using modern microservices architectures, a lot of businesses need to deal with the modernisation of existing legacy systems.
Event streaming can play a very important role in these modernisation efforts. With multiple connectors to legacy message queueing and mainframe systems, including Change Data Capture (CDC) connectors, data can be migrated off these systems into an event streaming platform.
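As an illustration, such connectors are typically configured declaratively. The fragment below is a sketch in the style of a Debezium MySQL CDC source connector; the hostname, credentials and table names are placeholders, and exact property names vary by connector and version:

```json
{
  "name": "legacy-orders-cdc",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "legacy-db.internal",
    "database.port": "3306",
    "database.user": "cdc_user",
    "database.password": "********",
    "topic.prefix": "legacy",
    "table.include.list": "sales.orders"
  }
}
```

Once deployed, row-level changes in the legacy `sales.orders` table flow into topics on the streaming platform, where new microservices can consume them without touching the legacy system itself.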
This can drastically accelerate microservices adoption as those streams can be used immediately in your microservices landscape.
Schemas
One question which may have cropped up in some people’s minds while reading this is: if the decoupling between services is achieved, how does a consuming service know that the producing service is not going to change something in the event structures that could potentially break their service?
This is typically managed by a schema registry: a centralised store of schemas that removes the need for schemas to be managed and shared between the services themselves, further reducing dependencies.
Discussing schema registries in depth is beyond the scope of this post, but in short, a registry enforces schema compatibility, version control and quality assurance between producing and consuming microservices.
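As a toy illustration of the kind of check a registry performs, the sketch below applies one simplified, Avro-style backward-compatibility rule: a field added in a new schema version must carry a default, so consumers on the new schema can still read events written with the old one. Real registries, such as Confluent Schema Registry, implement far richer rule sets; the schemas here are hypothetical:

```python
def is_backward_compatible(old_schema, new_schema):
    """Toy backward-compatibility check: a consumer on the new schema must
    still be able to read events written with the old schema. Here that
    means any field added in the new schema needs a default value."""
    for name, spec in new_schema.items():
        if name not in old_schema and "default" not in spec:
            return False  # a new required field would break old events
    return True

v1 = {"order_id": {"type": "string"}, "amount": {"type": "float"}}

# Safe evolution: the added field has a default.
v2 = {"order_id": {"type": "string"}, "amount": {"type": "float"},
      "currency": {"type": "string", "default": "EUR"}}

# Breaking evolution: the added field has no default.
v3 = {"order_id": {"type": "string"}, "amount": {"type": "float"},
      "currency": {"type": "string"}}
```

A registry runs checks like this at publish time, rejecting `v3` before any producer can emit events that would break downstream consumers.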
Benefits
To wrap up, I will summarise some of the benefits that event-driven microservices bring to businesses.
When discussing these, it is important to note that although there are clear benefits, detailed below, the architectural pattern is typically wrapped in higher-level initiatives such as legacy and application modernisation. Event streaming is effectively a way to accomplish the goals set out in microservices initiatives more effectively, and it therefore shares many of the same underlying business benefits.
The benefits below are technical in nature, but all relate to these higher-level initiatives and result in faster time to market and a reduction in costs and risks:
- Event streaming allows for better decoupling of services and removes bottlenecks and dependencies on other services, with schema management centralised in a registry
- Event streaming provides the ability to build stateless microservices, with the state being maintained on a centralised streaming platform
- Event Streaming improves testing and facilitates the seamless introduction of new services. This is achieved by using the capability to replay events from any offset and the decoupling of producing and consuming services
- Event Streaming enables the development of asynchronous microservices, meaning events are published once by a service and consumed asynchronously by many services. This in turn allows writes and reads to be independently scaled
- Event Streaming drastically improves scalability. It does this by making scale a concern of the broker and not the service
- The Event Streaming platform forms a central immutable narrative of events across all microservices
- Event Streaming caters for facts and dimensional data, with several potential mechanisms and approaches
- The use of an event streaming platform means considerations such as scalability, fault-tolerance, and load-balancing can be offloaded from the services
- Event streaming accelerates legacy modernisation by migrating or offloading data from legacy messaging queues and mainframes to the event streaming platform
Where to from here
Should this be a topic of interest, I strongly advise watching the series of talks by Confluent on the subject. Most of the premises of this post came from those sessions.
Please find the session links below:
Part 1: https://www.confluent.io/resources/online-talk/janmicroservices-and-apache-kafka/
Part 2: https://www.confluent.io/online-talk/building-event-drive-services-with-apache-kafka/
Part 3: https://www.confluent.io/online-talk/putting-the-micro-into-microservices/
Should you wish to explore this further, please feel free to get in touch with us by following this link
In future posts, I will explore other use cases, including focusing on specific industries.