
The Missing Post About Event Sourcing

You know the basics but can’t turn theory into practice? Let me explain why.

1. Why this post

There are a lot of resources about Event Sourcing out there, so why yet another blog post?

Simply because I wish I could have found such explicit and practical information when I started diving into ES. This post comes fresh from my experience of implementing a production system that uses ES in some of its parts.

The goal of this post — and my wish for you —  is that after reading it you will:

  • know whether you should use ES to model a part of your business;
  • remove the confusion around modeling, storing and replaying events;
  • uncover some fundamental yet not-so-well-advertised ES concepts and practices;
  • hopefully get some a-ha! moments.

Now let’s get started.

Let’s skip the basics

A well-known joke about Event Sourcing says it is like teenage sex: everyone talks about it, but very few have actually done it.

Therefore I’ll assume you already know the basics. If that’s not the case, be my guest.

2. Where should I use Event Sourcing?

Let me give you some cases from different domains:

  • a Product is added to the Catalog of a Marketplace by a Seller, and it has to go through a Quality Control Process;
  • a Customer registers with an online service and is offered a Discount Period as part of her Onboarding;
  • a new Request is made by a Customer to an online cleaning service and one or more Cleaning Persons must accept the Request within 24 hours; if no one accepts, the Customer is given some alternatives and she has a further 24 hours to accept one of those or decline.

What do these cases have in common? The answer is: time.

Event Sourcing can be successfully applied in any time-intensive part of your business or, in other words, where there are “correlated things” happening one before or after the other.

But time isn’t the only go-for-it criterion for Event Sourcing. Or rather, I find the time heuristic a bit vague. The one that usually works for me, the one that answers the question “should we use ES in this part of the domain?”, is: look for a Process.

Is there any business process going on here? Is there any task that involves one or more business actors around a particular topic? Are there different activities that have to happen in a specific sequence in order to accomplish a business goal? Do we need to consider any calendar event (e.g. end of the month) as a part of the process?

If the answer to the aforementioned questions is yes, then congratulations. You found a place in your domain where you can happily apply Event Sourcing.

Where should I NOT use Event Sourcing?

  • Entities with a CRUDish lifetime; so no, coming up with ThingWasCreated, ThingWasUpdated and ThingWasDeleted events is not an option;
  • Supporting Subdomains such as Authentication and User management, unless maybe you are in the Auth business (for some reason a lot of developers try this one as their first attempt, so I’m explicitly marking this as a no-go); 
  • In general, when there are no time relations or business processes in sight.

3. Ok, I have a Process. What’s next?

Alright. Let’s say you found a business process to be implemented with Event Sourcing. What’s the next step?

The next step is to model the business process in your solution space.

When you generally read about Event Sourcing, you’re told to discover the Events relevant to the process. This is true and a mandatory first step, but alas not sufficient at all. Putting an event sourced business process in place requires more than that.

Here is a list of things you have to think about when implementing an event sourced business process, with the promise to explain everything in detail in the following paragraphs.

In order to get started, you have to define:

  1. which event starts the process;
  2. what’s the “subject” of the process;
  3. how participating actors manifest their intentions;
  4. how the passage of time influences the process;
  5. how the process reacts to intentions of actors and time passing.

4. Starting the Process

Every process starts with an event and this event is strictly correlated with a subject.

This subject is typically an Entity in your domain, and its ID will typically become what’s called the Correlation ID of the process.

What does it mean? It simply means that the topic — or subject — of the process will be the Entity whose related event starts the process.

Let’s go through some examples using the previously mentioned cases (a small sketch in code follows the list):

  • a Quality Control process is started whenever a ProductWasAdded to the Seller Catalog. The Product ID becomes the Correlation ID of the Quality Control process;
  • a Customer Onboarding process is started whenever an EmailAddressWasVerified. The Customer ID becomes the Correlation ID of the Onboarding process;
  • a Cleaning Appointment process is started whenever a RequestWasSubmitted. The Request ID becomes the Correlation ID of the Cleaning Appointment process.
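To make the Correlation ID idea concrete, here is a minimal sketch in TypeScript of the first case. The type and field names (ProductWasAdded, productId, QualityControlProcess) are mine, chosen for illustration, not prescribed by any framework.

```typescript
// The Event that starts the Quality Control process carries the Product ID,
// which doubles as the Correlation ID of the process. Names are illustrative.
interface ProductWasAdded {
  type: "ProductWasAdded";
  productId: string; // becomes the Correlation ID of the Quality Control process
  sellerId: string;
  occurredAt: Date;
}

// The Process instance is keyed by that same ID, so every later message about
// this Product can be routed to the right process instance.
interface QualityControlProcess {
  correlationId: string; // == productId from the starting Event
  state: "started" | "approved" | "rejected";
}

function startQualityControl(event: ProductWasAdded): QualityControlProcess {
  return { correlationId: event.productId, state: "started" };
}
```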

Intentions

Typically, when reading about Event Sourcing and CQRS you are presented with two fundamental building blocks: Commands and Events.

The former express an intent —  or a request —  of an Actor and may be accepted as valid or not by the Process depending on the built-in business constraints.

The latter represent something that objectively happened and from that moment on cannot be denied. It’s done. It’s in the past. Deal with it.

While this categorization is semantically complete and suitable for visualizing the problem space, it becomes a challenge when you try to turn it into a solution in your codebase (see section 5).

This is the reason why we need to introduce a third building block: Intentions.

Intentions represent the manifested will of an Actor. Some examples:

  • CustomerWantsToPurchaseTicket
  • WeWantToInvoicePartner
  • QaOperatorWantsToRejectProduct

Semantically, they are exactly in the middle between a Command and an Event. They express intention as Commands do, but they also manifest something that happened (“a Customer just told us she wants to purchase a ticket”), something that we can’t deny and we simply have to acknowledge.

This new concept evolves the {Command, Event} couple into a trio.

Using a very linear but realistic example, we could picture the following (sketched in code right after the list):

  • a CustomerWantsToPurchaseTicket (the Intention);
  • one or more business constraints are enforced by the Process looking into its State;
  • if the Process is ok with the Intention, it issues PurchaseTicket (the Command);
  • the Command is handled by whoever is responsible for producing the side effects related to the ticket purchase (as before, more on this later);
  • if the desired side effects were successfully produced, TicketWasPurchased (the Event) is published;
  • the Process handles the Event by updating its State and optionally issuing further Commands.
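As a rough sketch of that trio, assuming hypothetical message shapes and a deliberately simplistic State, the Process could look something like this in TypeScript:

```typescript
// Hypothetical message shapes for the three building blocks.
type Intention   = { kind: "CustomerWantsToPurchaseTicket"; ticketId: string };
type Command     = { kind: "PurchaseTicket"; ticketId: string };
type DomainEvent = { kind: "TicketWasPurchased"; ticketId: string };

// A deliberately simplistic State for the Process.
interface ProcessState {
  ticketsAvailable: number;
  purchasedTicketIds: string[];
}

// The Process checks its State against the Intention and, if the business
// constraints hold, issues the Command. It produces no side effects itself.
function handleIntention(state: ProcessState, intention: Intention): Command[] {
  if (intention.kind === "CustomerWantsToPurchaseTicket" && state.ticketsAvailable > 0) {
    return [{ kind: "PurchaseTicket", ticketId: intention.ticketId }];
  }
  return []; // the Intention is acknowledged, but no Command is issued
}

// When the resulting Event comes back, the Process updates its State.
function handleEvent(state: ProcessState, event: DomainEvent): ProcessState {
  if (event.kind === "TicketWasPurchased") {
    return {
      ticketsAvailable: state.ticketsAvailable - 1,
      purchasedTicketIds: [...state.purchasedTicketIds, event.ticketId],
    };
  }
  return state;
}
```

Note that handleIntention only decides; the side effects live elsewhere, which is exactly what keeps the Process free to say no.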

5. Process, Handlers, Commands, Intentions, Events. Why do we need all this?

There are a couple of natural constraints when you admit the existence of Intentions in your Event Sourcing implementation.

The first one that comes to my mind is that Commands cannot be issued directly from outside of the Process. Using the previous example, no one but the Process can say PurchaseTicket. External Actors are only allowed to publish Intentions such as CustomerWantsToPurchaseTicket.

Why is that? Because the Process has to check the business constraints before issuing the Command which, in turn, will be used to produce the desired side effects.

Why not allow external Actors to publish Commands and check the constraints against them instead? — you may ask.

That’s because you want to use your Process as an orchestrator and avoid temporal coupling. Fulfilling its role as an orchestrator, the Process handles only Intentions and Events, and only publishes Commands.

Temporal coupling is a fancy term for saying that whenever an Event is published, another one must follow as a direct, explicit consequence. Letting temporal coupling into your Process model leads to very hard constraints in terms of flexibility. You generally don’t want this.

The opposite job is done by the Handlers. Handlers are arbitrary entities in your process model that handle Commands and Commands only, optionally produce side effects, and only publish Events.

Using the previous ticket purchase example, we could define a Handler called TicketCounter which handles any PurchaseTicket Command by making a request to a payment gateway, waiting for the successful response and eventually publishing a TicketWasPurchased Event.
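A minimal sketch of such a TicketCounter, with the payment gateway and the publishing function as hypothetical collaborators injected from the outside:

```typescript
// The Command handled and the Event published by this Handler.
type PurchaseTicket     = { kind: "PurchaseTicket"; ticketId: string };
type TicketWasPurchased = { kind: "TicketWasPurchased"; ticketId: string };

// The payment gateway is a hypothetical collaborator producing the side effect.
interface PaymentGateway {
  charge(ticketId: string): Promise<{ ok: boolean }>;
}

class TicketCounter {
  constructor(
    private gateway: PaymentGateway,
    private publish: (event: TicketWasPurchased) => Promise<void>,
  ) {}

  // Handles Commands and Commands only: produce the side effect,
  // then publish the resulting Event on success.
  async handle(command: PurchaseTicket): Promise<void> {
    const result = await this.gateway.charge(command.ticketId);
    if (result.ok) {
      await this.publish({ kind: "TicketWasPurchased", ticketId: command.ticketId });
    }
  }
}
```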

Let’s put it all together visually and see how it works.

6. Tic Toc

I previously mentioned that modelling Processes is about dealing with time passing by.

It’s not unusual to hear business experts say things like

  • “At the end of the month <this> should happen”;
  • “We send out those documents after closing our statement every Thursday at midnight”;
  • “The fast refund policy is voided four hours after the ticket is purchased”.

In the context of a Process these sentences can be translated to Events like TimeForDoingThisHasCome or FourHoursPassedSinceTicketWasPurchased.

Although these Events look very different, they share an important commonality: they have to happen at a specific point in time that is known upfront.

So how do we achieve this in an Event Sourcing architecture? How do we make sure that we’ll see those Events published around the time we expect them to be?

The solution is called Delayed Publishing. You need to be able to publish a message — more precisely an Event — at some point in the future. In order to do this, you wrap the future Event in a special type of Command, which I personally call PublishAt. PublishAt contains the date/time in the future at which the wrapped message should be published. Without going too much into implementation details, a Handler listens to every PublishAt Command, unwraps the inner Event and holds it until the specified publish date/time comes. At that point, the Event is published and handled by the Process as usual (which, let’s remember, is the one responsible for handling Intentions and Events).

Here’s how it works in detail, using the previous ticket purchase example (a sketch of the moving parts follows the steps).

  1. Process handles TicketWasPurchased and publishes a PublishAt Command which wraps a FourHoursPassedSinceTicketWasPurchased Event;
  2. a DelayedMessages Handler handles the PublishAt Command, unwraps the inner Event and stores it away until the publish date is due;
  3. four hours later, the FourHoursPassedSinceTicketWasPurchased Event is published;
  4. the Event is handled by the Process, which issues a VoidFastRefundPolicy Command.
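Here is one possible shape of the PublishAt Command and the DelayedMessages Handler, sketched in TypeScript with hypothetical names; a real implementation would persist the pending messages and use a proper scheduler instead of an in-memory list.

```typescript
// Any Event we might want to publish later.
type DomainEvent = { kind: string; payload?: unknown };

// The PublishAt Command wraps a future Event and its publish date/time.
interface PublishAt {
  kind: "PublishAt";
  publishAt: Date;      // when the wrapped Event should be published
  wrapped: DomainEvent; // e.g. FourHoursPassedSinceTicketWasPurchased
}

class DelayedMessages {
  private pending: PublishAt[] = [];

  constructor(private publish: (event: DomainEvent) => void) {}

  // Handles every PublishAt Command and stores the wrapped Event away.
  handle(command: PublishAt): void {
    this.pending.push(command);
  }

  // Called periodically by some scheduler: publishes every Event whose
  // publish date is due and keeps the rest for later.
  tick(now: Date): void {
    const due = this.pending.filter((c) => c.publishAt.getTime() <= now.getTime());
    this.pending = this.pending.filter((c) => c.publishAt.getTime() > now.getTime());
    due.forEach((c) => this.publish(c.wrapped));
  }
}
```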

Let’s again put it all together visually.

Delayed message flow

One last point about Delayed Publishing: why not wrap a FastRefundPolicyWasVoided Event directly in the PublishAt Command?

Again, it’s about business constraints. If you publish the Event directly, you don’t give your Process the chance to check its State once again before giving the go. Maybe the situation changed in those four hours. Maybe there are other factors in place that the Process needs to take into account before deciding that — yes — the fast refund policy for this ticket has to be voided.

The best thing, therefore, is to simply remind the Process that Four Hours Passed. That’s undeniable to anyone. Then let the Process figure out what comes next.

7. Save everything. Replay Events.

Event Sourcing means rebuilding the point-in-time state of your model by applying all past Events. This means that, in terms of stored data, Events are all that you need for this purpose. You don’t have to store and replay Intentions. You don’t have to do that with Commands either. This is because Events are the only type of message that can induce the Process to mutate its State, hence they’re the only ones worth storing and replaying.

But there is an advantage to storing Intentions and Commands along with Events too: you can build a causality graph of your Process.

A causality graph is a picture of your Process. It visually tells you which messages belong together through a causality chain. Something that looks like the following.

Event Sourced Process Causality Graph

Storing everything is not sufficient to build a causality graph, though. You have to make sure that a message can be linked to another through a causal relation. You have to store this relation as well, along with other message metadata.

If every message we store has a unique MessageId, we can add another property to every message, a CausationId, which contains the id of the message that caused it to be published.

Taking the first few messages from the Causality Graph picture shown above, we have the following causality chain (a sketch of the metadata follows the list).

  1. ProcessWasStarted(MessageId = 1, CausationId = none)
  2. SomeCommand(MessageId = 2, CausationId = 1)
  3. SomeEvent(MessageId = 3, CausationId = 2)
  4. CustomerWantsToDoSomething(MessageId = 4, CausationId = none)
  5. DoSomething(MessageId = 5, CausationId = 4)
  6. and so on…
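A sketch of what such message metadata could look like, and how a causality chain can be walked backwards through the CausationId links (names are illustrative):

```typescript
// The metadata stored with every message; names are illustrative.
interface StoredMessage {
  messageId: string;
  causationId: string | null; // id of the message that caused this one
  correlationId: string;      // ties the message to its Process instance
  type: "Intention" | "Command" | "Event";
  name: string;               // e.g. "CustomerWantsToDoSomething"
  payload: unknown;
}

// Rebuilding a causality chain means following the CausationId links
// backwards from any message to the one that has no cause.
function causalityChain(messages: StoredMessage[], leafId: string): StoredMessage[] {
  const byId = new Map<string, StoredMessage>();
  for (const m of messages) byId.set(m.messageId, m);

  const chain: StoredMessage[] = [];
  let current = byId.get(leafId);
  while (current) {
    chain.unshift(current);
    current = current.causationId ? byId.get(current.causationId) : undefined;
  }
  return chain;
}
```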

8. Information Immutability ≠ Data Immutability

This last tip is not really crucial to implementing an Event Sourced Process. The reason why I’m including it before closing the post is that I found it useful, as it crushed a very strong false assumption I had about Event Sourcing.

For some reason I was convinced that, once an Event is stored, you can’t touch it anymore. It’s immutable. Frozen forever in your storage of choice. And this assumption was even reinforced and shared by other devs in the community, so why would I challenge it?

Well, it turns out this is simply not true.*

You can change the underlying data of an Event Sourced Process any time you want in order to accommodate changes in your model and data fixes.

The fact that Events happen and you can’t change history doesn’t mean you can’t represent that history with a different model. This is especially true once you get a deeper understanding about your Process and you find yourself in the situation where you need to refactor something, or maybe add one more field to an Event.

Let’s not even start talking about data fixes. You’ll need plenty of them, especially when modelling new Processes or when you have a team ramping up with their knowledge and skills around Event Sourcing.

So do yourself a favour and choose a storage technology which makes those kinds of changes extremely easy. It’s going to get very painful if you don’t.

This is the reason why technologies like Kafka are not a good choice as an Event Sourcing storage engine. You can’t query only some Events, as you always have to go through the whole topic and filter. You can’t modify a message once published, which means duplicating the entire topic and modifying it before publishing it again if you have to fix something.

Personally I find Postgres + JSON perfect for the use case and the programming languages I use. Feel free to let me know what worked best for you.
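As an illustration only, here is roughly what that could look like with node-postgres and a hand-rolled events table; the table layout and column names are my own assumptions for the sketch, not a standard.

```typescript
// Append and replay on top of a plain "events" table with a jsonb payload.
// The schema and column names here are assumptions, not a standard.
import { Pool } from "pg";

const pool = new Pool(); // connection settings come from the usual PG* env vars

async function appendMessage(m: {
  messageId: string;
  correlationId: string;
  causationId: string | null;
  name: string;
  payload: object;
}): Promise<void> {
  await pool.query(
    `INSERT INTO events (message_id, correlation_id, causation_id, name, payload, published_at)
     VALUES ($1, $2, $3, $4, $5, now())`,
    [m.messageId, m.correlationId, m.causationId, m.name, JSON.stringify(m.payload)],
  );
}

// Replaying a Process is a plain query over its Correlation ID.
async function loadMessages(correlationId: string) {
  const result = await pool.query(
    `SELECT name, payload FROM events
     WHERE correlation_id = $1
     ORDER BY published_at ASC`,
    [correlationId],
  );
  return result.rows;
}
```

Because Events are just rows with JSON payloads, adding a field or fixing data stays an ordinary migration or UPDATE statement.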

  * Unless you work in a regulated business where data integrity requirements are enforced by law. I’m thinking of betting, banking and the like. Some of them are even required to use WORM disks.

Conclusion

With this post I tried to give some general principles, eye openers and best practices for modelling your Event Sourced Processes and for deciding when it is appropriate to model your business using Event Sourcing.

There’s a lot more to say about this topic. Since this post is already long enough, the plan is to follow up with another one about a specific implementation. Event Sourcing can be overwhelming in the beginning, so I have the feeling it could be helpful for developers who need further guidance on the nitty-gritty of how to start.

Please let me know whether you found this post interesting and what you’d like me to specifically cover in the next one.

Have fun 🙂
