The "Big Misinterpretation" of DDD
The theory is that the closer our implementations are to the domain, the better our software will be. With this goal in mind, DDD has been extremely successful throughout our industry.
But, as architects and designers, we have an embarrassing love affair with making early decisions on the structure of our software.
This emphasis on structure was NOT the intention of DDD.
Domain Objects became a massive problem as they entangled multiple concerns across a software design in their goal of being ubiquitous. They became Entities, Message Payloads, Form Backing Objects … they were promiscuous across the system and a source of entanglement and system fragility.
The situation got even worse as people packaged Domain Objects into libraries for reuse, effectively increasing their footprint and amplifying their ossifying effect: the Domain Objects became the brittle part of your system that could not evolve because they were used everywhere.
Shared domain object libraries could not be evolved without everything evolving in lock-step.
You could detect the problem by encountering the unintended, and unwanted, 40-file commit Ripple Effect.
Then there was an epiphany!
It is not the things that matter in early stages of design…
…it is the things that happen.
We have a name for the things that happen in software design: Events.
This newer approach was called Events-First.
Events turn out to better capture the ubiquitous language of a domain or system. More often than not, when collaborating with non-technical stakeholders, the easiest way to describe the system is in terms of the things that happen, not the things that do the work.
The things that happen are the Events.
It turns out that this approach works well whether you’re evolving an existing system or working on a new one.
The technique of Event Storming is the first design step on this journey.
The intention is to try to capture a system in terms of the things that happen, the Events.
Using post-its, you then arrange these events in a rough order of how they might happen, without at first considering in any way how they happen or what technologies or supporting structures might be involved in their creation and propagation.
One technique is to think of yourself as a Detective arriving at a crime scene and, with your team, simply ask of the system you are working on, “What are the Facts?”
Limit yourself to events that describe what you can know, in the system you are working on and can influence.
What Makes a Good Event?
Having a LOT of events is rarely a problem.
Make your events completely self-contained and self-describing.
Make your events technology and implementation agnostic.
Once you have the happy set of events, explore the events that can happen when things go wrong in your context.
This approach helps you ask the question “What events do we need to know about?”, which is a powerful technique for exploring boundary conditions and assumptions that might affect realistic estimates of how complex the software will be to build.
Events are immutable, after all they are Facts.
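As a minimal Python sketch (the event name and fields here are illustrative, not from the original), a self-contained, self-describing, immutable event might look like this:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import uuid4

# frozen=True makes instances immutable: an event is a Fact and cannot change.
@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    customer_id: str
    total_amount_cents: int
    # Self-describing metadata: the event carries its own identity and time,
    # so a reader needs no other context to interpret it.
    event_id: str = field(default_factory=lambda: str(uuid4()))
    occurred_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
```

Note that nothing here names a database, a message broker, or any other technology: the event is pure domain vocabulary.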
Lay Out your Events to Explore Causality
The more events that are not causally linked, the more options you have about how those events are generated and handled in the emerging structure that you’ll eventually design.
Bounding with a “Bounded Context”
The “Life Preserver” diagram is simply a useful tool made up of a couple of concentric circles that are a visual representation of your Bounded Context and Anti-Corruption Layer.
The Bounded Context is related to teams and team structure.
Finally, To Structure…
Second, consider using Repository services to capture state where the model for modifying the state as well as reading it is the same.
Be wary of leaky abstractions around data persistence technologies and make your repositories agnostic of whatever technology you choose.
Finally look to implement microservices using CQRS if you need to vary the performance characteristics, and models, for modification and enquiry of state…
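A repository can be kept agnostic of its persistence technology by speaking only in domain terms at its interface. This is a sketch under assumed names (`Order`, `OrderRepository` are hypothetical, not from the original):

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional

class Order:
    def __init__(self, order_id: str, status: str = "NEW"):
        self.order_id = order_id
        self.status = status

# The interface exposes only domain concepts: no SQL, no document-store
# types leak through, so the underlying technology can be swapped freely.
class OrderRepository(ABC):
    @abstractmethod
    def save(self, order: Order) -> None: ...

    @abstractmethod
    def find_by_id(self, order_id: str) -> Optional[Order]: ...

# One interchangeable implementation; a SQL- or document-backed repository
# would implement the same interface without changing any calling code.
class InMemoryOrderRepository(OrderRepository):
    def __init__(self) -> None:
        self._store: Dict[str, Order] = {}

    def save(self, order: Order) -> None:
        self._store[order.order_id] = order

    def find_by_id(self, order_id: str) -> Optional[Order]:
        return self._store.get(order_id)
```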
Beating Complexity in State
The second biggest cause of accidental complexity in software is using the wrong data model for the job.
Domain Driven Design (DDD) gives us the simple Repository pattern for the storage and retrieval of data but unfortunately this pattern tends to entangle the concerns of Write and Read.
This is not a problem if the models for Write and Read are the same, in fact it has some real advantages as you can ensure complete consistency over a data set in a Repository.
But when the needs of write and read are different, and they often are, we tend to need something more.
Beyond the model used, it turns out that `writing` and `reading` have further different characteristics when it comes to:
- Fault Tolerance
- Performance and Scalability
When both the need to manipulate and read data are entangled we end up with `systemic complexity` that in turn results in a system component that resists comprehension and change.
We need to disentangle these concerns if we can in order to increase the simplicity of the system so that we can develop, and operate, the software more easily.
Separating Concerns: Command Query Responsibility Segregation
The key feature of this approach is to unpick the accidental entanglement between the write and read concerns into two system components: the Write Model and the Read Model.
Commands and The Write Model
Commands are received by the write model and processed to produce events that report what modification has occurred.
Commands represent transactionally consistent manipulations of the Write Model, and should succeed or fail entirely. The Write Model will maintain whatever state is necessary, in whatever optimised form, to produce the events that it is responsible for.
The job of an Aggregate is to take a command and execute it to produce one or more valid events.
The Events represent full, complete, self-describing and immutable Facts about the system. In turn a Command ideally contains all that is necessary to produce the expected, valid Events.
In this respect it’s important to remember that the Events are not merely side-effects of processing the commands, they are the actual data result of the commands. Commands and Aggregates only exist to produce an important set of Events, facts, about our domain.
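A minimal sketch of this relationship in Python, with hypothetical command and event names (an account withdrawal, not an example from the original):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WithdrawFunds:        # Command: ideally carries everything needed
    account_id: str
    amount_cents: int

@dataclass(frozen=True)
class FundsWithdrawn:       # Event: the immutable Fact the command produces
    account_id: str
    amount_cents: int

class AccountAggregate:
    def __init__(self, opening_balance_cents: int) -> None:
        # Just enough state, in whatever optimised form, to validate
        # commands and produce the events it is responsible for.
        self._balance = opening_balance_cents

    def execute(self, cmd: WithdrawFunds) -> list[FundsWithdrawn]:
        # The command succeeds or fails entirely; the events are not
        # side-effects but the actual data result of the command.
        if cmd.amount_cents <= 0 or cmd.amount_cents > self._balance:
            raise ValueError("invalid withdrawal")
        self._balance -= cmd.amount_cents
        return [FundsWithdrawn(cmd.account_id, cmd.amount_cents)]
```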
What do you do if the Command does not contain all the necessary details to produce these Events? Sometimes the processing of a command requires an aggregate to have some awareness of data, or more accurately Events, from other parts of the system. Sometimes this data can be sourced from local (to the Bounded Context) views.
Of course this can represent a race condition where an aggregate is trying to execute a command and the requisite information is not yet available in the supporting view or local cache.
This is a normal state in an eventually consistent system, where isolation and partitioning is valued at the sacrifice of system-level consistency (such as in adaptable, antifragile, microservice-based systems). This can also occur if an erroneous, or malicious, command is being processed by the Aggregate.
To deal with this, the sender of the event that triggers the command on the aggregate may have to handle the command being reported as failed. It is then the sender’s responsibility to decide, using its larger business-context knowledge, whether to back off and retry the command (one or more times with increasing delays is a common strategy) or to fail the larger business operation.
In some respects this strategy is similar to using a circuit-breaker where the circuit is the interaction between the initiator and the aggregate, and eventually the circuit is broken for the business operation trying to be conducted.
In the case where the sender is a Saga then additional processing may be necessary in order to perform roll back through compensating commands.
Queries and the Read Model
In CQRS this responsibility falls to one (or more) Read Models.
A Read Model is responsible for listening to a number of different events and maintaining an optimised version of the state that can, in a performant fashion, meet the needs of responding to one or more types of Query.
Events as the Communication Between Models
What events the Read Model is interested in will be dictated by the Query or Queries that the Read Model will have to provide answers to.
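A sketch of such a projection, with hypothetical event and query names (events here are plain dicts to keep the example self-contained):

```python
class OpenOrdersReadModel:
    """Maintains a query-optimised view by listening to the events it needs."""

    def __init__(self) -> None:
        # State shaped purely for the query, not for writes.
        self._open: dict[str, int] = {}   # order_id -> amount_cents

    # Which events this model handles is dictated by the query it answers.
    def apply(self, event: dict) -> None:
        if event["type"] == "OrderPlaced":
            self._open[event["order_id"]] = event["amount_cents"]
        elif event["type"] == "OrderShipped":
            self._open.pop(event["order_id"], None)
        # Events this view does not need are simply ignored.

    def open_orders(self) -> list[str]:
        return sorted(self._open)         # the Query this model exists to serve
```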
Events are the Key
Reducing the Fragility of State using Event Sourcing
An event store simply stores your events reliably, in a guaranteed order, so that they can then be queried and replayed.
Store your events in a robust and resilient place so that aggregates and views can be re-created from the events at any point in time.
This means aggregates and views can afford to be fragile: they can always be rebuilt from the stored events.
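A minimal in-memory sketch of such a store (illustrative only; a real event store would also be durable and concurrent-safe):

```python
class EventStore:
    """Stores events in a guaranteed order; supports query and replay."""

    def __init__(self) -> None:
        self._log: list[tuple[int, dict]] = []   # (sequence_number, event)

    def append(self, event: dict) -> int:
        seq = len(self._log) + 1                 # order is the append order
        self._log.append((seq, event))
        return seq

    def replay(self, apply, from_seq: int = 1) -> None:
        # Rebuild any aggregate or view by feeding it the events in order.
        for seq, event in self._log:
            if seq >= from_seq:
                apply(event)
```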
Snapshotting represents a compromise: replay is shortened and older event versions can be retired, but at the cost of a deliberate amount of coupling to data migration.
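The shortened replay can be sketched like this (a hypothetical helper; `apply` is whatever function folds one event into the state):

```python
def rebuild_with_snapshot(snapshot, events, apply):
    """Rebuild state, starting from a periodic snapshot when one exists.

    snapshot: (last_seq, state) or None for a full replay.
    events:   iterable of (seq, event) in guaranteed order.
    """
    last_seq, state = snapshot if snapshot else (0, {})
    for seq, event in events:
        if seq > last_seq:              # only replay events after the snapshot
            state = apply(state, event)
    return state
```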
Where to go next?
I shall be exploring some of these concepts as I finish the Antifragile Software book, and I shall be serialising some of the implementation examples from that book here on this blog over time.
Further reading on some of the topics discussed here is available from: