(Un)coupling in distributed systems - Part 2

The effects of temporal coupling

Uwe Friedrichsen

In the previous post, we started to discuss a specific type of coupling: the coupling between processes in a distributed system. We discussed the fallacy that loose technical coupling, i.e., using a message-based communication style, is sufficient to ensure loose coupling between processes. We learnt that we instead need to implement loose coupling at both a technical and a functional level to actually become loosely coupled.

In this second and final post of this little blog series, we will discuss the redundancy fallacy and the third type of coupling we need to consider in the context of remote communication: temporal coupling.

Let us start with the redundancy fallacy.

The redundancy fallacy

If we look around, we realize that the typical response to tight functional coupling is redundancy: Simply run multiple instances of the same process (typically called a “service” these days).

The reasoning behind this mitigation strategy is that if one instance should fail, the other instances still work. While this strategy helps to mitigate some failure patterns like, e.g., a crash failure or transient latency of one of the process replicas, it is not a drop-in, fix-all band-aid for tight coupling:

  • It does not help in other failure scenarios like, e.g., problems with the underlying database system, software bugs, data errors or network failures.
  • It introduces new failure modes that did not exist before – just think about data consistency challenges in all their flavors.
  • It adds complexity to our system landscape as we need to deal with things like load balancers, health checks, heart-beating, data replication and reconciliation, and more.
  • Last but not least, it is evidence of incapacity from an ecological point of view: We create an environmental footprint at least twice as big as needed just because we fail to ponder how to reduce functional coupling.

This does not mean that redundancy is pointless. Not at all! It has significant value and, sensibly used, it can considerably enhance the robustness of our system landscape. But it is neither a sufficient countermeasure against the effects of tight functional coupling nor a legitimate excuse to ignore it.

The mysterious messaging advocates

We have seen that loose coupling requires both loose coupling at a functional level and loose coupling at a technical level. This raises the question of why so many people in IT only talk about the technical coupling, i.e., the communication style employed.

Some of them probably just think in terms of technology and fail to ponder the functional level when they discuss IT issues. We all know such people. But this cannot be the whole story. There are a lot of people who have repeatedly proven not to be that narrow-minded, and they still recommend message-based communication to foster loose coupling. How does that fit in?

While I cannot say for sure what the underlying reasoning of those people is, this is why I think they talk about message-based communication so much: Changing the technical communication style from request-response-based communication to message-based communication also tends to affect the functional design. In particular, it favors designs that reduce temporal coupling. And reducing the temporal coupling usually also requires reducing the functional coupling.

In other words:

Going for asynchronous message-based communication will foster loose temporal coupling which in turn pushes in the direction of loose functional coupling.

That is exactly what we want: Loose functional coupling. We just introduced it through the back door.

Temporal coupling

But what exactly does loose temporal coupling mean?

Loose temporal coupling in an inter-process setting comes in two basic flavors:

  1. The processing of an external request and the access to other processes required to process the request are temporally decoupled.
  2. The external request, its processing and the response are temporally decoupled.

In the first variant, a process (typically a service) does not call other services while processing an external request. Instead, it makes sure that it has all information needed to process the external request already available before the request arrives. I.e., the gathering of information from other processes is temporally decoupled from the processing of external requests (typically using means like data replication and caches).

The external request still might require narrow timing boundaries which the processing process must satisfy (like in our prior example with a customer issuing a search request and expecting a swift response). But as the processing service does not need to reach out to other services while processing the external request, all imponderabilities of remote communication between the different processes involved are gone.

This type of temporal decoupling results in a loose coupling between the processes involved: To achieve the temporal decoupling, we need a different split of functionality and data between processes. This leads to a different design of the interfaces and protocols between the processes, resulting in a reduced functional coupling between the processes.
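To make the first variant a bit more tangible, here is a minimal sketch in Python. All names (`PriceReplica`, `SearchService`, `apply_update`) are hypothetical and only serve as illustration. The assumption is that some background feed (messages, batch files, whatever fits) keeps a local copy of the data up to date, so the request path only reads locally.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PriceReplica:
    """Hypothetical local copy of data that is owned by another service."""

    _prices: Dict[str, float] = field(default_factory=dict)

    def apply_update(self, product_id: str, price: float) -> None:
        # Called by a background feed, independently of any external request.
        self._prices[product_id] = price

    def lookup(self, product_id: str) -> Optional[float]:
        # Purely local read: no remote call while a request is processed.
        return self._prices.get(product_id)


class SearchService:
    """Answers external requests using only locally available data."""

    def __init__(self, replica: PriceReplica) -> None:
        self._replica = replica

    def handle_search(self, product_ids: List[str]) -> Dict[str, Optional[float]]:
        # Unknown products map to None instead of triggering a remote lookup.
        return {pid: self._replica.lookup(pid) for pid in product_ids}
```

The price we pay is that the replica may be slightly stale, which is exactly the kind of functional decision (what data may be how old?) that the different split of functionality and data forces us to make.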

In the second variant, we decouple the external request from its response. We accept external requests but do not provide an immediate response. Instead, we send back the response in a temporally decoupled way. We can provide an endpoint the caller can use to poll for the availability of the result. Or we can send the result to an endpoint that the caller provides along with the request. Or we can combine both approaches: We notify the caller when the response is available and the caller then polls our endpoint to retrieve it.

This way, we do not need to satisfy narrow timing boundaries an external caller might impose on us otherwise. This is usually only possible if we design a functionally loose coupling between the external caller and the process called, leading to a changed interface and protocol between external caller and process. So, again we achieved loose functional coupling through the back door. 1
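The second variant could be sketched as follows. This is an in-memory illustration under simplifying assumptions, not a production design: `JobStore`, `submit`, `process_pending` and `poll` are hypothetical names, and the optional `notify` callback stands in for the endpoint a caller would provide.

```python
import uuid
from typing import Any, Callable, Dict, Optional, Tuple

Notify = Optional[Callable[[str], None]]


class JobStore:
    """Hypothetical in-memory stand-in for a request queue plus result store."""

    def __init__(self) -> None:
        self._pending: Dict[str, Tuple[Any, Notify]] = {}
        self._results: Dict[str, Any] = {}

    def submit(self, payload: Any, notify: Notify = None) -> str:
        # Accept the request and return immediately with a handle,
        # without computing the response.
        job_id = str(uuid.uuid4())
        self._pending[job_id] = (payload, notify)
        return job_id

    def process_pending(self, work: Callable[[Any], Any]) -> None:
        # Runs at some later point in time, e.g., in a worker.
        for job_id in list(self._pending):
            payload, notify = self._pending.pop(job_id)
            self._results[job_id] = work(payload)
            if notify is not None:
                notify(job_id)  # tell the caller that the result is ready

    def poll(self, job_id: str) -> Dict[str, Any]:
        # Endpoint the caller uses to check for the result.
        if job_id in self._results:
            return {"status": "done", "result": self._results[job_id]}
        if job_id in self._pending:
            return {"status": "pending"}
        return {"status": "unknown"}
```

A caller would call `submit`, keep the returned id and later either call `poll` or wait for the notification and then poll. Note that the interface itself has changed: the caller must be able to live with a "pending" answer.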

Both variants are useful. However, it is important to note that they work at different places. From the perspective of a process, the first variant affects the coupling at the outgoing communication side (sometimes also called the “southbound” or “internal” interfaces) while the second variant affects the coupling at the incoming communication side (sometimes also called the “northbound” or “external” interfaces).

More design options

The temporal decoupling at the incoming communication side gives us additional degrees of freedom in how we design the interactions between the processes involved in processing a request (the outgoing communication side). We can organize this internal processing as it works best for us. Typically, we use this freedom to also aim for a more loosely coupled interaction between the services, i.e., we try to pass on the temporal decoupling from the system boundaries all the way down the processing chain.

However, with a temporally decoupled incoming communication side, we could also leave our internal communication design tightly coupled if we prefer: If any process required to complete the processing should be transiently unavailable, we could simply wait until it becomes available again. Due to the external temporal decoupling, we can afford to wait for it. Of course, this is not exactly what we have in mind when we talk about loose coupling. But it nicely demonstrates how loose temporal coupling gives us more design options.
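As an illustration of "simply wait until it becomes available again", here is a small sketch of a retry loop with exponential backoff. The function name and its parameters are hypothetical, and treating `ConnectionError` as the signal for "transiently unavailable" is an assumption made for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_when_available(
    call: Callable[[], T],
    max_attempts: int = 5,
    initial_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` until it succeeds or `max_attempts` is exhausted."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            sleep(delay)
            delay *= 2  # wait twice as long before the next attempt
    raise ValueError("max_attempts must be at least 1")
```

With a synchronous external caller waiting for us, such a loop would quickly exceed the caller's timeout. With a temporally decoupled incoming side, the waiting only delays the moment the result becomes available.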

Temporal decoupling also gives us many additional options regarding our technical communication style, options we often overlook. We are not limited to message-based communication (no matter if using commands, events or documents).

We can also fall back to much simpler communication styles like, e.g., traditional batch processing: We can use good old-fashioned offline communication means like files, transfer tables or the like to communicate between processes. This way, we can completely avoid all imponderabilities of remote communication, even the ones that come with message-based communication. Instead, we can rely on fault tolerance mechanisms that have been known and production-tested for decades.
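A file-based handover could look like the following sketch. The function names and the JSON-lines format are assumptions for illustration only. The sketch relies on the common write-then-rename idiom: the producer writes to a temporary file in the same directory and then renames it, so that (on POSIX file systems, where the rename is atomic) the consumer never picks up a half-written file.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import List


def export_batch(records: List[dict], target: Path) -> None:
    """Producer side: write all records, then publish the file in one step."""
    # The temp file lives in the target directory so the rename stays
    # on the same file system.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
    os.replace(tmp_name, target)  # the file becomes visible under its final name


def import_batch(source: Path) -> List[dict]:
    """Consumer side: read the complete batch whenever it suits us."""
    with source.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Producer and consumer never talk to each other directly. If the consumer is down, the file simply waits until the consumer is ready to process it.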

We can also use protocols like, e.g., Atom feeds, which logically implement batch-style updates triggered by the receiver while technically using request-response style remote communication. While the communication style feels more up to date, the trade-off is that we need to deal with the imponderabilities of remote communication again.
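The idea of receiver-triggered, batch-style updates can be sketched without any specific protocol. The following example is not an implementation of the Atom format. It is a hypothetical in-memory feed (`UpdateFeed`, `FeedReader`) where the receiver remembers a cursor and pulls everything published since its last visit.

```python
from typing import List, Tuple


class UpdateFeed:
    """Provider side: an append-only list of updates."""

    def __init__(self) -> None:
        self._entries: List[str] = []

    def publish(self, entry: str) -> None:
        self._entries.append(entry)

    def fetch_since(self, cursor: int, limit: int = 100) -> Tuple[List[str], int]:
        # Return the next batch and the cursor to use for the next fetch.
        batch = self._entries[cursor:cursor + limit]
        return batch, cursor + len(batch)


class FeedReader:
    """Receiver side: decides on its own when to pull the next batch."""

    def __init__(self, feed: UpdateFeed) -> None:
        self._feed = feed
        self._cursor = 0
        self.seen: List[str] = []

    def poll_once(self) -> int:
        batch, self._cursor = self._feed.fetch_since(self._cursor)
        self.seen.extend(batch)
        return len(batch)
```

In a real setup, `fetch_since` would be a remote call, which is where the imponderabilities of remote communication come back in. But as the receiver owns the cursor, a failed poll can simply be repeated later without losing updates.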

We can use a distributed log like, e.g., Kafka to share updates. While this feels not only up to date but really “hip”, it leaves us with the burden of running a complex piece of additional middleware and a harder recovery if it should fail.

And so on. The options are vast and each one comes with its specific trade-offs: From traditional file- or database-based batch processing to event sourcing using tools like Kafka. While it is important to carefully ponder the trade-offs of the options in production (development preferences and trade-offs are of minor relevance when it comes to the runtime properties – especially robustness – of a system), the key point is that the introduction of loose temporal coupling at the incoming communication side gives us a lot more design options.

But no matter which communication style we choose, in the end our functional design must support the temporal decoupling. We have a mutual dependency between functional and temporal coupling:

Temporal decoupling fosters functional decoupling because without functional decoupling we usually cannot achieve temporal decoupling.

And so we are back at the observation from the beginning of this section: Going for asynchronous message-based communication will foster loose temporal coupling which in turn pushes in the direction of loose functional coupling.

Summing up

We started with the observation that coupling is a big issue in software design. We typically need some degree of coupling to get a job done but tight coupling has a series of drawbacks. Therefore, we try to keep the coupling between system parts as low as possible. Especially, we try to avoid accidental coupling, i.e., coupling that is not needed to get the job done.

In these two posts, we discussed a specific type of coupling, the coupling between processes:

  • We discussed the fallacy that loose technical coupling, i.e., using a message-based communication style, is sufficient to ensure loose coupling between processes.
  • We have seen that loose technical coupling is pointless without loose functional coupling.
  • We learnt we need to implement loose coupling at a technical and a functional level to actually become loosely coupled.
  • We discussed that redundancy is not sufficient (and usually also not sensible) to compensate for a lack of loose functional coupling.
  • We have seen that loose temporal coupling is a good way to bring in loose functional coupling through the back door.
  • Finally, we learnt that temporal loose coupling gives us many more options regarding the design of our systems and thus is something we always should keep in mind when it comes to system design.

This was a lot of stuff. Nevertheless, there would be a lot more to say about coupling, even if we only focus on coupling between processes (typically “services” these days). E.g., we did not discuss how more subtle types of coupling like concept leakage affect the coupling between processes. And it would be particularly interesting to discuss how to design for loose functional coupling. But this post has become long enough already. So, maybe this is a topic for some future blog posts. We will see … ;)


  1. We leave out the relatively rare cases where a tight functional dependency between a caller and a callee exists but for some reason it does not matter when the callee answers a request. While these cases exist, much more often a tight functional coupling also leads to tight temporal boundaries, i.e., tight temporal coupling.