Distributed Systems
We discussed the business case for resilient software design in my previous post. Let us assume you have a budget and you know which business processes, capabilities, or interactions (whatever term suits your needs best) are the most critical ones you need to secure, i.e., make more resilient.
The business case for resilience is a bit tricky. Quite disparate forces are at work: some people greatly underrate the need for and value of resilience, while others find it hard to stop adding resilience measures. As so often, the sweet spot is somewhere in the middle.
I recently had a short discussion with a Product Owner after a talk I gave about resilient software design. His question was: “How can I motivate developers to care more about resilience?”
I briefly mentioned the 100% availability trap in a prior post. As this misconception is so widespread, I decided to discuss it in more detail in this post.
In the previous post, we discussed why the imponderabilities of distributed systems will hit us at the application level and why we can no longer leave their handling to the operations teams, as we did in the past.