Making your system fault tolerant and available
Availability and fault tolerance are software qualities that are at least somewhat important for every architecture. What's the point of creating a software system if the system can't be reached? In this section, we'll learn exactly what those terms mean and look at a few techniques for providing them in your solutions.
Calculating your system's availability
Availability is the percentage of the time that a system is up, functional, and reachable. Crashes, network failures, or extremely high load (for example, from a DDoS attack) that prevents the system from responding can all affect its availability. Usually, it's a good idea to strive for as high a level of availability as possible. You may stumble upon the term counting the nines, as availability is often specified as 99% (two nines), 99.9% (three), and so on. Each additional nine cuts the allowed downtime by a factor of ten and is much harder to obtain, so be careful when making promises. Take a look at the following...
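To make the arithmetic behind counting the nines concrete, here is a minimal Python sketch. The function names and the 365-day year are assumptions made for illustration, not part of any standard definition. It converts an availability percentage into a yearly downtime budget and computes measured availability from observed uptime and downtime.

```python
# Illustrative sketch: names and the 365-day year are assumptions.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes(availability_percent: float,
                     period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the downtime budget (in minutes) for a target availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return period_minutes * (100.0 - availability_percent) / 100.0


def measured_availability(uptime_minutes: float,
                          downtime: float) -> float:
    """Return availability as a percentage of the total observed time."""
    total = uptime_minutes + downtime
    if total <= 0:
        raise ValueError("observed period must be longer than zero")
    return 100.0 * uptime_minutes / total


# Print the yearly downtime budget for two to five nines.
for target in (99.0, 99.9, 99.99, 99.999):
    budget = downtime_minutes(target)
    print(f"{target}% available: about {budget:.2f} minutes of downtime per year")
```

Under these assumptions, two nines leave roughly 3.65 days of downtime per year, three nines about 8.8 hours, and four nines under an hour, which is why each extra nine is so much harder to promise.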