Reliability is a measure of the confidence in a system: the lower the probability of failure, the higher the reliability.
Reliability is measured using several metrics, including mean time between failures (MTBF) and mean time to repair (MTTR).
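As a rough illustration, MTBF and MTTR could be computed from an incident log as sketched below. The `Incident` record, the `mtbf_and_mttr` helper, and the sample numbers are all hypothetical, and the sketch assumes the common definitions (operational time divided by number of failures, and downtime divided by number of failures) over a single observation window.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """One outage, in hours since the start of the observation window (hypothetical record)."""
    started_at: float   # when the failure began
    resolved_at: float  # when service was restored


def mtbf_and_mttr(incidents, window_hours):
    """Return (MTBF, MTTR) in hours.

    MTBF = mean time between failures = operational time / number of failures
    MTTR = mean time to repair        = downtime / number of failures
    """
    if not incidents:
        raise ValueError("need at least one incident to compute MTBF and MTTR")
    downtime = sum(i.resolved_at - i.started_at for i in incidents)
    uptime = window_hours - downtime
    return uptime / len(incidents), downtime / len(incidents)


# Made-up data: three outages during a 30-day (720-hour) window.
incidents = [
    Incident(100.0, 102.0),
    Incident(400.0, 401.0),
    Incident(650.0, 653.0),
]
mtbf, mttr = mtbf_and_mttr(incidents, window_hours=720.0)
print(f"MTBF: {mtbf:.1f} h, MTTR: {mttr:.1f} h")  # MTBF: 238.0 h, MTTR: 2.0 h
```

A higher MTBF and a lower MTTR both indicate a more reliable system.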
The easiest way to increase reliability is to increase the system's test coverage, assuming, of course, that the tests are meaningful.
Tests increase reliability by:
- Increasing MTBF: The more thorough your tests, the more likely you are to catch bugs before the system is deployed. Fewer bugs reaching production means fewer failures, and therefore a longer time between them.
- Reducing MTTR: Historical test results identify the last version that passed all tests. If the application is experiencing a high rate of failures, the team can quickly roll back to that last-known-good version.
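The rollback idea in the last bullet can be sketched as a lookup over stored test results. This is only a minimal sketch: `SuiteResult`, `last_known_good`, and the version numbers are hypothetical, and it assumes the history is kept in oldest-to-newest order.

```python
from dataclasses import dataclass


@dataclass
class SuiteResult:
    """Outcome of one full test run for a released version (hypothetical record)."""
    version: str
    passed: int
    failed: int


def last_known_good(history):
    """Return the newest version whose test run had no failures, or None.

    Assumes `history` is ordered from oldest to newest.
    """
    for result in reversed(history):
        # Require at least one passing test so an empty run does not count as "good".
        if result.failed == 0 and result.passed > 0:
            return result.version
    return None


# Made-up history: the newest release introduced failing tests.
history = [
    SuiteResult("1.4.0", passed=212, failed=0),
    SuiteResult("1.4.1", passed=214, failed=0),
    SuiteResult("1.5.0", passed=209, failed=7),
]
print(last_known_good(history))  # 1.4.1
```

Here the team would roll back from 1.5.0 to 1.4.1 without first having to investigate which earlier release was healthy, which is where the time saving comes from.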