
You're reading from  Designing Production-Grade and Large-Scale IoT Solutions.

Product type: Book
Published in: May 2022
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781838829254
Edition: 1st Edition
Author (1)
Mohamed Abdelaziz

Mohamed Abdelaziz is a technology leader, IoT subject matter expert, cloud expert, and architect with over 17 years of experience in IT and telecom. He has designed and delivered many large-scale, production-grade, multi-million-dollar software and cloud-based solutions, covering both traditional IT and IoT, that are used by millions of users across the globe. He holds a degree in computer science and information systems and, besides his proven working experience, has multiple credentials in AWS (8 certificates) and Azure (5 certificates, including the Azure IoT developer certificate). He is an advocate for cloud computing, IoT, app modernization, containerization, and the architecture and design of large-scale distributed systems.

Chapter 9: Operational Excellence Pillars for Production-Grade IoT Solutions

Remember that you are on a mission to design a production-grade, large-scale IoT solution, not a hobby or fun IoT project. Therefore, you should make sure that your IoT solution is all of the following: fully secure; scalable; reliable; monitored; observed and fully controlled; resilient; fault-tolerant; highly performant; and cost-effective.

IoT solutions are typically complex and contain many systems, components, and layers, so delivering the aforementioned operational excellence aspects is quite challenging.

For example, to make sure your IoT solution is fully secure, you need to start that journey by first securing the IoT devices and microcontrollers, followed by IoT edge devices, IoT gateways, and local networks, before securing communication or the transportation link between the IoT devices and IoT cloud over the wide-area network, and finally securing the IoT backend cloud...

IoT solution security

An IoT security breach is dangerous. In traditional IT systems, the consequences of a security breach are serious but not usually considered life-threatening: they typically include data privacy violations or financial losses. In the IoT world, the story is different. A compromised IoT solution shares those common impacts of a hacked traditional IT system, plus other threats that can endanger human life. How? Let's talk about an IoT solution in the healthcare sector. Think about a patient who has been equipped with several IoT devices and sensors that frequently send the patient's biometrics to a healthcare professional, who assesses the patient's health status and intervenes quickly if needed. Now, if those devices have been hacked, what happens if the attacker sends fake or false biometrics data (note: in that situation...

IoT solution monitoring

Monitoring systems are critical components in any large-scale and production-grade IoT solution. You need a centralized monitoring solution (a single pane of glass) that tells you everything you need to know about the different IoT solution operations and performance. Such monitoring systems help in the following:

  • IoT solution troubleshooting activities.
  • Real-time alerting, which enables you to respond quickly to issues as they occur.
  • Proactively identifying the trends and anomalies from the IoT solution logs, traces, and metrics. This is where IoT solution monitoring systems help in securing IoT solutions.
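As an illustration of the last point, the following is a minimal sketch of spotting an anomaly in a stream of metric readings. It assumes a simple z-score check against a trailing window, which is only one of many possible techniques and is not taken from the book; the function name `find_anomalies`, the window size, the threshold, and the sample CPU readings are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(readings, window=5, threshold=3.0):
    """Return the indices of readings that deviate from the mean of the
    preceding `window` readings by more than `threshold` standard deviations.

    Hypothetical helper for illustration; a real monitoring system would
    normally rely on its own alerting and anomaly-detection features.
    """
    anomalies = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat window gives no scale to compare against.
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up CPU utilization samples (percent) reported by one device.
cpu = [41.0, 43.0, 42.0, 44.0, 42.0, 43.0, 97.0, 42.0]
print(find_anomalies(cpu))  # → [6]
```

The spike at index 6 is flagged because it sits far outside the variation seen in the five readings before it. In practice, the threshold and window would need tuning per metric to balance missed anomalies against false alerts.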

Before we go further with the typical monitoring system architecture that you could use for your IoT solution, we need to first understand the following concepts:

  • Metrics are data points that can easily be quantified, such as system metrics (for example, CPU, memory, disk utilization...

IoT solution high availability and resiliency

As an IoT solution architect, the first thing you do before starting the design of a highly available and resilient IoT solution is to ask the business stakeholders the following questions:

  • What are the expected and accepted Recovery Point Objectives (RPOs)?
  • What are the expected and accepted Recovery Time Objectives (RTOs)?
  • What are the expected and accepted Service-Level Agreements (SLAs) in terms of system availability and performance?
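To make the availability part of an SLA concrete, the arithmetic below converts an availability percentage into the downtime it permits over a period. This is a small sketch for illustration only; the function name `allowed_downtime_minutes` and the 30-day period are assumptions, and real SLAs define their own measurement periods and exclusions.

```python
def allowed_downtime_minutes(availability_percent, period_days=30):
    """Return the maximum downtime, in minutes, that still satisfies the
    given availability percentage over `period_days` days."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability allows about {minutes:.1f} minutes of downtime in 30 days")
```

For example, a 99.9% target over 30 days leaves roughly 43.2 minutes of downtime, which helps frame the conversation with business stakeholders about what each extra "nine" actually demands.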

Let's understand better what RPOs and RTOs are with the help of Figure 9.3:

Figure 9.3 – RPO and RTO

Typically, in large-scale and production-grade solutions, there is a backup activity that runs continuously to back up the system data (for example, databases and files) and system software artifacts such as golden virtual machine templates or container images and other artifacts.
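The relationship between backups, RPO, and RTO can be sketched in a few lines. The example below assumes the usual definitions: the data loss window is the time between the last backup taken before a failure and the failure itself (compared against the RPO), and the downtime is the time between the failure and the restoration of service (compared against the RTO). The function names, timestamps, and objectives are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def data_loss_window(backup_times, failure_time):
    """Return the time between the most recent backup taken at or before
    the failure and the failure itself, or None if no such backup exists."""
    earlier = [t for t in backup_times if t <= failure_time]
    if not earlier:
        return None
    return failure_time - max(earlier)


def check_objectives(backup_times, failure_time, restored_time, rpo, rto):
    """Compare the achieved data loss window and downtime with the agreed
    RPO and RTO (both given as timedelta values)."""
    loss = data_loss_window(backup_times, failure_time)
    downtime = restored_time - failure_time
    return {
        "data_loss": loss,
        "downtime": downtime,
        "rpo_met": loss is not None and loss <= rpo,
        "rto_met": downtime <= rto,
    }


# Made-up scenario: backups every six hours, failure at 14:30, service back at 15:15.
day = datetime(2022, 5, 1)
backups = [day, day + timedelta(hours=6), day + timedelta(hours=12)]
result = check_objectives(
    backups,
    failure_time=day + timedelta(hours=14, minutes=30),
    restored_time=day + timedelta(hours=15, minutes=15),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=1),
)
print(result["rpo_met"], result["rto_met"])
```

In this scenario, two and a half hours of data are lost and the service is down for 45 minutes, so a 4-hour RPO and a 1-hour RTO are both met. Tightening the RPO below two and a half hours would require more frequent backups.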

Backup is a must-have practice in any large-scale and production...

IoT solution automation and DevOps

To understand why automation is such an important aspect of best-in-class IoT solutions, think about the following: you have designed and built a highly available, scalable, resilient, and fully monitored IoT solution. One day, you get an alert from the IoT monitoring solution that a software bug or a security vulnerability has been discovered on the IoT device, IoT Edge, or an IoT backend component. Now, if you don't have an automated process and tools in place to quickly release a software patch, then you lose much of the benefit of having an IoT monitoring system. What is the point of being notified about issues if it takes a long time to fix them? Automation is what closes the loop for an IoT solution with high operational excellence.

Delays in releasing software patches to your IoT devices or the IoT cloud could have severe impacts on the company's brand...

Summary

In this chapter, you have learned about different non-functional aspects of IoT solutions such as security, monitoring, high availability, reliability, resiliency, and automation.

We have covered each of these non-functional, or operational excellence, aspects across the different IoT solution layers (IoT devices, IoT edge, communication, and the IoT cloud).

In the next and final chapter of this book, we will recap and summarize what you have learned by going through some of the E2E IoT solution reference architectures provided by different hyperscale cloud providers.
