When cloud computing comes up in a conversation, security is, very often, the main topic. When data leaves local data centers, many wonder what happens to it. We are used to having complete control over everything, from physical servers, networks, and hypervisors, to applications and data. Then, all of a sudden, we are supposed to transfer much of that to someone else. It's natural to feel a little tension and distrust at the beginning, but, if we dig deep, we'll see that cloud computing can offer us more security than we could ever achieve on our own.
Microsoft Azure is a cloud computing service provided through Microsoft-managed data centers dispersed around the world. Azure data centers are built to top industry standards and comply with the relevant standards and frameworks, such as ISO/IEC 27001:2013 and NIST SP 800-53, to name a couple. Compliance with these standards provides assurance that Microsoft Azure is built for security and reliability.
In this chapter, we'll learn about Azure security concepts and how security is structured in Microsoft Azure data centers, using the following topics:
- Exploring the shared responsibility model
- Physical security
- Azure network
- Azure infrastructure availability
- Azure infrastructure integrity
- Azure infrastructure monitoring
- Understanding Azure security foundations
Exploring the shared responsibility model
While Microsoft Azure is very secure, the responsibility for building a secure environment doesn't rest with Microsoft alone. Its shared responsibility model divides responsibility between Microsoft and its customers, and the split depends on which cloud service model is used:
- Infrastructure as a Service (IaaS)
- Platform as a Service (PaaS)
- Software as a Service (SaaS)
These models differ in terms of what is controlled by Microsoft and the customer. A general breakdown can be seen in the following diagram:
Let's look at these services in a little more detail.
In an on-premises environment, we, as users, take care of everything: the network, physical servers, storage, and so on. We need to set up virtualization stacks (if used), configure and maintain servers, install and maintain software, manage databases, and so on. Most importantly, all aspects of security are our responsibility: physical security, network security, host and OS security, and application security for all application software running on our servers.
PaaS gives Microsoft even more responsibility. We only take care of our applications. However, this still means looking after a part of the security. Some examples of PaaS in Microsoft Azure are Azure SQL Database and web apps.
SaaS gives a large amount of control away, and we manage very little, including some aspects of security. In Microsoft's ecosystem, a popular example of SaaS is Office 365; however, we will not discuss this in this book.
Now that we have a basic understanding of shared responsibility, let's understand how responsibility for security is allocated.
Division of security in the shared responsibility model
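Security responsibilities fall into three categories:
- Always controlled by the customer
- Always controlled by Microsoft
- Varies by service type
Regardless of the cloud service model, the customer always retains the following responsibilities:
- Data governance and rights management
- Endpoint protection
- Account and access management
Similarly, Microsoft always handles the following, regardless of the cloud service model:
- Physical data center
- Physical network
- Physical hosts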
Finally, there are a few security responsibilities that are allocated based on the cloud service model:
- Identity and directory infrastructure
- Operating system
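The division described in this section can be captured in a small lookup table. The following Python sketch is illustrative only: the names `Owner`, `RESPONSIBILITY`, and `owner_of` are hypothetical, and the entries are limited to the areas listed in this section.

```python
from enum import Enum


class Owner(Enum):
    CUSTOMER = "always the customer"
    MICROSOFT = "always Microsoft"
    VARIES = "varies by service type"


# Areas are taken from the lists in this section; the structure is hypothetical.
RESPONSIBILITY = {
    "data governance and rights management": Owner.CUSTOMER,
    "endpoint protection": Owner.CUSTOMER,
    "account and access management": Owner.CUSTOMER,
    "physical data center": Owner.MICROSOFT,
    "physical network": Owner.MICROSOFT,
    "physical hosts": Owner.MICROSOFT,
    "identity and directory infrastructure": Owner.VARIES,
    "operating system": Owner.VARIES,
}


def owner_of(area: str) -> Owner:
    """Return who is responsible for a security area (case-insensitive)."""
    return RESPONSIBILITY[area.strip().lower()]


print(owner_of("Physical hosts").value)    # always Microsoft
print(owner_of("Operating system").value)  # varies by service type
```

A table like this is handy as a checklist: anything that does not map to "always Microsoft" needs an owner on the customer side for at least some service models.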
Now that we know how security is divided, let's move on to one specific aspect of it: the physical security that Microsoft manages. This section is important as we won't discuss it in much detail in the chapters to come.
Physical security
Everything starts with physical security. No matter what we do to protect our data from attacks coming from outside our network, it would all be in vain if someone were to walk into a data center or server room and take away disks from our servers. Microsoft takes physical security very seriously in order to reduce the risk of unauthorized access to data and data center resources.
Azure data centers can be accessed only through strictly defined access points. A facility's perimeter is safeguarded by tall fences made of steel and concrete. To enter Azure data centers, a person needs to go through at least two checkpoints: first to enter the facility perimeter, and second to enter the building. Both checkpoints are staffed by professional and trained security personnel. In addition to the access points, security personnel patrol the facility's perimeter. The facility and its buildings are covered by video surveillance, which is monitored by security personnel.
After entering the building, two-factor authentication with biometrics is required to gain access to the inside of the data center. Once their identity is validated, a person can access only the approved parts of the data center. The approval defines not only which areas can be accessed but also how long the person may stay in them, and whether the person may enter alone or must be accompanied by someone.
Before accessing each area inside the data center, a mandatory metal detector check is performed. To prevent unauthorized data leaving or entering the data center, only approved devices are allowed. Additionally, all server racks are monitored from the front and back using video surveillance. When leaving a data center area, an additional metal detector screening is required. This helps Microsoft make sure that nothing that can compromise its data's security is brought in or removed from the data center without authorization.
A review of physical security is conducted periodically for all facilities. This aims to satisfy all security requirements at all times.
After equipment reaches the end of its life, it is disposed of securely, with rigorous data and hardware disposal policies. During the disposal process, Microsoft personnel ensure that data is not available to untrusted parties. All data devices are either wiped (if possible) or physically destroyed in order to render the recovery of any information impossible.
All Microsoft Azure data centers are designed, built, and operated in a way that satisfies top industry standards, such as ISO 27001, HIPAA, FedRAMP, SOC 1, and SOC 2, to name a few. In many cases, specific region or country standards are followed as well, such as Australia IRAP, UK G-Cloud, and Singapore MTCS.
As an added precaution, all data inside any Microsoft Azure data center is encrypted at rest. Even if someone managed to get their hands on disks with customers' data, which is virtually impossible with all the security measures, it would take an enormous effort (both from a financial and time perspective) to decrypt any of the data.
But in the cloud era, network security is as important as physical security, if not more so. Most services are accessed over the internet, and even isolated services depend on the network layer. So, next, we need to take a look at Azure network architecture.
Azure network
Networking in Azure can be separated into two parts: managed by Microsoft and managed by us. In this section, we will discuss the part of networking managed by Microsoft. It's important to understand the architecture, reliability, and security setup of this part to provide more context once we move to parts of network security that we need to manage.
As with Azure data centers generally, the Azure network follows industry standards with three distinct models/layers:
- Core
- Distribution
- Access
All three models use distinct hardware to completely separate all the layers. The core layer uses data center routers, the distribution layer uses access routers and L2 aggregation (this layer separates L3 routing from L2 switching), and the access layer uses L2 switches.
The network architecture includes two levels of L2 switches:
- First level: Aggregates traffic
- Second level: Loops to incorporate redundancy
This approach allows for more flexibility and better port scaling. Another benefit of this approach is that L2 and L3 are wholly separated, which allows for the use of distinct hardware for each layer in the network. Distinct hardware minimizes the chances of a fault in one layer affecting another one. The use of trunks allows for resource sharing for better connectivity. The network inside an Azure data center is distributed into clusters for better control, scaling, and fault tolerance.
In terms of network topology, Azure data centers contain the following elements:
- Edge network: An edge network represents a separation point between the Microsoft network and other networks (such as the internet or corporate networks). It is responsible for providing internet connectivity and ExpressRoute peering into Azure (covered in Chapter 4, Azure Network Security).
- Wide area network: The wide area network is Microsoft's intelligent backbone. It covers the entire globe and provides connectivity between Azure regions.
- Regional gateways network: A regional gateway is a point of aggregation for Azure regions and applies to all data centers within the region. It provides connectivity between data centers within the Azure region and enables connectivity with other regions.
- Data center network: A data center network enables connectivity between data centers and enables communication between servers within the data center. The data center network is based on a modified version of the Clos network. The Clos network uses the principle of multistage circuit-switching. The network is separated into three stages: ingress, middle, and egress. Each stage contains multiple switches and uses an r-way shuffle between stages. When a call is made, it enters the ingress switch and from there it can be routed to any available middle switch, and from the middle switch to any available egress switch. Because the number of devices (switches) in use is huge, the failure of any single device has a minimal impact. All devices are situated at different locations with independent power and cooling, so an environmental failure has a minimal impact as well.
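The routing property of a three-stage Clos network can be illustrated with a toy model. This Python sketch does not model Azure's actual (modified) fabric; the function names and switch counts are hypothetical. It enumerates the ingress-middle-egress routes for one pair of switches and checks Clos's classic condition for strict-sense non-blocking operation, m >= 2n - 1, where n is the number of inputs per ingress switch and m is the number of middle switches.

```python
def clos_paths(ingress: int, egress: int, middle_switches: int):
    """All routes from one ingress switch to one egress switch.

    In a three-stage Clos network every ingress switch links to every
    middle switch, and every middle switch links to every egress switch,
    so there is exactly one route per middle switch.
    """
    return [(f"ingress-{ingress}", f"middle-{m}", f"egress-{egress}")
            for m in range(middle_switches)]


def is_strictly_nonblocking(inputs_per_ingress: int, middle_switches: int) -> bool:
    """Clos's condition for strict-sense non-blocking: m >= 2n - 1."""
    return middle_switches >= 2 * inputs_per_ingress - 1


paths = clos_paths(ingress=0, egress=2, middle_switches=4)
print(len(paths))                      # 4

# Losing one middle switch removes one route; the others remain usable.
surviving = [p for p in paths if p[1] != "middle-1"]
print(len(surviving))                  # 3

print(is_strictly_nonblocking(4, 7))   # True
print(is_strictly_nonblocking(4, 6))   # False
```

The more middle switches there are, the smaller the share of routes lost when any one of them fails.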
Azure networking is built upon highly redundant infrastructure in each Azure data center. Implemented redundancy is need plus one (N+1) or better, with full failover features within, and between, Azure data centers. Full failover tolerance ensures constant network and service availability. From the outside, Azure data centers are served by dedicated, high-bandwidth network circuits that redundantly connect properties with over 1,200 Internet Service Providers (ISPs) on a global level. Edge capacity across the network is over 2,000 Gbps, which presents an enormous network potential.
Distributed Denial of Service (DDoS) is becoming a huge issue in terms of service availability. As the number of cloud services increases, DDoS attacks become more targeted and sophisticated. With the help of geographical distribution and quick detection, Microsoft can help you mitigate these DDoS attacks and minimize the impact. Let's take a look at this in more detail.
Azure infrastructure availability
Azure is designed, built, and operated to deliver highly available and reliable infrastructure. Improvements are constantly implemented to increase availability and reliability, along with efficiency and scalability. Delivery of a more secure and trusted cloud is always a priority.
Uninterruptible power supplies and vast banks of batteries ensure that the flow of electricity stays uninterrupted in case of short-term power disruptions. In the case of long-term power disruptions, emergency generators can provide backup power for days. Emergency power generators are used in cases of extended power outages or planned maintenance. In cases of natural disasters, when the external power supply is unavailable for long periods, each Azure data center has fuel reserves on-site.
Robust, high-speed fiber optic networks connect data centers to major hubs. It's important that, along with connections through major hubs, data centers are also connected directly to each other. Everything is distributed into nodes, which host workloads closer to users to reduce latency, provide geo-redundancy, and increase resiliency.
Data in Azure can be placed in two separate regions: primary and secondary regions. A customer can choose where the primary and secondary regions will be. The secondary region is a backup site. In each region, primary and secondary, Azure keeps three healthy copies of your data at all times. This means that six copies of the data are available at any time. If any data copy becomes unavailable at any time, it's immediately declared invalid, a new copy is created, and the old one is destroyed.
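The replication behavior described above can be sketched as a toy model. This is not how Azure Storage is implemented; the `Region` class and its methods are hypothetical and only mirror the numbers in the text: three healthy copies per region, two regions, and replacement of any copy that becomes unavailable.

```python
from dataclasses import dataclass, field
from itertools import count

COPIES_PER_REGION = 3  # from the text: three healthy copies per region

_ids = count(1)  # hypothetical generator of copy identifiers


@dataclass
class Region:
    name: str
    copies: dict = field(default_factory=dict)  # copy id -> is healthy?

    def __post_init__(self):
        for _ in range(COPIES_PER_REGION):
            self.copies[next(_ids)] = True

    def healthy_count(self) -> int:
        return sum(self.copies.values())

    def repair(self) -> int:
        """Destroy unavailable copies, create replacements, return how many."""
        bad = [cid for cid, healthy in self.copies.items() if not healthy]
        for cid in bad:
            del self.copies[cid]
            self.copies[next(_ids)] = True
        return len(bad)


primary, secondary = Region("primary"), Region("secondary")
print(primary.healthy_count() + secondary.healthy_count())  # 6

lost = next(iter(primary.copies))
primary.copies[lost] = False    # one copy becomes unavailable
print(primary.healthy_count())  # 2
print(primary.repair())         # 1
print(primary.healthy_count())  # 3
```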
Microsoft ensures high availability and reliability through constant monitoring, incident response, and service support. Each Azure data center operates 24/7/365 to ensure that everything is running, and all services are available at all times. Of course, available at all times is a goal that, ultimately, is impossible to reach. Many circumstances can impact uptime, and sometimes it's impossible to control all of them. Realistically, the aim is to achieve the best possible Service Level Agreement (SLA) so as to ensure that potential downtime is limited as far as possible. The SLA can vary based on a number of factors and is different per service and configuration. If we take into account all the factors we can control, the best SLA we can achieve would be 99.99%, also known as four nines.
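To put an SLA percentage into perspective, it helps to convert it into the downtime it allows. The short Python calculation below is a sketch: the function name is hypothetical, a 365-day year is assumed, and the 99.9% and 99.95% values are included only for comparison and are not taken from a specific Azure SLA.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def max_downtime_minutes(sla_percent: float,
                         period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Downtime allowed in a period by an availability percentage."""
    return period_minutes * (1 - sla_percent / 100)


for sla in (99.9, 99.95, 99.99):
    print(f"{sla}% -> {max_downtime_minutes(sla):.1f} minutes per year")
```

Under these assumptions, four nines allows roughly 52.6 minutes of downtime per year.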
Closely connected to infrastructure availability is infrastructure integrity. Integrity affects availability in terms of deployment, where all steps must be verified from different perspectives. New deployments must not cause any downtime or affect existing services in any way.
Azure infrastructure integrity
All software components installed in the Azure environment are custom built. This, of course, refers to software installed and managed by Microsoft as part of Azure Service Fabric. Custom software is built using Microsoft's Security Development Lifecycle (SDL) process, including operating system images and SQL databases. All software deployment is conducted as part of the strictly defined change management and release management process. All nodes and fabric controllers use customized versions of Windows Server 2019. The installation of any unauthorized software is not allowed.
VMs running in Azure are grouped into clusters. Each cluster contains around 1,000 VMs. All VMs are managed by the Fabric Controller (FC). The FC is scaled out and redundant. Each FC is responsible for the life cycle management of applications running in its own cluster. This includes the provisioning and monitoring of hardware in that cluster. If any server fails, the FC automatically rebuilds a new instance of that server.
Each Azure software component undergoes a build process (as part of the release management process) that includes virus scans using endpoint protection anti-virus tools. Because every component goes through this process, nothing reaches production without a clean virus scan. Each virus scan creates a log in the build directory and, if any issues are detected, the process for this component is frozen. Any software component for which an issue is detected undergoes inspection by Microsoft security teams in order to identify the exact issue.
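The gating logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Microsoft's pipeline: `scan` is a stand-in for a real anti-virus engine and simply flags a made-up marker string, and the function names, component names, and log file names are all hypothetical.

```python
import tempfile
from pathlib import Path


def scan(component: str, payload: bytes) -> bool:
    """Stand-in for a real anti-virus scan; True means the payload is clean."""
    return b"MALWARE-TEST-MARKER" not in payload


def release_gate(components: dict, build_dir: Path):
    """Scan each component, write one log per scan, freeze anything not clean."""
    released, frozen = [], []
    for name, payload in components.items():
        clean = scan(name, payload)
        log_file = build_dir / f"{name}.scan.log"
        log_file.write_text(f"{name}: {'clean' if clean else 'issue detected'}\n")
        (released if clean else frozen).append(name)
    return released, frozen


with tempfile.TemporaryDirectory() as tmp:
    released, frozen = release_gate(
        {"agent": b"ok", "driver": b"xx MALWARE-TEST-MARKER xx"}, Path(tmp))
    logs = sorted(p.name for p in Path(tmp).iterdir())

print(released)  # ['agent']
print(frozen)    # ['driver']
print(logs)      # ['agent.scan.log', 'driver.scan.log']
```

Frozen components would then be handed to a security team for inspection rather than retried automatically.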
Azure is a closed and locked-down environment. All nodes and guest VMs have their default Windows administrator account disabled. No user accounts are created directly on any of the nodes or guest VMs either. Administrators from Azure support can connect to them only with proper authorization to perform maintenance tasks and emergency repairs.
Even with all precautions taken to provide maximum availability and security, incidents may occur from time to time. To detect these issues and mitigate them as soon as possible, Microsoft has implemented monitoring and incident management.
Azure infrastructure monitoring
All hardware, software, and network devices in Azure data centers are constantly reviewed and updated. Reviews and updates are performed mandatorily at least once a year, but additional reviews and updates are performed as needed. Any changes (to hardware, software, or the network) must go through the release management process and need to be developed, tested, and approved in development and test environments prior to release to production. In this process, all changes must be reviewed and approved by the Azure security and compliance team.
All Azure data centers use integrated deployment systems for the distribution and installation of security updates for all software provided by Microsoft. If third-party software is used, the customer or software manufacturer is responsible for security updates, depending on how the software is provided and used. For example, if third-party software is installed using Azure Marketplace, the manufacturer is responsible for providing updates. If the software is manually installed, then it depends on the specific software. For Microsoft software, a special team within Microsoft, named Microsoft Security Response Center, is responsible for monitoring and identifying any security incident 24/7/365. Furthermore, any incident must be resolved in the shortest possible time frame.
Vulnerability scanning is performed across the Azure infrastructure (servers, databases, and network) at least once every quarter. If there is a specific issue or incident, vulnerability scanning is performed more often. Microsoft performs penetration tests, but also hires independent consultants to perform penetration tests. This ensures that nothing goes undetected. Any security issues are addressed immediately in order to increase security and stop any exploit when the issue is detected.
In case of any security issue, Microsoft has incident management in place. In the event that Microsoft becomes aware of a security issue, it takes the following actions:
- The customer is notified of the incident.
- An immediate investigation is started to provide detailed information regarding the security incident.
- Steps are taken to mitigate the effects and minimize the damage of the security incident.
Understanding Azure security foundations
Overall, we can see that with Microsoft Azure, the cloud can be very secure. But it's very important to understand the shared responsibility model as well. Just putting applications and data into the cloud doesn't make it secure. Microsoft provides certain parts of security and ensures that physical and network security is in place. Customers must assume part of the responsibility and ensure that the right measures are taken on their side as well.
For example, let's say we place our database and application in Microsoft Azure, but our application is vulnerable to SQL injection (still a very common data breach method). Can we blame Microsoft if our data is breached?
Let's be more extreme and say we publicly exposed the endpoint and forgot to put in place any kind of authentication. Is this Microsoft's responsibility?
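To make the SQL injection scenario concrete, here is a minimal Python sketch using the standard library's `sqlite3` module with an in-memory database. The table, column names, and data are hypothetical; the point is only to contrast string concatenation with bound parameters, which is the customer's responsibility regardless of where the database is hosted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "a-secret"), ("bob", "b-secret")])

malicious = "nobody' OR '1'='1"

# Vulnerable: user input is concatenated into the SQL text, so the
# attacker's quote characters change the meaning of the query.
unsafe = conn.execute(
    f"SELECT secret FROM users WHERE name = '{malicious}'").fetchall()

# Safer: the input is passed as a bound parameter and treated as data only.
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (malicious,)).fetchall()

print(len(unsafe))  # 2 (every row leaks)
print(len(safe))    # 0 (no user has that literal name)
```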
If we look at the level of physical and network security that Microsoft provides in Azure data centers, not many organizations can say that they have the same level in their local data centers. More often than not, physical security is totally neglected. Server rooms are not secure, access is not controlled, and many times there is not even a dedicated server room, but just server racks in some corner or corridor. Even when a server room is under lock and key, no change management is in place, and no one controls or reviews who is entering the server room and why. On the other hand, Microsoft implements top-level security in its data centers. Everything is under constant surveillance, and every access needs to be approved and reviewed. Even if something is missed, everything is still encrypted and additionally secured. In my experience, this is again something that most organizations don't bother with.
Similar things can be said about network security. In most organizations, almost all network security is gone after the firewall. Networks are usually unsegmented, no traffic control is in place inside the network, and so on. Routing and traffic forwarding are basic or non-existent. Microsoft Azure again addresses these problems very well and helps us have secure networks for our resources.
But even with all the components of security that Microsoft takes care of, this is only the beginning. Using Microsoft Azure, we can achieve better physical and network security than we could in local data centers, and we can concentrate on other things.
The shared responsibility model has different responsibilities for different cloud service models, and it's sometimes unclear what needs to be done. Luckily, even if it's not Microsoft's responsibility to address these parts of security, there are many security services available in Azure. Many of Azure's services have the single purpose of addressing security and helping us protect our data and resources in Azure data centers. Again, it does not stop there. Most of Azure's services have some sort of security features built-in, even when these services are not security-related. Microsoft takes security very seriously and enables us to secure our resources with many different tools.
The tools available vary from tools that help us to increase security by simply enabling a number of options, to tools that have lots of configuration options that help us design security, to tools that monitor our Azure resources and give us security recommendations that we need to implement. Some Azure tools use machine learning to help us detect security incidents in real time, or even before they happen.
This book will cover all aspects of Microsoft Azure security, from governance and identity, to network and data protection, to advanced tools. The final goal is to understand cloud security, to learn how to combine different tools to maximize security, and finally, to master Azure security!
Summary
The most important lesson in this chapter is to understand the shared responsibility model in Azure. Microsoft takes care of some parts of security, especially in terms of physical security, but we need to take care of the rest.
With Azure networking, integrity, availability, and monitoring, we don't have any influence and can't change anything (at least in the sections we discussed here). However, they are important to understand as we can apply a lot of things in the parts of security that we can manage. They will also provide more context and help us to better understand the complete security setup in Azure.
In the next chapter, we will move on to identity, which is one of the most important pillars of security. In Azure, identity is even more important, as most services are managed and accessible over the internet. Therefore, we need to take additional steps to make identity and access secure and bulletproof.
Questions
As we conclude, here is a list of questions for you to test your knowledge regarding this chapter's material. You will find the answers in the Assessments section of the Appendix:
- Whose responsibility is security in the cloud?
B. Cloud provider's
C. Responsibility is shared
- According to the shared responsibility model, who is responsible for the security of physical hosts?
B. Cloud provider
- According to the shared responsibility model, who is responsible for the physical network?
B. Cloud provider
C. Depends on the service model
- According to the shared responsibility model, who is responsible for network controls?
B. Cloud provider
C. Depends on the service model
- According to the shared responsibility model, who is responsible for data governance?
B. Cloud provider
C. Depends on the service model
- Which architecture is used for Azure networking?
B. Quantum 10 (Q10)
C. Both, but DLA is replacing Q10
D. Both, but Q10 is replacing DLA
- In case of a security incident, what is the first step?
A. Immediate investigation
C. Customer is notified