Introduction to Azure VMware Solution
Azure VMware Solution (AVS) is a first-party Microsoft Azure service, developed in conjunction with VMware, that provides a familiar vSphere-based, single-tenant private cloud on Azure. The VMware technology stack consists of the following components: vSphere, NSX-T, vSAN, and HCX. AVS runs natively on dedicated infrastructure in Azure data centers. It offers a user experience that is consistent with existing on-premises VMware environments. Customers can deploy an AVS environment in a matter of hours and migrate Virtual Machine (VM) resources in a matter of minutes. Microsoft supplies all the networking, storage, management, and support services that are required.
The following diagram depicts connectivity between your on-premises infrastructure and Microsoft Azure via ExpressRoute, including your AVS private cloud and other Azure-native services:
Figure 1.1 – Connectivity relationship between your private clouds and AVS VNets
In this chapter, we’re going to cover the following main topics:
- Network connectivity to AVS
- AVS high-level architecture
- Use cases for AVS in the enterprise
- Enterprise-scale for AVS
- Network and connectivity topologies
- Identity and access management
- Business continuity and disaster recovery
- Security, governance, and compliance
- Management and monitoring
Network connectivity to AVS
AVS provides a private cloud environment that can be accessed from both on-premises and Azure-based infrastructure resources. Connectivity is provided through Azure ExpressRoute, Virtual Private Network (VPN) connections, or Azure Virtual WAN.
To make these services available, specific network address ranges and firewall ports must be configured.
When a private cloud is deployed, private networks are created for management, provisioning, and vMotion. These private networks are used to connect to vCenter and NSX-T Manager, as well as to perform virtual machine vMotion and deployment. The management network requires a /22 CIDR address block. This /22 is only used for the management components and not for your workload segments, so you will need additional networks for your workloads.
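As a minimal sketch of the addressing rule above, the following Python snippet uses the standard-library ipaddress module to confirm that a proposed management block is a /22 and to flag workload segments that collide with it. The function names and the example address ranges are hypothetical and chosen only for illustration. The overlap check reflects the point that workloads need their own networks; it is not a documented AVS validation routine.

```python
import ipaddress


def check_management_block(cidr):
    """Parse cidr and confirm that it is an IPv4 /22 block.

    Raises ValueError for malformed input, for blocks with host bits set,
    and for any prefix length other than /22.
    """
    network = ipaddress.ip_network(cidr, strict=True)
    if network.version != 4 or network.prefixlen != 22:
        raise ValueError(f"{cidr} is not an IPv4 /22 block")
    return network


def overlapping_segments(management, workload_cidrs):
    """Return the workload CIDRs that overlap the management block."""
    return [
        cidr
        for cidr in workload_cidrs
        if management.overlaps(ipaddress.ip_network(cidr))
    ]


# Hypothetical address plan, for illustration only.
management = check_management_block("10.10.0.0/22")
print(management.num_addresses)  # a /22 spans 1,024 addresses
print(overlapping_segments(management, ["10.10.2.0/24", "10.20.0.0/24"]))
```

In this hypothetical plan, the first workload segment sits inside the management block and would have to be moved, while the second is separate and can be used as is.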
You can link private clouds to on-premises systems using ExpressRoute Global Reach, which establishes direct connections between circuits at the Microsoft Enterprise Edge (MSEE). For the connection to work, your subscription must have a Virtual Network (VNet) with an ExpressRoute circuit to on-premises. Global Reach is needed because VNet gateways (ExpressRoute gateways) cannot transfer traffic between circuits: you can connect two circuits to the same gateway, but traffic will not be routed from one circuit to the other.
Each AVS environment is its own ExpressRoute region (and, thus, has its own virtual MSEE device), which allows you to connect Global Reach to the “local” peering location. This makes it possible to connect several AVS instances in a single region to the same peering location.
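The routing behavior described above can be pictured with a small toy model. The sketch below is not an Azure API; the class, its method names, and the VNet and circuit labels are all hypothetical. It encodes only two rules taken from the text: a VNet gateway reaches every circuit attached to it but does not pass traffic between them, and Global Reach links two circuits directly.

```python
class ExpressRouteModel:
    """Toy model of ExpressRoute gateway and Global Reach reachability."""

    def __init__(self):
        self.gateway_attachments = set()  # (vnet, circuit) pairs
        self.global_reach_links = set()   # frozenset({circuit_a, circuit_b})

    def attach_circuit(self, vnet, circuit):
        """Connect a circuit to the ExpressRoute gateway of a VNet."""
        self.gateway_attachments.add((vnet, circuit))

    def enable_global_reach(self, circuit_a, circuit_b):
        """Link two circuits directly, as Global Reach does at the MSEE."""
        self.global_reach_links.add(frozenset((circuit_a, circuit_b)))

    def vnet_reaches_circuit(self, vnet, circuit):
        return (vnet, circuit) in self.gateway_attachments

    def circuits_exchange_traffic(self, circuit_a, circuit_b):
        # Sharing a gateway is deliberately not enough: the gateway does
        # not transfer traffic from one circuit to another.
        return frozenset((circuit_a, circuit_b)) in self.global_reach_links


model = ExpressRouteModel()
model.attach_circuit("hub-vnet", "avs-circuit")
model.attach_circuit("hub-vnet", "onprem-circuit")

before = model.circuits_exchange_traffic("avs-circuit", "onprem-circuit")
model.enable_global_reach("avs-circuit", "onprem-circuit")
after = model.circuits_exchange_traffic("avs-circuit", "onprem-circuit")
print(before, after)
```

In the model, the hub VNet can reach both circuits from the start, but the AVS and on-premises circuits only exchange traffic once the Global Reach link is added.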
AVS hosts, clusters, and private clouds
AVS private clouds and clusters are built on dedicated, bare-metal, hyper-converged Azure infrastructure hosts. At the time of writing, the High-End (HE) hosts have 576 GB of RAM and dual Intel 18-core 2.3 GHz CPUs. In addition, the hosts are equipped with two vSAN disk groups, each of which contains a raw vSAN capacity layer of 15.36 TB (SSD) and a 3.2 TB (NVMe) vSAN cache tier. See the following hardware and software configurations:
AVS Software Specification
ESXi – 7.0U3c Enterprise Plus.
vCenter – 7.0U3c Standard.
vSAN – 7.0U3c Enterprise.
NSX-T – 3.1.2 Datacenter.
HCX Enterprise is also available. Submit a Microsoft support ticket to get an upgrade.
Table 1.1 – AVS software specification
Figure 1.2 – AVS hardware SKUs
There is a minimum of 3 nodes per vSphere cluster, and a maximum of 16 nodes per vSphere cluster, 12 clusters per private cloud instance, and a maximum of 96 nodes per Azure private cloud instance. You can review the Microsoft documentation at this link for more information: https://docs.microsoft.com/en-us/azure/azure-vmware/concepts-private-clouds-clusters#clusters.
As you can see from the preceding information, you can scale your private cloud to meet your workload demands.
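To make these limits concrete, here is a minimal Python sketch that checks a proposed cluster layout against the figures quoted above (3 to 16 nodes per cluster, 12 clusters and 96 nodes per private cloud). The function name and the sample layouts are hypothetical, and the limits are simply copied from the text, so verify them against the current Microsoft documentation before relying on them.

```python
from typing import List

# Limits as quoted in the text at the time of writing.
MIN_NODES_PER_CLUSTER = 3
MAX_NODES_PER_CLUSTER = 16
MAX_CLUSTERS_PER_PRIVATE_CLOUD = 12
MAX_NODES_PER_PRIVATE_CLOUD = 96


def validate_private_cloud(cluster_sizes: List[int]) -> List[str]:
    """Return a list of limit violations; an empty list means the layout fits."""
    problems = []
    if len(cluster_sizes) > MAX_CLUSTERS_PER_PRIVATE_CLOUD:
        problems.append(
            f"{len(cluster_sizes)} clusters exceeds the maximum of "
            f"{MAX_CLUSTERS_PER_PRIVATE_CLOUD}"
        )
    for index, nodes in enumerate(cluster_sizes, start=1):
        if nodes < MIN_NODES_PER_CLUSTER:
            problems.append(
                f"cluster {index}: {nodes} nodes is below the minimum of "
                f"{MIN_NODES_PER_CLUSTER}"
            )
        elif nodes > MAX_NODES_PER_CLUSTER:
            problems.append(
                f"cluster {index}: {nodes} nodes exceeds the maximum of "
                f"{MAX_NODES_PER_CLUSTER}"
            )
    total = sum(cluster_sizes)
    if total > MAX_NODES_PER_PRIVATE_CLOUD:
        problems.append(
            f"{total} nodes in total exceeds the maximum of "
            f"{MAX_NODES_PER_PRIVATE_CLOUD}"
        )
    return problems


# Hypothetical layouts: the first fits, the second has too many nodes overall.
print(validate_private_cloud([3, 16, 8]))
print(validate_private_cloud([16] * 7))
```

Note that the per-cluster and per-cloud limits interact: seven full clusters stay under the cluster count limit but exceed the 96-node total.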
AVS high-level architecture
AVS provides a private cloud environment that can be accessed from both on-premises and Azure-based infrastructure. Connectivity includes services such as Azure ExpressRoute, VPN connections, and Azure Virtual WAN.
Specific network address ranges and firewall ports are required for these services to be enabled.
When a private cloud is deployed, private networks are created for management, provisioning, and VM movement (vMotion).
These private networks are used to connect to vCenter and NSX-T Manager, as well as for VM vMotion and deployment. You can review the Microsoft documentation at this link for more information: https://learn.microsoft.com/en-us/azure/azure-vmware/tutorial-network-checklist#routing-and-subnet-considerations. ExpressRoute Global Reach makes it possible to connect private clouds to on-premises environments. Global Reach establishes direct connections between Azure ExpressRoute circuits at the MSEE level. The connection requires a VNet in your subscription with an ExpressRoute circuit to on-premises. The reason for this is that VNet gateways (ExpressRoute gateways) are unable to transfer traffic between circuits. This means that you can connect two circuits to the same gateway, but the traffic will not be transferred from one circuit to the other.
Each AVS environment is deployed with its own 10 Gbps ExpressRoute circuit (and, thus, its own virtual MSEE device), which allows you to connect Global Reach to the “local” peering location. It enables you to connect several AVS instances in a single region to the same peering site by using a VNet interface.
See the following high-level AVS networking overview:
Figure 1.3 – An overview of high-level AVS networking
The preceding diagram shows the logical connections between AVS and the customer’s on-premises data center. It also shows the connection between AVS and Azure. Global Reach is used to connect two or more ExpressRoute circuits.
Use cases for AVS in an enterprise
You can migrate your VMware workloads from your on-premises data center to AVS and integrate additional Azure services with ease, using the same VMware tools that you are already familiar with. Although there are other advantages, we’ve identified the top five reasons why AVS is proving to be the most cost-effective path to the cloud for many enterprises.
Data center footprint reduction, consolidation, and retirement
vSphere-based workloads can be migrated to AVS in a non-disruptive, automated, scalable, and highly available manner without having to change the underlying vSphere hypervisor.
Data center expansion based on demand
Customers can increase their data center capacity in a seamless and elastic manner, while also adjusting their costs on demand for short periods of time. We see this kind of need in logistics businesses, where customers need to increase their data center capacity for a period and then decrease that capacity once it is no longer needed.
Disaster recovery and business continuity
Customers who require a backup data center in the cloud can use AVS as a primary or secondary on-demand DR site for their on-premises data center infrastructure.
Speed and simplification of migration/hybrid cloud
AVS has proven to be one of the most efficient and straightforward methods of getting started on Azure without having to make any changes to your existing apps or servers.
AVS is very cost-effective
When it comes to running VMware workloads based on Windows Server and SQL Server, AVS is the most cost-effective option. If you use your on-premises data center effectively, you can save money by not having to purchase separate licenses for both on-premises and cloud applications. When you migrate to AVS, you will receive 3 years of free Extended Security Updates (ESU) for Windows and SQL Server 2008/2008R2/2012.
Enterprise-scale for AVS
Enterprise-scale for AVS is a collection of open source Azure Resource Manager and Bicep templates that can be used for AVS planning and deployment. You can think of it as a roadmap for building a scalable AVS environment that supports future growth. This open source solution gives you an example of how to set up Azure landing zone subscriptions for a scalable AVS. The implementation follows the architecture and best practices of the Cloud Adoption Framework’s Azure landing zones, with a focus on the design principles of a large-scale deployment.
To make your landing zone more efficient, design it for scalability from the start. Following this guidance when making design decisions will help your organization to grow.
Organizations adopt AVS in many different ways. You can use enterprise-scale for AVS to build a structure that works for you and puts your organization on a path to long-term growth.
To assist you with your AVS setup, enterprise-scale for AVS offers the following resources:
- A modular approach that lets you customize environment variables
- Helpful recommendations to assess the most important decisions
- A landing zone design that you can use for reference to set up your AVS deployment
- A deployment that includes the following:
  - A reference architecture to deploy your AVS environment
  - A reference architecture approved by Microsoft
Prerequisites for the implementation of the enterprise-scale landing zone for AVS
There are multiple design guidelines that you will need to go through when creating your landing zone for AVS. The following is a list of areas that you will need to focus on when creating an AVS enterprise-scale landing zone:
- Network and connectivity topology
- Identity and access management
- Business Continuity and Disaster Recovery (BCDR)
- Security, governance, and compliance
- Management and monitoring
- Platform automation
Let us dig a bit deeper into these design areas to provide you with some more detailed information.
Network and connectivity topologies
For both cloud-native and hybrid scenarios, implementing a VMware Software-Defined Data Center (SDDC) with the Azure cloud ecosystem has some unique design challenges to think about when planning for your deployment. Some of these challenges are outlined as follows:
- Hybrid connectivity: This is the connectivity between your on-premises environment and your AVS private cloud. If you already have a presence in Azure, look at the method you currently use to connect your on-premises data center to Azure. If there is no existing connectivity, make sure you understand what the options are (ExpressRoute, site-to-site (S2S) VPN, or SD-WAN). We will dive deeper into these areas in a later chapter.
- Reliability and performance: This is very important as you will need to have consistent and low latency for your workloads. You will also need to design for scalability for future growth.
- A zero-trust network security model: Security should be the heart of every solution that you implement in Azure, and AVS is no exception. You will need to plan for security for your network perimeter, and for traffic inspection for ingress and egress flows.
- Extensibility: Your network footprint should be easily extended without the need for a redesign. This is very important as your AVS needs grow.
The following diagrams illustrate common AVS connectivity and traffic flow scenarios:
- AVS without any connectivity:
Figure 1.4 – An overview of AVS deployment without any connectivity
- AVS with Global Reach enabled:
The preceding diagram shows a BGP traffic flow (blue dotted arrows) from AVS to the customer’s on-premises data center. BGP traffic will flow between both environments once Azure Global Reach is enabled.
- AVS with Global Reach enabled – BGP traffic flowing to Azure from AVS:
Figure 1.6 – The BGP traffic flow from AVS to Azure-native services through the customer MSEE
- Connection between AVS and Azure-native services:
The preceding diagram shows the BGP traffic flow from AVS to Azure-native services through the customer’s ExpressRoute gateway. This connection is only to Azure services and not to the customer’s on-premises environment.
- Internet traffic flow from AVS via a vWAN:
Figure 1.8 – Internet traffic flow from AVS via a secure Azure Virtual WAN
- Internet traffic flow from AVS via an Azure Route Server and a Network Virtual Appliance (NVA):
- Internet traffic flow from AVS via the customer on-premises firewall:
Figure 1.10 – Internet traffic flow from AVS via the customer’s on-premises infrastructure
Identity and access management
There are different identity requirements for AVS based on how it’s set up in Azure. AVS comes with a built-in user called cloudadmin in the new environment’s vCenter. This user is assigned the CloudAdmin role, which grants extensive privileges in vCenter. You can also create new roles in your AVS environment that follow the principle of least privilege:
- Active Directory Domain Services (AD DS): It is highly recommended to deploy an AD DS domain controller in your identity subscription in Azure. This allows user authentication to be handled in Azure rather than being sent back to the customer’s on-premises environment.
- Least-privilege roles: Allow only a small number of people to have the CloudAdmin role. When assigning users to AVS, use custom roles and as few permissions as possible.
- Resource-based access control: People who need to manage AVS, including delegated users, should only have Role-Based Access Control (RBAC) permissions for the resource group where AVS is deployed.
- vSphere permissions: Only set up vSphere permissions with custom roles at the top level if it is truly necessary. It’s better to grant permissions on the appropriate VM folder or resource pool. In general, do not apply any kind of vSphere permissions at or above the level of the data center.
- Active Directory sites and services: Ensure that Active Directory sites and services are configured with the appropriate and respective client IP subnets to provide a better authentication experience when attempting to locate the nearest domain controller.
- Active Directory groups: When you set up groups in Active Directory, you can use RBAC to manage vCenter and NSX-T. You can make your own roles and assign them to Active Directory groups.
Business continuity and disaster recovery
It is important for an organization and its enterprise application workloads to meet their Recovery Time Objective (RTO) and Recovery Point Objective (RPO) goals. Effective BCDR design meets these needs at the platform level. To design DR capabilities, you first need to understand your platform requirements.
Design considerations for AVS BC
Choose a backup solution that has been proven to work for VMware VMs, such as Microsoft Azure Backup Server (MABS) or a solution from one of the backup service providers. Some backup considerations for AVS are listed as follows:
- When you set up MABS, make sure it is in the same Azure region as your AVS private cloud. This method saves money on traffic costs, makes it easier to manage, and keeps the primary/secondary topology the same.
- There are two ways to run MABS: you can run it as an Azure VM in your Azure-native environment, or you can run it as a VM within your AVS private cloud. It is strongly recommended to deploy it outside of the AVS private cloud, in a VNet that has connectivity to AVS via ExpressRoute.
- To get help restoring from a backup for parts of the AVS platform, such as vCenter, NSX Manager, or HCX Manager, you will need to create an Azure support request.
- Partner backup solutions are also available, for example, from Dell Technologies.
Design considerations for AVS DR
- Make sure that the business needs match up with the recovery time, capacity, and recovery point goals for your applications and VM tiers. Plan and design accordingly, and choose the right replication technology to achieve these goals. Technologies such as SQL Server Always On availability groups, VMware Site Recovery Manager (SRM), and Azure Site Recovery (ASR) are some ideal solutions to implement as part of your DR strategy.
- VMware SRM is a very good option for failing over your AVS private cloud to a second AVS private cloud in case of a disaster, so you can keep your business running. Please note that VMware SRM is not included in your AVS subscription; it is an add-on that requires a separate license.
- ASR is another solution that you can use to recover your AVS workloads to Azure IaaS.
- There are also partner solutions such as JetStream Software that you can use to implement your DR solution for AVS.
- Decide which of your AVS workloads need to be protected if there is a DR situation. Consider only protecting the workloads that are important to your business to keep the costs down.
- Make sure to have copies of your domain controllers in your secondary environment.
- Make sure both backend ExpressRoute circuits have ExpressRoute Global Reach turned on. This will make it possible for DR to happen between AVS private clouds in different Azure regions. These circuits connect the main private cloud to the secondary private cloud when DR solutions such as VMware SRM and VMware HCX are used.
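The first consideration in the list above, matching a replication design to the RTO and RPO of each application tier, can be expressed as a simple check. The following Python sketch is illustrative only: the class names, tier values, and plan values are hypothetical, and it assumes that worst-case data loss is roughly one replication interval, which is a simplification rather than a property of any specific DR product.

```python
from dataclasses import dataclass


@dataclass
class TierObjectives:
    name: str
    rpo_minutes: int  # maximum tolerable data loss
    rto_minutes: int  # maximum tolerable downtime


@dataclass
class ReplicationPlan:
    replication_interval_minutes: int
    estimated_recovery_minutes: int


def meets_objectives(tier: TierObjectives, plan: ReplicationPlan) -> bool:
    """Check a plan against a tier's objectives.

    Assumption: worst-case data loss is about one replication interval.
    """
    return (
        plan.replication_interval_minutes <= tier.rpo_minutes
        and plan.estimated_recovery_minutes <= tier.rto_minutes
    )


# Hypothetical tier and plans, for illustration only.
tier1 = TierObjectives(name="tier-1", rpo_minutes=15, rto_minutes=60)
frequent = ReplicationPlan(replication_interval_minutes=5, estimated_recovery_minutes=45)
infrequent = ReplicationPlan(replication_interval_minutes=30, estimated_recovery_minutes=45)

print(meets_objectives(tier1, frequent))
print(meets_objectives(tier1, infrequent))
```

A check like this is only a planning aid. Actual recovery times should be confirmed through regular DR testing.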
Security, governance, and compliance
In this section, we will talk about how to make sure that AVS is safe to use and that you can manage it from start to finish. We will look at some specific design elements and give specific advice for the security, governance, and compliance of your AVS.
It is important to make sure that you have your security components planned out before you deploy any solution in Azure. AVS is no exception. In the following, we will look at some of the key factors to consider:
- Limits on permanent access: In the Azure resource group that hosts the AVS private cloud, the Contributor role is used. This role is used by the AVS service. To keep contributor rights from being misused, limit permanent access. Using a privileged account management tool can help you keep track of and limit how long highly privileged accounts are used.
- Centralized identity management: AVS provides cloud administrator and network administrator credentials that can be used to set up the VMware environment. These credentials are visible to everyone who has RBAC access to AVS.
To restrict the built-in cloudadmin and network administrator users’ access to the VMware control plane, use the control plane RBAC features to properly control role and account access. Following least-privilege principles, create multiple targeted identity objects, such as users and groups. Limit access to the administrator accounts provided by AVS and set them up in a break-glass configuration. Use the built-in accounts only when no other administrative account can be used.
Use the cloudadmin account to connect Azure AD DS with the VMware vCenter and NSX-T control applications and with the administrative identities for the domain services. Use users and groups from your domain to manage and operate AVS, and don’t share accounts. Create custom vCenter roles and link them to AD DS groups so that you can control access to VMware control surfaces with fine-grained privileges.
AVS provides options to change and reset the passwords of the vCenter and NSX-T administrator accounts. Rotate these accounts regularly, and rotate them again each time the break-glass configuration is used.
- Storage space on your vSAN: You need to have sufficient free space on your vSAN to maintain your VMware Service-Level Agreement (SLA). A minimum of 25 percent free space on your vSAN is required by VMware.
- Host quota: If the host quota is insufficient, there could be delays of up to 7 days before you get more capacity for growth or DR. Take growth and DR into account when you request the host quota, and check the environment’s growth and maximums on a regular basis to make sure there is enough lead time for expansion requests. For example, suppose a three-node AVS cluster needs three more nodes for DR. Because you need six nodes in total, request a quota of six hosts instead of just the three primary nodes. Requesting a host quota doesn’t cost extra.
- Access to the ESXi: There is limited access to the ESXi hosts. Some third-party software that needs access to the ESXi host might not work. Identify any AVS-supported third-party software in the source environment that needs access to the ESXi host from AVS. Make sure you know how to use the AVS support request process in the Azure portal when you need to get into the ESXi host.
- Country and/or industry regulatory compliance
- Data retention
- Corporate policies
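The host quota and vSAN free-space items in the list above both come down to simple arithmetic, which the following Python sketch spells out. The function names and sample capacities are hypothetical. The 25 percent threshold is taken from the vSAN bullet above, and the quota example mirrors the three-plus-three node case from the host quota bullet.

```python
# Minimum free vSAN space, as stated in the text.
VSAN_MIN_FREE_FRACTION = 0.25


def hosts_to_request(primary_nodes, dr_nodes):
    """Host quota to request: primary capacity plus DR capacity."""
    return primary_nodes + dr_nodes


def vsan_free_fraction(capacity_tb, used_tb):
    """Fraction of vSAN capacity that is still free."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return (capacity_tb - used_tb) / capacity_tb


def meets_vsan_slack(capacity_tb, used_tb, minimum=VSAN_MIN_FREE_FRACTION):
    """True when the free fraction is at or above the required minimum."""
    return vsan_free_fraction(capacity_tb, used_tb) >= minimum


# Hypothetical figures, for illustration only.
print(hosts_to_request(primary_nodes=3, dr_nodes=3))
print(meets_vsan_slack(capacity_tb=100.0, used_tb=70.0))
print(meets_vsan_slack(capacity_tb=100.0, used_tb=80.0))
```

In practice, you would feed a check like this from your monitoring data and alert well before the threshold is reached, since expanding capacity depends on having host quota available.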
Let us look at compliance in more detail:
- Microsoft Defender for Cloud monitoring: When you use Defender for Cloud, you can use the regulatory compliance view to make sure that you are meeting the required security and regulatory standards. Defender for Cloud workflow automation can be set up to track any deviation from the required compliance policies.
- Workload VM backup compliance: Ensure your AVS guest VMs are being backed up. We mentioned earlier the importance of backing up your AVS in case of a disaster.
- Country- or industry-specific regulatory compliance: If you want to avoid costly legal action or fines, make sure your guest workloads for AVS follow local and industry-specific regulations. It’s important to know how the cloud-shared responsibility model works for different industrial or regional regulatory compliance.
- Data retention and residency requirements: AVS doesn’t support retaining or extracting the data stored in clusters. This means that when you delete a cluster, it stops all running workloads and components and also destroys all the cluster’s data and settings, such as public IP addresses. You will not be able to recover the deleted data.
- Corporate policy compliance: Keep an eye on the guest workloads in AVS to make sure they don’t break company rules and regulations. Use solutions such as Azure Arc-enabled servers and Azure Policy, or a similar third-party solution. Routinely check and manage AVS guest VMs and applications to make sure they meet the required internal and external regulations.
Management and monitoring
Building an AVS environment with optimal management and monitoring capabilities will help you get the best out of the solution.
Look at the following tips for managing and monitoring your AVS platform:
- Keep track of the metrics that matter most to your operations teams and make alerts and dashboards that show them.
- vSAN storage space is limited, so you need to keep an eye on vSAN capacity. Only use vSAN storage for guest VM workloads. VMware requires you to have a minimum of 25 percent free space on the vSAN (that is, no more than 75 percent used) to maintain the SLA. It is also recommended that you use Azure Blob Storage to store your backups instead of using vSAN storage.
- A local identity provider is used by AVS. After you set up AVS, use a single administrative user account for the first configurations. Active Directory integration is highly recommended, since it provides a way to track the actions of each user.
Summary
AVS is a first-party Microsoft Azure service built in collaboration with VMware that delivers a familiar vSphere-based, single-tenant, private cloud on Azure. The VMware technology stack includes vSphere, NSX-T, vSAN, and HCX. AVS is deployed natively on dedicated infrastructure in Azure data centers. AVS provides a consistent, well-known user experience with existing on-premises VMware environments. Customers can deploy an AVS environment in just a few hours and quickly migrate VM resources. Microsoft provides all necessary networking, storage, management services, and support.
Throughout this chapter, we went over the critical design areas to help you design, implement, secure, and manage AVS.
Some of the critical design areas we covered were as follows:
- AVS overview
- Use cases for AVS
- Enterprise-scale for AVS
- Identity and access management
- Security, governance, and compliance
You should now understand what AVS is and the use cases for the solution.
In the next chapter, we will go deeper into enterprise-scale for AVS and the available guidelines, and examine the overall architecture more closely.