You're reading from VMware Cloud on AWS Blueprint

Product type: Book

Published in: Feb 2024

Publisher: Packt

ISBN-13: 9781803238197

Edition: 1st Edition

Concepts

Cloud Computing

Authors (3):

Oleg Ulyanov

Michael Schwartzman

Harsha Sanku

Demystifying vSAN and host storage architecture

Let us explore the architecture of the storage subsystem within the VMware Cloud on AWS SDDC.

VMware vSAN overview

vSAN stands for virtual storage area network, an object-based storage solution that leverages locally attached physical drives. VMware has offered vSAN for some time, and it is now a mature storage solution, well represented in Gartner’s Magic Quadrant and powering millions of customer workloads.

vSAN combines locally attached disks into a single, cluster-wide datastore that supports simultaneous access from multiple ESXi hosts. All vSAN traffic traverses the physical network through a dedicated vSAN VMkernel interface. A datastore shared across all hosts in a vSphere cluster enables key vSphere features, including live vMotion between hosts, vSphere HA to restart virtual machines from a failed host on the surviving hosts in the cluster, and DRS.

The vSAN distributed architecture with local storage fits perfectly into the cloud world, eliminating dependencies on external storage. Easily scalable by adding new hosts and providing enterprise-level storage functionality (deduplication, compression, data-at-rest encryption, etc.), vSAN forms the foundation of VMware Cloud on AWS.

vSAN on VMware Cloud on AWS high-level architecture

While VMware Cloud on AWS leverages VMware vSAN in a way very similar to on-premises, there are still a number of notable architectural differences.

NVMe SSDs with a native physical sector size of 4,096 bytes are used by all instance types in VMware Cloud on AWS.

At the moment, VMware Cloud on AWS features the vSAN Original Storage Architecture (OSA), which distinguishes between the caching and capacity tiers. Each host type features its own configuration of disk groups. With the release of vSphere 8, VMware introduced a new vSAN architecture model, the vSAN Express Storage Architecture (ESA). With the recent 1.24 SDDC release, vSAN ESA has been made available to selected customers under preview (https://vmc.techzone.vmware.com/resource/vsan-esa-vmware-cloud-aws-technical-deep-dive).

To facilitate logical separation between customer-managed and VMware-managed virtual machines, a single vSAN datastore is represented as two logical datastores. Only workload datastores are available for customer workloads. Both logical datastores share the same physical capacity and throughput.

Storage encryption

In VMware Cloud on AWS, vSAN encrypts all user data at rest. Encryption is activated by default on every cluster deployed in your SDDC and cannot be disabled. Additionally, with newer host types (i3en and i4i), vSAN traffic between hosts is encrypted as well (so-called data-in-transit encryption).

Figure 1.18 – vSAN cluster configuration and shared responsibility model

Storage policies

vSAN has been designed from the beginning to support major enterprise storage features. However, there is an important architectural difference between external storage and vSAN. With external storage, you enable storage features at the level of a physical Logical Unit Number (LUN), and each LUN is connected as a separate datastore with its own performance and availability characteristics. With vSAN, all of these features are activated at the level of a virtual machine or even an individual VMDK. To control performance and data availability for your workload, you assign different storage policies. vSAN storage policy management and policy monitoring are done from vCenter using the vSphere Client. Customers control their configurations through virtual machine (VM) storage policies, an approach known as storage policy-based management (SPBM).

Each virtual machine or disk has a policy assigned to it. The policy includes, among other things, disk RAID and fault tolerance parameters, as well as additional settings such as disk stripes, I/O SLA, and encryption.
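To make the per-VM and per-VMDK granularity concrete, here is a minimal Python sketch. It is a simplified model for illustration only and does not use any VMware API: the `StoragePolicy` and `VirtualMachine` classes, the policy names, and the disk labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class StoragePolicy:
    """Hypothetical, simplified view of a VM storage policy."""
    name: str
    ftt: int   # Failures to Tolerate
    raid: str  # for example "RAID-1" or "RAID-5"


@dataclass
class VirtualMachine:
    """Hypothetical VM model: one VM-level policy plus optional per-disk overrides."""
    name: str
    vm_policy: StoragePolicy
    disk_policies: Dict[str, StoragePolicy] = field(default_factory=dict)

    def effective_policy(self, disk: str) -> StoragePolicy:
        """Return the disk's own policy if one is assigned, else the VM-level policy."""
        return self.disk_policies.get(disk, self.vm_policy)


# Illustrative policies (names are made up for this example).
mirror = StoragePolicy("mirror-ftt1", ftt=1, raid="RAID-1")
erasure = StoragePolicy("erasure-ftt1", ftt=1, raid="RAID-5")

# The VM defaults to mirroring, while one VMDK is switched to erasure coding.
vm = VirtualMachine("db01", vm_policy=mirror, disk_policies={"Hard disk 2": erasure})

print(vm.effective_policy("Hard disk 1").name)  # mirror-ftt1
print(vm.effective_policy("Hard disk 2").name)  # erasure-ftt1
```

The point of the model is that the policy travels with the object (VM or VMDK) rather than with the datastore, which is the difference from LUN-level configuration described above.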

In the following diagram, you can see a graphical summary of the SPBM values:

Figure 1.19 – SPBM configuration graphical summary

Let’s go deeper and review the configuration available with vSAN storage policies.

Failures to Tolerate

Failures to Tolerate (FTT) defines the number of disk device or host failures a virtual machine can tolerate within a cluster. You can choose between 0 (no protection) and 3 (a cluster can tolerate a simultaneous failure of up to three hosts without affecting virtual machine workloads).

Note

VMware Cloud on AWS SLAs dictate a minimum FTT configuration to be eligible for SLA credit.

FTT is configured together with the appropriate Redundant Array of Independent Disks (RAID) policy according to the number of hosts in the cluster. Customers can choose a RAID policy optimized for either performance (mirroring) or capacity (erasure coding).

The FTT policy overhead per storage object depends on the selected FTT and RAID policy. For example, an FTT1 and RAID-1 policy creates two copies of a storage object. FTT2 and RAID-1 creates three copies of the storage object, and FTT3 and RAID-1 creates four copies of the storage object.
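The mirroring arithmetic above can be sketched in a few lines of Python. This is a rough estimate under stated assumptions: it counts only the full data copies named in the text (FTT + 1 for RAID-1) and ignores witness components, metadata, and any deduplication or compression savings. The function names are illustrative.

```python
def raid1_copies(ftt: int) -> int:
    """Number of full copies RAID-1 keeps for a given FTT (FTT1 -> 2, FTT2 -> 3, FTT3 -> 4)."""
    if not 0 <= ftt <= 3:
        raise ValueError("FTT must be between 0 and 3")
    return ftt + 1


def raid1_raw_capacity_gb(usable_gb: float, ftt: int) -> float:
    """Approximate raw capacity consumed by a mirrored object of the given size."""
    return usable_gb * raid1_copies(ftt)


# A 100 GB object under each mirroring level.
for ftt in (1, 2, 3):
    print(f"FTT{ftt}: {raid1_copies(ftt)} copies, {raid1_raw_capacity_gb(100, ftt):.0f} GB raw")
# FTT1: 2 copies, 200 GB raw
# FTT2: 3 copies, 300 GB raw
# FTT3: 4 copies, 400 GB raw
```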

The following list describes the trade-offs between different RAID options:

RAID-1 uses Mirroring and requires more disk space overhead, but gives better write I/O performance, can survive a single host failure with FTT1, and is available from two hosts and above. RAID-1 can also be combined with FTT2 (five hosts and above), which in clusters of six or more hosts is an alternative to RAID-6 for improved write I/O performance, or with FTT3 (seven hosts and above).
RAID-5 uses Erasure Coding and has less disk space overhead at the cost of lower write performance; it can survive a single host failure and is available from four hosts and above.
RAID-6 is similar to RAID-5; however, it can survive two host failures (FTT2). This RAID type is available starting with six hosts.
RAID-0 (No Data Redundancy) uses no extra disk overhead and provides potentially the best performance (eliminating overhead to create redundant copies of the data), but cannot survive any failure.

Note

The RAID-0 policy is configurable but not eligible for SLA. If the host or disk fails, this can result in data loss.

The following table summarizes the RAID configuration, FTT policy options, and the minimal host count:

RAID Configuration	FTT Policy	Hosts Required
RAID-0 (No Data Redundancy)	0	1+
RAID-1 (Mirroring)	1	2+
RAID-5 (Erasure Coding)	1	4+
RAID-1 (Mirroring)	2	5+
RAID-6 (Erasure Coding)	2	6+
RAID-1 (Mirroring)	3	7+

Table 1.2 – Table summary of the RAID, FTT, and minimal host count options
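As an illustration, Table 1.2 can be encoded as a small lookup that answers which RAID and FTT combinations a cluster of a given size can use. The values are copied from the table; the `PolicyOption` type and function names are hypothetical and not part of any VMware tooling.

```python
from typing import List, NamedTuple


class PolicyOption(NamedTuple):
    raid: str
    ftt: int
    min_hosts: int


# Rows of Table 1.2: RAID configuration, FTT policy, minimal host count.
TABLE_1_2 = [
    PolicyOption("RAID-0 (No Data Redundancy)", 0, 1),
    PolicyOption("RAID-1 (Mirroring)", 1, 2),
    PolicyOption("RAID-5 (Erasure Coding)", 1, 4),
    PolicyOption("RAID-1 (Mirroring)", 2, 5),
    PolicyOption("RAID-6 (Erasure Coding)", 2, 6),
    PolicyOption("RAID-1 (Mirroring)", 3, 7),
]


def available_options(host_count: int) -> List[PolicyOption]:
    """Return every table row whose minimal host count the cluster satisfies."""
    if host_count < 1:
        raise ValueError("a cluster needs at least one host")
    return [opt for opt in TABLE_1_2 if host_count >= opt.min_hosts]


def max_ftt(host_count: int) -> int:
    """Highest FTT value the cluster size allows according to the table."""
    return max(opt.ftt for opt in available_options(host_count))


for opt in available_options(4):
    print(f"{opt.raid}, FTT{opt.ftt}")
# RAID-0 (No Data Redundancy), FTT0
# RAID-1 (Mirroring), FTT1
# RAID-5 (Erasure Coding), FTT1
```

Note that this only answers what is configurable; whether a given choice is SLA-eligible is a separate question.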

Managed Storage Policy

VMware Cloud on AWS is provided as a service to customers. As a part of the service agreement, VMware commits to SLAs for customers using the service, including a VM uptime guarantee. Depending on the SDDC configuration, the SDDC is eligible for either a 99.9% (standard cluster) or 99.99% (stretched cluster with 6+ hosts) uptime availability commitment. To back these SLA commitments, VMware requires a certain level of FTT configuration for customer workloads.

The following figure describes VM storage policies required for SLA-eligible workloads:

Figure 1.20 – Table of storage policy configuration for SLA eligibility

Information

A single-host SDDC is not covered by an SLA and should not be used for production workloads.

VMware Cloud on AWS introduced a new concept called Managed Storage Policy Profiles to help customers adhere to SLAs. Each vSphere cluster has a default storage policy managed by VMware and configured to adhere to the SLA requirements (e.g., FTT1, RAID-1 in a cluster with 3 ESXi hosts). As the cluster size changes, the Managed Storage Policy Profile is updated with the appropriate RAID and FTT configuration.

Customers can create their own policies, based on their needs, which may differ from the managed policies. In that case, it is the customer’s responsibility to adjust the policy parameters to meet the SLA compliance requirements. Customers can also configure values that are not aligned with the SLA parameters, for instance, RAID-1 FTT1 in a six-host cluster. They will then receive notifications that they have non-SLA-compliant storage objects in a cluster in their SDDC, and such a cluster is not entitled to receive any SLA credits. As in the following example, customers using non-compliant policies receive periodic email notifications about it:

Figure 1.21 – Screenshot of a non-SLA-compliant storage object email notification

The Managed Storage Policy Profile offers customers an easy way to deploy workloads with SLA-compliant storage policies without overthinking the current cluster configuration.
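The compliance idea can be sketched as a simple check of a policy's FTT against the cluster size. The thresholds below are an assumption made for illustration: they are chosen only to be consistent with the two examples in the text (FTT1 is acceptable in a three-host cluster, FTT1 is non-compliant in a six-host cluster) and they ignore stretched clusters. The authoritative values are those in Figure 1.20 and the VMware Cloud on AWS SLA document.

```python
from typing import Optional

# ASSUMPTION: illustrative thresholds, not taken from the SLA document.
# Each entry is (minimum host count, minimum FTT), ordered from largest cluster down.
ASSUMED_MIN_FTT = [(6, 2), (2, 1)]


def required_ftt(host_count: int) -> Optional[int]:
    """Return the assumed minimum FTT for a standard cluster, or None if no SLA applies."""
    for min_hosts, ftt in ASSUMED_MIN_FTT:
        if host_count >= min_hosts:
            return ftt
    return None  # a single-host SDDC is not covered by an SLA


def is_sla_compliant(host_count: int, policy_ftt: int) -> bool:
    """True when the policy's FTT meets the assumed minimum for this cluster size."""
    needed = required_ftt(host_count)
    return needed is not None and policy_ftt >= needed


print(is_sla_compliant(3, 1))  # True
print(is_sla_compliant(6, 1))  # False
print(is_sla_compliant(6, 2))  # True
```

A managed profile effectively performs this re-evaluation for you whenever the cluster grows or shrinks, whereas a custom policy has to be revisited by hand.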

Note

Do not modify the default storage policy in your SDDC. Your changes will be overwritten by the next invocation of SDDC monitoring, and because vSAN then has to reapply the policy within a short period of time, your workloads might incur a performance penalty. Instead, create a custom policy and assign it only to a subset of your virtual machines.


Authors (3)

Oleg Ulyanov

Oleg Ulyanov is a Staff Cloud Architect with more than 15 years of experience. He is a Subject Matter Expert in VMware Hybrid Cloud, cloud migration, networking, and storage. He has experience as a VMware professional services architect, helping customers achieve their technical and business goals through IT transformation and migrating to VMware Hybrid Clouds. He holds various industry certificates, including VMware VCP, VCAP6/7-DCV, SNIA, and Microsoft.

Michael Schwartzman

Michael Schwartzman, a Senior Azure Application Innovation Specialist at Microsoft, has over a decade of experience in cloud infrastructure, cloud security, and hybrid cloud solutions. Prior to his current role, Michael served as a Lead Cloud Solution Architect specializing in VMware Cloud on AWS. He has played a pivotal role in assisting Global ISVs with the development and sale of SaaS solutions on Azure. Additionally, Michael's broad expertise encompasses supporting both digital natives and traditional enterprises in optimizing their cloud systems. His dedication to remaining at the forefront of the rapidly evolving tech landscape establishes him as a go-to expert for businesses seeking to leverage cutting-edge cloud technology.

Harsha Sanku

Harsha Sanku is a Solutions Architect at Amazon Web Services, specializing in AWS Hybrid Cloud and Edge Computing services. His expertise lies in Cloud Infrastructure including Networking & Security. He has been a VMware Cloud on AWS Specialist for the last four years. Harsha has a strong background in designing and implementing data center infrastructure and private clouds, with a particular focus on VMware technologies. In his current role at AWS, he collaborates with customers to migrate and modernize their hybrid cloud infrastructure, ensuring they remain competitive in the ever-evolving business and IT landscape.
