Troubleshooting vSphere Storage

By Mike Preston

About this book

Virtualization has created a new role within IT departments everywhere: the vSphere administrator. vSphere administrators have long managed more than just the hypervisor and have quickly had to adapt to become a ‘jack of all trades’ in their organizations. More and more tier 1 workloads are being virtualized, making the infrastructure underneath them all the more important. Because of this, and because of the holistic nature of vSphere, administrators need to know what to do when problems occur.

This practical, easy-to-understand guide will give the vSphere administrator the knowledge and skill set they need in order to identify, troubleshoot, and solve issues that relate to storage visibility, storage performance, and storage capacity in a vSphere environment.

This book will first give you the fundamental background knowledge of storage and virtualization. From there, you will explore the tools and techniques that you can use to troubleshoot common storage issues in today’s data centers.


You will learn the steps to take when storage seems slow, or when there is limited availability of storage. The book covers the most common storage transports, such as Fibre Channel, iSCSI, and NFS, and explains what to do when you can’t see your storage, where to look when your storage is experiencing performance issues, and how to react when you reach capacity. You will also learn about the tools that ESXi contains to help you with this, and how to identify key issues within the many vSphere logfiles.

Publication date:
November 2013
Publisher
Packt
Pages
150
ISBN
9781782172062

 

Chapter 1. Understanding vSphere Storage Concepts and Methodologies

Before jumping into the details of how to troubleshoot vSphere Storage, it's best to understand the basics of how storage operates in a virtualized environment. On the whole, ESXi is a very user-friendly, easy-to-use hypervisor. However, when we look at it in terms of troubleshooting and storage, there are a lot of complex scenarios and key pieces of information that we need to know in order to resolve issues as they occur.

This chapter will help us to better understand the fundamentals of how vSphere and ESXi attach to and utilize various types of storage and show us how we can identify our datastores, storage paths, and LUNs within our environment. We will also learn about the Pluggable Storage Architecture (PSA) and take a broader look at how an application running in a virtual machine accesses storage.

The topics that we'll be covering in this chapter are:

  • Storage virtualization

  • Supported filesystems

  • Storage naming

  • The vSphere Pluggable Storage Architecture

  • An I/O request—from start to finish

 

Storage virtualization


ESXi presents its storage to a VM using host-level storage virtualization techniques, which essentially provide an abstraction layer between the actual physical storage (whether that is attached via a Storage Area Network (SAN), attached via an Ethernet network, or locally installed) and the virtual machines consuming the storage. This abstraction layer consists of many different components, all working together to simulate a physical disk inside a virtual machine.

When a virtual machine is created, it will normally have at least one virtual disk assigned to it. When a virtual disk is assigned to a VM, a piece of virtual hardware called a virtual storage adapter is created in order to facilitate the communication between the VM and its underlying virtual hard disk (vmdk). The type of virtual storage adapter that is used depends largely on the Guest Operating System setting that has been chosen for that specific VM (see the following table). This newly created SCSI adapter provides the interface between the guest OS and the VMkernel module on the ESXi host. The VMkernel module then locates the target file within the volume, maps the blocks from the virtual disk to the physical device, forwards the request through the Pluggable Storage Architecture, and finally queues the request to the appropriate adapter on the ESXi host depending on the type of storage present (iSCSI NIC/hardware initiator, Fibre Channel Host Bus Adapter (FC HBA), NFS NIC, or Fibre Channel over Ethernet (FCoE) NIC/CNA).

The following table outlines the various virtual SCSI adapters available:

| Virtual SCSI adapter | Supported VM hardware version | Description | OS support |
|---|---|---|---|
| BusLogic Parallel | 4, 7, 8, 9, 10 | Emulates the BusLogic Parallel SCSI adapter. Mainly available for older operating systems. | Default for most Linux operating systems. |
| LSI Logic Parallel | 4, 7, 8, 9, 10 | Emulates the LSI Logic Parallel SCSI adapter. Supported by most new operating systems. | Default for Windows 2003/2003 R2. |
| LSI Logic SAS | 7, 8, 9, 10 | Emulates the LSI Logic SAS adapter. Supported on most new operating systems. | Default for Windows 2008/2008 R2/2012. |
| VMware Paravirtual SCSI (PVSCSI) | 7, 8, 9, 10 | Purposely built to provide high throughput with a lower CPU overhead. Supported on select newer operating systems. | No defaults, but is supported with Windows 2003+, SUSE 11+, Ubuntu 10.04+, and RHEL6+. |

 

Supported filesystems


VMware ESXi supports two filesystems for use as virtual machine storage: Virtual Machine File System (VMFS) and Network File System (NFS).

VMFS

One of the most common ESXi storage configurations utilizes a purpose-built, high-performance clustered filesystem called VMFS. VMFS is a distributed storage architecture that facilitates concurrent read and write access from multiple ESXi hosts. Any supported SCSI-based block device, whether it is local, Fibre Channel, or network attached may be formatted as a VMFS datastore. See the following table for more information on the various vSphere supported storage protocols.

NFS

NFS, like VMFS, is also a distributed file system and has been around for nearly 20 years. NFS, however, is strictly network attached and utilizes Remote Procedure Call (RPC) in order to access remote files just as if they were stored locally. vSphere, as it stands today, supports NFSv3 over TCP/IP, allowing the ESXi host to mount the NFS volume and use it for any storage needs, including storage for virtual machines. NFS does not contain a VMFS partition. When utilizing NFS, the NAS storage array handles the underlying filesystem and exports shares, which ESXi simply attaches to as mount points.

Raw disk

Although not technically a filesystem, vSphere also supports storing virtual machine guest files on a raw disk. This is configured by selecting Raw Device Mapping when adding a new virtual disk to a VM. In general, this allows a guest OS to utilize its preferred filesystem directly on the SAN. A Raw Device Mapping (RDM) may be mounted in a couple of different compatibility modes: physical or virtual. In physical mode, all commands except for REPORT LUNS are sent directly to the storage device. REPORT LUNS is masked in order to allow the VMkernel to isolate the LUN from the virtual machine. In virtual mode, only read and write commands are sent directly to the storage device while the VMkernel handles all other commands from the virtual machine. Virtual mode allows you to take advantage of many of vSphere's features such as file locking and snapshotting whereas physical mode does not.
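The difference between the two RDM compatibility modes can be sketched as a simple routing rule. The following Python is only an illustration of the behavior described above; the names RdmMode and handled_by are hypothetical, and the command strings stand in for real SCSI opcodes.

```python
from enum import Enum


class RdmMode(Enum):
    PHYSICAL = "physical"
    VIRTUAL = "virtual"


# Illustrative command names only; real SCSI traffic uses numeric opcodes.
READ_WRITE_COMMANDS = {"READ", "WRITE"}


def handled_by(mode: RdmMode, command: str) -> str:
    """Return 'device' if the command is passed straight to the storage
    device, or 'vmkernel' if the VMkernel handles it instead."""
    if mode is RdmMode.PHYSICAL:
        # Physical mode: everything is passed through except REPORT LUNS,
        # which is masked so the VMkernel can isolate the LUN.
        return "vmkernel" if command == "REPORT LUNS" else "device"
    # Virtual mode: only reads and writes go directly to the device.
    return "device" if command in READ_WRITE_COMMANDS else "vmkernel"


print(handled_by(RdmMode.PHYSICAL, "INQUIRY"))  # device
print(handled_by(RdmMode.VIRTUAL, "INQUIRY"))   # vmkernel
```

Because the VMkernel sits in the path of every non-read/write command in virtual mode, it is in a position to offer features such as file locking and snapshots there.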

The following table explains the supported storage connections in vSphere:

 

| | Fibre Channel | FCoE | iSCSI | NFS |
|---|---|---|---|---|
| Description | Remote blocks are accessed by encapsulating SCSI commands and data into FC frames and transmitted over the FC network. | Remote blocks are accessed by encapsulating SCSI commands and data into Ethernet frames. FCoE contains many of the same characteristics as Fibre Channel except for Ethernet transport. | Remote blocks are accessed by encapsulating SCSI commands and data into TCP/IP packets and transmitted over the Ethernet network. | ESXi hosts access metadata and files located on the NFS server by utilizing file devices that are presented over a network. |
| Filesystem support | VMFS (block) | VMFS (block) | VMFS (block) | NFS (file) |
| Interface | Requires a dedicated Host Bus Adapter (HBA). | Requires either a hardware converged network adapter or NIC that supports FCoE capabilities in conjunction with the built-in software FCoE initiator. | Requires either a dependent or independent hardware iSCSI initiator or a NIC with iSCSI capabilities utilizing the built-in software iSCSI initiator and a VMkernel port. | Requires a NIC and the use of a VMkernel port. |
| Load Balancing/Failover | Uses VMware's Pluggable Storage Architecture to provide standard path selections and failover mechanisms. | Uses VMware's Pluggable Storage Architecture to provide standard path selections and failover mechanisms. | Utilizes VMware's Pluggable Storage Architecture as well as the built-in iSCSI binding functionality. | Due to the nature of NFS implementing a single session, there is no load balancing available. Aggregate bandwidth can be achieved by manually accessing the NFS server across different paths. Failover can be configured only in an active/standby type configuration. |
| Security | Utilizes zoning between the hosts and the FC targets to isolate storage devices from hosts. | Utilizes zoning between the hosts and the FC targets to isolate storage devices from hosts. | Utilizes Challenge Handshake Authentication Protocol (CHAP) to allow different hosts to see different LUNs. | Depends on the NFS storage device. Most implement an access control list (ACL) type deployment to allow hosts to see certain NFS exports. |

 

Storage naming


In order to begin troubleshooting vSphere Storage, we need to be aware of how vSphere identifies and names the storage devices, LUNs, and paths available to our hosts. While troubleshooting vSphere Storage, there are many situations where we need to provide the identifier of a storage device or path in order to obtain more information about the issue. Because these identifiers are unique, ESXi will often use them when logging issues to syslog.

Viewing device identifiers

We are able to view device identifiers in two places: within the vSphere Client and within the ESXi Shell. Let us have a look at each in turn.

Within the vSphere Client

We can view the device identifiers within the vSphere Client by performing the following steps:

  1. Click on the Configuration tab of the host whose storage you wish to view.

  2. Click on the Storage section under Hardware.

  3. Switch to the Devices view and right-click on the header bar to add and remove desired columns if needed.

    Device identifiers from the vSphere Client

Within ESXi Shell

The following command returns information similar to what we see in the vSphere Client; its output should resemble the following screenshot:

esxcfg-scsidevs -c

Device identifiers from within the vSphere CLI

The many ways vSphere identifies storage

As the previous two screenshots show, vSphere uses three different kinds of names for storage: friendly names, identifiers, and runtime names.

Friendly names

Friendly names are generated by the host and can be modified and defined by the administrator.

Identifiers

Identifiers are not user definable because they must be unique and must persist across host reboots. Identifiers are displayed in one of several formats, depending on the storage subsystem presenting the device. In the previous two screenshots, you can see that a variety of identifiers are used.

NAA identifiers

The large majority of storage devices return NAA identifiers, which all begin with "naa.". An NAA identifier is often compared to a MAC address on a NIC, as it is defined by certain standards and is always unique to the device being presented.

T10 identifiers

Another type of identifier shown is called a T10 identifier and always begins with "t10.". Normally, T10 identifiers are associated with an iSCSI array; however, they can be returned from any SCSI device. T10 identifiers are also governed by standards and, like NAA identifiers, should always be unique.

IQN identifiers

Another identifier type, used solely on iSCSI arrays, is the iSCSI Qualified Name (IQN). IQNs are normally user configurable on the iSCSI array, so uniqueness on a global scale is not guaranteed, but we should always ensure that they are unique within our environment. IQNs always begin with "iqn." and, just like NAA and T10 identifiers, must be persistent across reboots. Even if your iSCSI array is using IQNs, there are times when it will return a T10 identifier, or a mixture of T10 and IQN identifiers.

MPX identifiers

The last type of identifier we can see in the previous two screenshots is an MPX identifier. MPX (VMware Multipath X Device) identifiers are generated by the ESXi host when the device does not return an NAA, T10, or IQN identifier, and always begin with "mpx.". Unlike the other industry standard identifiers, MPX identifiers are not globally unique and are not persistent across reboots. Normally, MPX identifiers are only seen on devices such as a CD or DVD-ROM drive, as these usually do not respond with any industry standard identifier.
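Since each identifier type is recognizable by its prefix, sorting a list of identifiers pulled from a log is straightforward. The sketch below assumes only the prefixes described above; the function names are hypothetical and the MPX sample value is made up for illustration.

```python
# Prefixes as described in the text; anything else is reported as unknown.
PREFIXES = {
    "naa.": "NAA",
    "t10.": "T10",
    "iqn.": "IQN",
    "mpx.": "MPX",
}

# Types the text describes as persistent across reboots.
PERSISTENT_TYPES = {"NAA", "T10", "IQN"}


def classify_identifier(identifier: str) -> str:
    """Return the identifier type based on its prefix."""
    for prefix, kind in PREFIXES.items():
        if identifier.startswith(prefix):
            return kind
    return "unknown"


def is_persistent(identifier: str) -> bool:
    """True if the identifier type should survive a host reboot."""
    return classify_identifier(identifier) in PERSISTENT_TYPES


for sample in ("naa.600508b4000e21340001400000260000", "mpx.vmhba32:C0:T0:L0"):
    print(sample, classify_identifier(sample), is_persistent(sample))
```

A check like is_persistent is a quick way to confirm that an identifier copied from one troubleshooting session can safely be reused after a reboot.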

Runtime names

Runtime names basically describe the first path to the device as assigned by the host. Although these usually don't change, there is no guarantee that they will persist across reboots since we cannot guarantee that a certain path to a storage device will always be active. Runtime names are constructed using the format shown in the following table:

| Format | Explanation |
|---|---|
| vmhba(N) | N is the number of the physical storage adapter in the ESXi host. |
| C(N) | The channel number. |
| T(N) | The target number as decided by the ESXi host. These are not guaranteed to be unique between hosts nor are they persistent across reboots. |
| L(N) | The LUN number as defined by the storage system. |

As you can conclude from the preceding table, the device described in the previous two screenshots with the identifier naa.600508b4000e21340001400000260000 exists on vmhba1, channel 0, target 0, and LUN 8, and therefore has a runtime name of vmhba1:C0:T0:L8.

Since friendly names are user definable and runtime names are not persistent across reboots or rescans, we will normally use the NAA, T10, or IQN identifier when accessing and troubleshooting storage. It's the only form of storage naming that provides us the persistence and uniqueness that we need to ensure we are dealing with the proper datastore or path.
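When runtime names show up in logs, it can help to split them into their components programmatically. The following is a minimal sketch that assumes the vmhbaN:CN:TN:LN format described above; RuntimeName and parse_runtime_name are hypothetical names.

```python
import re
from typing import NamedTuple


class RuntimeName(NamedTuple):
    adapter: int  # N in vmhba(N)
    channel: int  # N in C(N)
    target: int   # N in T(N)
    lun: int      # N in L(N)


_RUNTIME_NAME = re.compile(r"^vmhba(\d+):C(\d+):T(\d+):L(\d+)$")


def parse_runtime_name(name: str) -> RuntimeName:
    """Split a runtime name such as vmhba1:C0:T0:L8 into its four numbers."""
    match = _RUNTIME_NAME.match(name)
    if match is None:
        raise ValueError(f"not a runtime name: {name!r}")
    return RuntimeName(*(int(group) for group in match.groups()))


print(parse_runtime_name("vmhba1:C0:T0:L8"))
# RuntimeName(adapter=1, channel=0, target=0, lun=8)
```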

 

The vSphere Pluggable Storage Architecture


The vSphere Pluggable Storage Architecture is essentially a collection of plugins that reside inside the VMkernel layer of an ESXi host. The following figure shows a graphical representation of all the components of the PSA. The top-level plugin in the PSA is the Multipathing Plugin (MPP). The MPP defines how vSphere will manage and access storage, including load balancing, path selection, and failover. The MPP itself can be provided by the storage vendor (for example, EMC PowerPath), or you may use the VMware provided Native Multipathing Plugin (NMP).

So essentially, the VMware provided NMP is in itself an MPP. The NMP is loaded by default for all storage devices; however, it can be overridden and replaced by installing a third-party MPP. Within each MPP, including the VMware NMP, are two subplugins: the Storage Array Type Plugin (SATP) and the Path Selection Plugin (PSP). The SATP handles the details of path failover, whereas the PSP handles the details of load balancing and decides which physical path to use to issue an I/O request.

The VMware Pluggable Storage Architecture

Confused yet? I know the PSA is a lot to take in but it is essential to understand when you are troubleshooting storage issues. Let's have a look at each individual plugin included in the default VMware NMP in a little more detail to better understand the role it plays.

Pluggable Storage Architecture (PSA) roles and commands

The PSA performs two essential tasks as it pertains to storage:

  • Discover which storage devices are available on a host

  • Assign predefined claim rules associated with an MPP to take control of the storage device. Claim rules are explained in more detail in Chapter 3, Troubleshooting Storage Visibility.

In order to view a list of the PSA plugins, we use the storage core namespace of the esxcli command:

esxcli storage core plugin list

Multipathing Plugin – the VMware Native Multipathing Plugin roles and commands

The NMP/MPP performs the following functions:

  • Claims the physical paths to a storage device and associates them with an SATP

  • Comes with its own set of claim rules that associate a given SATP with a PSP

  • Exports a logical device that is backed by the physical path chosen by the PSP

To list devices controlled by the NMP with their respective SATP and PSP information, use the storage nmp namespace of esxcli, as outlined:

esxcli storage nmp device list

Storage Array Type Plugin roles and commands

The SATP plugin, which is a subplugin of the overall MPP, performs the following functions:

  • Monitors the state of paths to the physical storage system

  • Determines when a physical path is to be declared failed or down

  • Handles the switching of physical paths after a path failure has occurred

  • VMware provides a number of SATPs for specific supported storage arrays, along with generic active-active and active-passive SATPs for unknown storage arrays

To list the currently loaded SATP plugins along with their default PSP information, use the storage nmp satp namespace of esxcli:
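The SATP's job of tracking path state and switching paths after a failure can be sketched as a small state machine. This is a simplified model written for illustration, not VMware's implementation; SatpSketch, PathState, and the rule of promoting the first standby path are assumptions.

```python
from enum import Enum
from typing import Dict, List


class PathState(Enum):
    ACTIVE = "active"
    STANDBY = "standby"
    DEAD = "dead"


class SatpSketch:
    """Toy model of an SATP: tracks path states and handles failover."""

    def __init__(self, paths: Dict[str, PathState]) -> None:
        self.paths = dict(paths)

    def active_paths(self) -> List[str]:
        return [name for name, state in self.paths.items()
                if state is PathState.ACTIVE]

    def report_failure(self, name: str) -> None:
        """Declare a path failed; if no active path remains,
        promote the first standby path (assumed policy)."""
        self.paths[name] = PathState.DEAD
        if not self.active_paths():
            for candidate, state in self.paths.items():
                if state is PathState.STANDBY:
                    self.paths[candidate] = PathState.ACTIVE
                    break


satp = SatpSketch({
    "vmhba1:C0:T0:L8": PathState.ACTIVE,
    "vmhba2:C0:T0:L8": PathState.STANDBY,
})
satp.report_failure("vmhba1:C0:T0:L8")
print(satp.active_paths())  # ['vmhba2:C0:T0:L8']
```

Note that the model never chooses which active path carries a given I/O; that decision belongs to the PSP.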

esxcli storage nmp satp list

To change the default PSP associated with a given SATP, you can use the esxcli storage nmp satp set -b <boottime> -P <Default PSP> -s <SATP> command, similar to the one shown in the following screenshot:

Associate a default PSP with a SATP via esxcli
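If you script this change, it is worth validating the PSP name before the command ever reaches a host. The helper below is a hedged sketch: it only builds the argument list for the command shown above using the -P and -s options, leaves out the -b option, and checks the PSP against the three VMware-provided policy names. The function name is hypothetical, and VMW_SATP_DEFAULT_AA is used purely as a sample SATP.

```python
from typing import List

# The three PSPs that VMware provides by default.
DEFAULT_PSPS = {"VMW_PSP_FIXED", "VMW_PSP_MRU", "VMW_PSP_RR"}


def satp_set_default_psp_argv(satp: str, psp: str) -> List[str]:
    """Build the argv for 'esxcli storage nmp satp set -P <psp> -s <satp>'.

    Raises ValueError for a PSP that is not one of the VMware defaults;
    a third-party PSP would need to be added to DEFAULT_PSPS first.
    """
    if psp not in DEFAULT_PSPS:
        raise ValueError(f"unknown PSP: {psp!r}")
    return ["esxcli", "storage", "nmp", "satp", "set", "-P", psp, "-s", satp]


print(" ".join(satp_set_default_psp_argv("VMW_SATP_DEFAULT_AA", "VMW_PSP_RR")))
# esxcli storage nmp satp set -P VMW_PSP_RR -s VMW_SATP_DEFAULT_AA
```

Returning a list rather than a single string keeps the result safe to hand to a process launcher without shell quoting concerns.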

Path Selection Plugin roles and commands

The PSP, which is a subplugin of the overall MPP, provides the PSA with the following functionality:

  • Responsible for choosing a path to issue an I/O request.

  • Unlike the SATP, which determines whether paths are active, standby, or failed, the PSP is a load balancing mechanism and deals only with active paths.

  • VMware provides three default PSP plugins: Fixed, Most Recently Used, and Round Robin.

  • The VMware NMP will select a default PSP based on which SATP plugin has been loaded for the storage array.

To list all of the available PSP plugins, you can use the storage nmp psp namespace of esxcli as shown:

esxcli storage nmp psp list

More information in regards to each of the default policies that VMware provides is listed in the following table:

| Policy | Explanation | Use |
|---|---|---|
| VMW_PSP_FIXED (Fixed) | Host uses a designated preferred path if configured; otherwise it uses the first available path at boot time. The host will fail over to other paths if the preferred path is down and will return to the initial preferred path when the connection is restored. | Default policy for most active-active arrays. |
| VMW_PSP_MRU (Most Recently Used) | Host will select the path that was used most recently. Upon failover, the host will move to another path. When the connection is restored, it will not revert back to the initial path. | Default policy for most active-passive arrays. |
| VMW_PSP_RR (Round Robin) | The host will cycle I/O through all active paths on active-passive arrays and all paths on active-active arrays. | Default for a number of active-active and active-passive arrays. |
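The behavioral difference between the three policies is easiest to see when a path fails and then recovers. The following Python is a simplified model for illustration only, not VMware's code; the class names are hypothetical, and the round robin sketch rotates on every request, which is a simplification.

```python
from typing import Iterable, List, Optional, Set


def _first_alive(paths: List[str], alive: Set[str]) -> str:
    for path in paths:
        if path in alive:
            return path
    raise RuntimeError("no path available")


class FixedPsp:
    """Prefers one path and returns to it once it is available again."""

    def __init__(self, paths: Iterable[str], preferred: Optional[str] = None):
        self.paths = list(paths)
        self.preferred = preferred if preferred is not None else self.paths[0]

    def select(self, alive: Set[str]) -> str:
        if self.preferred in alive:
            return self.preferred
        return _first_alive(self.paths, alive)


class MruPsp:
    """Stays on the most recently used path; never fails back on its own."""

    def __init__(self, paths: Iterable[str]):
        self.paths = list(paths)
        self.current = self.paths[0]

    def select(self, alive: Set[str]) -> str:
        if self.current not in alive:
            self.current = _first_alive(self.paths, alive)
        return self.current


class RoundRobinPsp:
    """Rotates through every path that is currently alive."""

    def __init__(self, paths: Iterable[str]):
        self.paths = list(paths)
        self._next = 0

    def select(self, alive: Set[str]) -> str:
        for _ in range(len(self.paths)):
            candidate = self.paths[self._next]
            self._next = (self._next + 1) % len(self.paths)
            if candidate in alive:
                return candidate
        raise RuntimeError("no path available")


paths = ["A", "B"]
# Path A fails, then comes back: Fixed returns to A, MRU stays on B.
for psp in (FixedPsp(paths, preferred="A"), MruPsp(paths)):
    print(type(psp).__name__,
          [psp.select(alive) for alive in ({"A", "B"}, {"B"}, {"A", "B"})])
```

In this model the alive set plays the role of the SATP, which tells the PSP which paths it is allowed to choose from.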

To list the configuration of a PSP on a particular device, you can use the esxcli storage nmp psp <PSP Namespace> deviceconfig get -d <device identifier> command, similar to the one shown in the following screenshot. Conversely, you can set certain parameters by replacing get with set.

Retrieving configuration from a device's PSP via esxcli

Although we have used the ESXi Shell to obtain all of the information mentioned previously, we should note that it is possible to retrieve and change some of the information from within the vSphere Client as well. Most of these operations are done in the Storage section of the Configuration tab of a host.

 

An I/O request – from start to finish


Now that we have a general understanding of how ESXi presents storage to a virtual machine and handles load balancing and failover, let's have a look at an I/O request from start to finish. The following figure shows a graphical representation of the following steps:

  • The VM issues a SCSI request to its respective virtual disk.

  • Drivers from within the guest OS communicate with the virtual storage adapters.

  • The virtual storage adapter forwards the command to the VMkernel where the PSA takes over.

    • The PSA loads the specific MPP (in our case the NMP) depending on the logical device holding the virtual machine's disk.

    • The NMP calls the associated PSP for the logical device

    • The PSP selects the appropriate path to send the I/O down while taking into consideration any load balancing techniques. The I/O is then queued to the hardware/software initiator, CNA, or HBA depending on the storage transport being used.

    • If the previous step fails, the NMP calls the appropriate SATP to process error codes and mark paths inactive or failed, and then the previous step is repeated.

  • The hardware/software initiator, CNA, or FC HBA transforms the I/O request into the proper form depending on the storage transport (iSCSI, FC, or FCoE) and sends the request as per the PSA's instructions.

    I/O flow from start to finish
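The select, send, and retry loop in the steps above can be sketched in a few lines. This is an illustrative model only: dispatch_io, the string path states, and the send callback standing in for the initiator or HBA are all assumptions, not part of any VMware interface.

```python
from typing import Callable, Dict, List


def dispatch_io(
    paths: Dict[str, str],
    select_path: Callable[[List[str]], str],
    send: Callable[[str], bool],
) -> str:
    """Send one I/O, retrying on another path after a failure.

    paths maps a path name to 'active' or 'dead'.
    select_path plays the PSP: it picks one of the active paths.
    send plays the initiator/HBA: it returns True if the I/O got through.
    Returns the name of the path that carried the I/O.
    """
    while True:
        active = [name for name, state in paths.items() if state == "active"]
        if not active:
            raise RuntimeError("all paths down")
        path = select_path(active)      # PSP chooses the path
        if send(path):                  # I/O is queued to the adapter
            return path
        paths[path] = "dead"            # SATP marks the failed path


paths = {"vmhba1:C0:T0:L8": "active", "vmhba2:C0:T0:L8": "active"}
# Pretend the first adapter's path is broken.
used = dispatch_io(paths, lambda active: active[0],
                   lambda path: path.startswith("vmhba2"))
print(used)   # vmhba2:C0:T0:L8
print(paths)  # the vmhba1 path is now marked dead
```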

 

Summary


We should now have a basic understanding of the common storage concepts, terms, methodologies, and transports that vSphere uses. We discussed the various types of storage that vSphere supports, including VMFS and NFS, along with the many transports it utilizes to access these, such as Fibre Channel and iSCSI. We went over how to identify storage LUNs, devices, and paths, and gained a basic understanding of the Pluggable Storage Architecture and how data flows from a virtualized application right through to its underlying storage.

In the next chapter, we will look at developing a proper troubleshooting methodology as well as review some of the most commonly used tools that can help us when troubleshooting vSphere storage.

About the Author

  • Mike Preston

    Mike Preston is an IT professional and an overall tech enthusiast living in Ontario, Canada. He has held all sorts of IT titles over the last 15 years including Network Technician, Systems Administrator, Programmer Analyst, Web Developer, and Systems Engineer in all sorts of different verticals, from sales to consulting. Currently, he is working as a Systems Analyst supporting the education market near his home in Belleville, Ontario. Mike has always had an intense passion for sharing his skills, solutions, and work with various online communities, most recently focusing on the virtualization communities. He is an avid blogger at blog.mwpreston.net and participates in many discussions on Twitter (@mwpreston). It's his passion for sharing within the virtualization community which has led to Mike receiving the vExpert award for 2012 and 2013. Mike has presented at VMworld, VMUGs, and various virtualization conferences numerous times, both as a customer and an overall evangelist, and has published different whitepapers and articles for various tech websites. His commitment to giving back to the community has resulted in his most recent venture of becoming a Toronto VMUG co-leader. He is a VMware Certified Professional in Datacenter Virtualization on both Version 4 and 5 of vSphere and is currently pursuing his VCAP5-DCA, which he hopes to accomplish by 2014.
