In this chapter, we'll cover:
What is vCenter Operations Manager?
Benefits of troubleshooting with vC Ops
Benefits of capacity planning with vC Ops
Feature comparison of different versions
What is vCenter Operations Manager Suite?
Using vC Ops with other solutions
vCenter Operations Manager, also known as vC Ops, is a VMware product that allows IT administrators to monitor their virtual environments efficiently. It also aids in design and capacity planning. vC Ops gives administrators and IT managers visibility into the entire virtual infrastructure and goes beyond the simple alarms and performance charts offered in vCenter Server alone. It offers dashboards, alerts, and several detailed reports to help us better assess our environments. We can even monitor several different vCenter environments by simply configuring vC Ops to connect with any vCenter Server that we have in our environment. vC Ops is a vApp consisting of two virtual appliances that can be downloaded from the VMware website. It comes with a management plugin that's easily installed on the vSphere Client. Alternatively, we can browse the management site directly if we prefer. With vSphere Version 5.1 and above, we can also use vSphere Web Client to manage vC Ops, and we can find embedded metrics within the summary pages for most of the objects in our virtual infrastructure.
As we can see from the following screenshot, the default dashboard offered in all the licensed versions of vC Ops above the Foundation edition holds a lot of information. We get an idea of the three major metrics, or badges, that vC Ops tells us about: Health, Risk, and Efficiency. Dashboards such as the one shown in the following screenshot can quickly give us insight into what is happening in our environment and visually point out any errors or issues that may warrant further investigation.
Although vCenter Server comes with its own alarms and performance charts, vC Ops actually learns about our environment and reports alerts based on that. In fact, it's recommended that we let vC Ops run for a month after the initial installation before we start looking at the metrics and going through the reports. vCenter Server has several alarms that will show up on our vSphere Client, but we need to set these alarms with hard triggers. For example, in the following screenshot, we can see the vCenter 5.1 alarm triggers for Host memory usage in vSphere Web Client. It shows that if the memory usage for a physical host is above 90 percent for five minutes or longer, it will give us a warning. If the host memory usage is above 95 percent for five minutes or longer, it will give us a critical error.
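The hard-trigger behavior described above can be sketched in a few lines of Python. This is only an illustration of the logic, not how vCenter implements its alarms: the `StaticAlarm` class, the `evaluate` function, and the assumption of one usage sample per minute are all hypothetical. Only the 90 percent, 95 percent, and five-minute values come from the alarm in the screenshot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticAlarm:
    """Hypothetical model of a hard-trigger alarm (values from the example above)."""
    warning_pct: float = 90.0
    critical_pct: float = 95.0
    hold_minutes: int = 5


def evaluate(samples, alarm=StaticAlarm()):
    """Return 'critical', 'warning', or 'ok' for per-minute usage samples.

    `samples` is a list of usage percentages, oldest first, one per minute
    (an assumption made for this sketch). The alarm fires only if every
    sample in the last `hold_minutes` minutes is above the threshold.
    """
    window = samples[-alarm.hold_minutes:]
    if len(window) < alarm.hold_minutes:
        return "ok"  # not enough data to say the condition has held long enough
    lowest = min(window)
    if lowest > alarm.critical_pct:
        return "critical"
    if lowest > alarm.warning_pct:
        return "warning"
    return "ok"


if __name__ == "__main__":
    print(evaluate([96, 97, 96, 98, 97]))  # sustained above 95 percent
    print(evaluate([91, 92, 93, 96, 97]))  # sustained above 90 percent only
    print(evaluate([50, 96, 96, 96, 96]))  # high for only four minutes
```

Note that the thresholds are fixed numbers. Nothing in this logic knows whether 96 percent is normal for the object being watched, which is the limitation the rest of this section discusses.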
For an alarm like this, a rigid or static trigger threshold may be appropriate. Memory should really not be at more than 95 percent utilization for too long. In that situation, we would want to add more memory to that host or perhaps vMotion VMs to another host if we have that option. However, what if we have an alarm triggered for CPU usage of a virtual machine? If this virtual machine consistently runs with high CPU usage because it's supposed to, vCenter will still tell us there is a critical error. Since vC Ops actually learns our environment, it will tell us that this is not anomalous behavior, and we may not need to worry about it. Another example of when this is useful would be a VM that routinely runs scheduled tasks that cause CPU or memory utilization to be high for a brief time during the day. vCenter alarms would trigger every day or every time this happens. vC Ops will learn this behavior, thereby reducing the barrage of alerts admins receive every day. vC Ops will still tell us that CPU usage runs high via badge scores, such as Workload or Stress, so we don't have to worry about missing information either.
Other than immediate notifications and alarms found throughout the dashboards and reports, vC Ops also gives us visibility into longer-term issues and helps us know whether it's safe to add or remove VMs as well as physical resources to our environment.
As mentioned in the previous section, vC Ops actually understands our environment and reports anomalous behavior. This is not to say that vC Ops will stay quiet if a VM is always using 100 percent of its storage; it will still let you know that. It will also let you know how long it's been happening and the normal range for the VM. If we zoom in on the Workload badge, as shown in the following screenshot, we can see that it shows CPU usage, memory usage, disk I/O, and network I/O. The blue bar above each graph shows the normal range for each metric. If a metric were outside of that normal range, that would indicate anomalous behavior. This can be very helpful for troubleshooting because now we can dive in and see what's changed. From vCenter Server alone, we can see some historical data, and we can see real-time metrics, but without doing some pretty intense math, we won't know the normal range.
Another benefit you get with vC Ops, which you wouldn't necessarily see in the vCenter Server performance data, is that you can check whether VMs are undersized. An undersized VM is a virtual machine with fewer compute resources than it actually needs to perform properly. Again, this is not based solely on random peaks or bursts, but rather on historical and present data that has been run through algorithms, and vC Ops lets us know how much compute the VM should be assigned for it to work efficiently. So, for instance, if an application is running slow, or is consistently slow at particular times of the day, we would be able to open vC Ops, highlight the affected machine, and then go to the Planning tab. From here, we can see how much time this VM has been running without enough memory or CPU, for example, and it will also tell us how many additional resources it recommends.
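To make the idea of a normal range concrete, here is a deliberately simple sketch. It is not the analytics vC Ops uses, which the text does not describe; as a stand-in it assumes a range of the historical mean plus or minus two standard deviations. The function names and the sample data are hypothetical.

```python
import statistics


def normal_range(history, k=2.0):
    """Return (low, high) as mean +/- k standard deviations of past samples.

    This is a stand-in technique chosen for illustration only; it is an
    assumption, not the algorithm vC Ops applies.
    """
    mean = statistics.fmean(history)
    spread = statistics.pstdev(history)
    return mean - k * spread, mean + k * spread


def is_anomalous(value, history, k=2.0):
    """True if `value` falls outside the range learned from `history`."""
    low, high = normal_range(history, k)
    return value < low or value > high


if __name__ == "__main__":
    # Hypothetical CPU usage history (percent) for a VM that is always busy.
    busy_vm = [95, 96, 97, 96, 95, 97, 96, 96]
    # Hypothetical history for a VM that is normally quiet.
    quiet_vm = [10, 12, 11, 9, 10, 11, 12, 10]

    print(is_anomalous(96, busy_vm))   # high, but normal for this VM
    print(is_anomalous(60, quiet_vm))  # below any static alarm, yet unusual
```

The two cases show why a learned range differs from a fixed threshold: 96 percent is unremarkable for the busy VM, while 60 percent on the quiet VM would never trip a 90 percent alarm but is far outside what that VM usually does.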
One of the most interesting benefits is when you pair vC Ops with vCenter Configuration Manager. Again, let's say we have an application that's all of a sudden running slow or sporadically. If we open vC Ops and highlight the problematic VM again, we can find the recent events and tasks that have been performed on that VM. On this page, we can also see a graph with the performance of the VM. We can see where the performance spiked and also the events that correlated with that timeframe. Perhaps the event would be something similar to "VM RAM was changed from 10 GB to 2 GB". Even before we use vCenter Configuration Manager, we've narrowed down the issue to being lack of memory. Now, if we check with vCenter Configuration Manager, we may be able to see which administrator made that change, if that change was made from vCenter Configuration Manager.
The last benefit I'm going to bring up here, though there are certainly more, is the easy way to find relationships between VMs and other vSphere inventory objects. So, why is finding the relationship between a VM and what it's connected to important? Let's look at our previous two scenarios, where we had an application running slow on a VM. If we were to look at the VM, it could show that things are running slow, but we may be unable to find the reason immediately. However, if we look at the relationships between them, there may be a common denominator, such as a datastore or host, that is actually causing the issue. We may see that all the VMs on that particular datastore or host are running poorly, but if we correct the error at the root of the problem, we'll correct the issues on all the VMs connected to it. Using vCenter alone, we might have taken a lot longer to figure out this correlation, but because we can see all the relationships mapped out for us, it's easier to do a root cause analysis. See the following two screenshots for illustrations of how this might look. The following example is of the Overview section of the Environment tab. It shows all of the elements across the environment from the top down but highlights those related to, for example, a selected VM.
The following example, found under the Relationships section of the Environment tab, gives you a different view. It shows only the components that are in a direct relationship with the component you have selected in the left-hand side pane.
VMware used to have a solution called vCenter Capacity IQ. In early 2012, VMware stopped selling Capacity IQ and put all of the capacity planning features from Capacity IQ into vC Ops. We can upgrade vCenter Capacity IQ licenses to vC Ops licenses. vCenter itself doesn't really offer much by way of capacity planning. Obviously, we can look at how much space we have free on our datastores as well as how many compute resources we have free, but it would still be an estimate. With vC Ops, we take the guesswork out of it with the use of oversized VM reporting and what-if scenarios.
The great thing about vC Ops is that it actually has a lot of capacity management features in every view and dashboard. No matter where we've drilled in, we'll be able to see information about how much storage or compute resources are left. However, most of the capacity planning features can be easily found under the Planning and Analysis tabs. Here, we can find information, not only on how much storage or compute a VM is using, but also what the trends have been for future planning purposes. For instance, you might be able to see that a datastore has been losing about 2 GB of free space every week. If the datastore is 1 TB, vC Ops can estimate when we might run out of space.
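The datastore example is a straight-line projection, which can be sketched as follows. This is a minimal illustration rather than the trending vC Ops performs. The function names are hypothetical, and so is the 400 GB of free space: the text gives the datastore size (1 TB) and the weekly loss (about 2 GB) but not how much space is currently free, which is the figure the projection actually needs.

```python
def weekly_loss(free_gb_samples):
    """Average week-over-week drop in free space, from weekly samples (oldest first)."""
    if len(free_gb_samples) < 2:
        raise ValueError("need at least two weekly samples")
    return (free_gb_samples[0] - free_gb_samples[-1]) / (len(free_gb_samples) - 1)


def weeks_until_full(free_gb, loss_per_week_gb):
    """Linear estimate of weeks remaining; None if free space is not shrinking."""
    if loss_per_week_gb <= 0:
        return None
    return free_gb / loss_per_week_gb


if __name__ == "__main__":
    # Hypothetical weekly free-space readings (GB) on a 1 TB datastore.
    samples = [408, 406, 404, 402, 400]
    loss = weekly_loss(samples)                    # 2.0 GB per week
    print(weeks_until_full(samples[-1], loss))     # 200.0 weeks at this rate
```

A real trend is rarely this smooth, so a projection like this is best treated as a rough planning figure that should be recomputed as new samples arrive.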
Much like the undersized VM analysis discussed in the troubleshooting section, we also have data on which VMs are oversized. Many times, application owners will ask for outrageous amounts of CPU or memory. There are also cases where we perform a physical-to-virtual migration to convert a physical server to a virtual server and just leave the original amount of compute resources in place even if it's not necessary. vCenter Server is never going to tell us that we've overallocated memory or CPU, for example, to a VM. vC Ops, however, will show us a full report on the VMs from which we can reclaim compute resources and how much can be reclaimed. This may seem like a small reclamation of resources, but let's say we have 100 VMs, and 25 of them have two more vCPUs than they need. We can then reclaim 50 vCPUs as well as reduce CPU contention within our environment.
Probably the most interesting benefit of using vC Ops to do our capacity planning is the what-if scenarios. We can actually click on a link under the Planning tab to pull up a what-if scenario wizard. For example, with this wizard, we can manually input the number of VMs we want to add, and it will output how that will affect the current environment. We can also have it automatically input the variables using trending analytics for our environment. vC Ops will take a look at the average size of our VMs, analyze any pertinent historical data, and then tell us if we have enough resources to add a certain number of VMs. This is an incredibly powerful tool that most normal admins would not be able to replicate through the use of a simple script with simple mathematics.
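The reclamation arithmetic from the example can be written out as a short sketch. The inventory below is made up to match the numbers in the text (100 VMs, 25 of them carrying two surplus vCPUs). The `allocated` and `recommended` fields are hypothetical names, not fields from a vC Ops report.

```python
def reclaimable_vcpus(vms):
    """Sum the vCPUs allocated beyond the recommended size across all VMs."""
    return sum(max(vm["allocated"] - vm["recommended"], 0) for vm in vms)


if __name__ == "__main__":
    # Hypothetical inventory: 25 oversized VMs and 75 right-sized VMs.
    oversized = [{"allocated": 4, "recommended": 2} for _ in range(25)]
    right_sized = [{"allocated": 2, "recommended": 2} for _ in range(75)]

    print(reclaimable_vcpus(oversized + right_sized))  # 25 * 2 = 50
```

The `max(..., 0)` guard keeps undersized VMs, where the recommendation exceeds the allocation, from reducing the total.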
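For a sense of what the manual-input case involves, here is the naive version of the question "will these VMs fit?". It is a sketch under stated assumptions and covers none of the trending analytics or historical analysis that make the wizard valuable. The function name, parameters, and all the capacity figures are hypothetical.

```python
def what_if_add_vms(count, vm_cpu_ghz, vm_mem_gb, free_cpu_ghz, free_mem_gb):
    """Naive check of whether `count` identical VMs fit in the remaining capacity.

    Ignores overcommitment, HA reservations, storage, and demand trends;
    it only compares requested totals against free totals.
    """
    cpu_left = free_cpu_ghz - count * vm_cpu_ghz
    mem_left = free_mem_gb - count * vm_mem_gb
    return {
        "fits": cpu_left >= 0 and mem_left >= 0,
        "cpu_ghz_left": cpu_left,
        "mem_gb_left": mem_left,
    }


if __name__ == "__main__":
    # Hypothetical cluster with 40 GHz of CPU and 128 GB of memory free.
    print(what_if_add_vms(10, vm_cpu_ghz=2, vm_mem_gb=8,
                          free_cpu_ghz=40, free_mem_gb=128))
    print(what_if_add_vms(20, vm_cpu_ghz=2, vm_mem_gb=8,
                          free_cpu_ghz=40, free_mem_gb=128))
```

In the second call, CPU fits exactly but memory does not, so the result reports that the VMs do not fit. The gap between this arithmetic and a projection based on real usage history is exactly what the what-if wizard fills.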
The main focus of this book is vC Ops, although we do get into some of the other components of the suite. VMware has historically sold the current components of the suite separately. Recently, though, they've decided it would be beneficial to bundle them in a suite. The components of the suite are as follows:
vCenter Operations Manager
vCenter Configuration Manager
vCenter Infrastructure Navigator
vCenter Chargeback Manager
For more information on these components, check the documentation given on
The versions of the vC Ops suite should not be confused with the versions of the vCloud suite. They are separate items.
It should be noted that VMware did announce during VMworld 2013 that vCenter Operations Manager Suite will be included in all vCloud Suite 5.5 editions. At the time of writing this book, vCloud Suite 5.5 and vSphere have not been released.
By looking at the chart, Foundation and Standard look the same, as do Advanced and Enterprise. However, that's not actually the case. Currently, the Foundation license is included in every version of vCenter. It does not give us insights using historical data; it only reports in real time. However, it will store the historical data in case we decide to upgrade the license at a later date. Foundation doesn't really offer much by way of capacity planning either. It's really more of an extension of the alerts that we already get from vCenter.
As shown in the chart, Standard gives us all that is included in Foundation, as well as the capacity features. It is basically a full version of vCenter Operations Manager. The Advanced and Enterprise versions give us what is really more of a suite with the components mentioned earlier. The difference between them is that Enterprise gives us more OS- and application-level monitoring, whereas Advanced really just gives us VM-level monitoring.
There are several plugins or adapters for vC Ops that will extend our monitoring capabilities even further. One plugin that came out in the last year or so is vCenter Operations Manager for Horizon View. Through the use of vC Ops adapters, we can now monitor our Horizon View virtual desktop implementations. This provides custom dashboards that give us a lot more insight than we've had before. If you've ever run a VDI environment, you'll be well aware that the ability to pinpoint problem areas quickly is a necessity. Another VMware solution it will connect to is vCloud Director 1.5.0 and above.
There are several storage adapters for vC Ops as well. Among these are the EMC Smarts Adapter, EMC Symmetrix Adapter, EMC VNX Adapter, and NetApp Adapter. There are also several monitoring solutions that vC Ops can connect with to get a view of your whole environment, such as HP BAC Adapter, HP SiteScope Adapter, IBM Tivoli Monitoring Adapter, and Microsoft SCOM Adapter. It will also plug into Oracle Enterprise Manager to help you monitor your Oracle databases.
In most cases, after connecting vC Ops to these adapters, we need to browse to a custom site other than our vC Ops management site. These sites will have custom dashboards set up with information from the product(s) we are connected to. At present, this information does not really affect the regular vC Ops dashboards. You will still see the same information there that you would see if you did not have the adapters installed and collecting data.
In this chapter, we discussed what vCenter Operations Manager is. It's a solution that allows us to monitor our virtual infrastructure as well as plan for future capacity issues that may arise. We then discussed some of the benefits of troubleshooting our virtual environments as well as doing some capacity planning with the help of vCenter Operations Manager. It's also important to keep in mind that there are several different versions of vC Ops, and it is possible to get it in a bundle with vCenter Configuration Manager, vFabric Hyperic, vCenter Infrastructure Navigator, and vCenter Chargeback Manager. Finally, we went over some of the other hardware and software solutions that we can integrate into our vCenter Operations Manager and/or suite, such as storage arrays and other third-party monitoring solutions.
In the next chapter, we'll actually get started with installing vCenter Operations Manager. Along with this, we'll discuss how to prepare your environment and vCenter Server specifically. We'll also cover some practical information about changing or adding licenses, as that can be a bit of a hassle with the current version of vC Ops. We'll end by configuring some basic setup items to get vC Ops running and usable.