6 Tips for Troubleshooting VMware vSphere 5

Performance Monitoring Tools

The vSphere performance charts are available whether you are connected directly to an ESXi host or to vCenter Server. They provide a lot of useful information, even though they do not expose all of the counters that you will find in esxtop.

The host-based tool esxtop has some inherent advantages over the vSphere performance charts and third-party tools when it comes to performance analysis. One big advantage is that esxtop incurs very little overhead on the ESXi host. Because esxtop is lightweight and has a small footprint, it is an excellent tool for measuring performance. If poor performance is affecting connectivity to the host, you can use resxtop (remote esxtop). Another advantage of esxtop is that you can export the data to a comma-delimited file.
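Once the data is in a comma-delimited file, it can be summarized with a few lines of script. The following is a minimal sketch, assuming the capture was taken in batch mode (for example `esxtop -b -d 5 -n 12 > capture.csv`) and that the first row holds the column names. The column names in the sample are made up for illustration; real captures use much longer counter names.

```python
import csv
import io
from statistics import mean


def column_averages(csv_text, name_contains):
    """Average every numeric column whose header contains `name_contains`."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = [i for i, name in enumerate(header) if name_contains in name]
    samples = {header[i]: [] for i in wanted}
    for row in reader:
        for i in wanted:
            try:
                samples[header[i]].append(float(row[i]))
            except (ValueError, IndexError):
                continue  # skip blank or non-numeric cells
    return {name: mean(values) for name, values in samples.items() if values}


# Hypothetical sample data, not real esxtop output.
SAMPLE = (
    '"timestamp","cpu_used_pct","net_dropped_rx"\n'
    '"t1","40.0","0"\n'
    '"t2","60.0","2"\n'
)

print(column_averages(SAMPLE, "cpu"))
```

Matching on a substring of the header keeps the helper usable even when the exact counter names differ between hosts.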

If You Suspect a Network Performance Issue, Check Some of the Following Metrics

  • If droppedRx (dropped receive packets) is greater than 0 for a host, look at the CPU utilization. High CPU utilization or CPU overhead can leave the VM too busy to accept new packets, or can delay their receipt. A possible solution is to increase the CPU reservation for the VM or to check whether the application supports adding more vCPUs.
  • If droppedTx (dropped transmit packets) is greater than 0, this usually means congestion at the physical layer. When a VM transmits packets, they are queued in the buffer of the virtual switch port until they are transmitted on the physical NIC. To prevent dropped transmit packets, look for ways to increase physical network capacity, such as adding more NICs or moving to 10 Gigabit Ethernet.
  • Make sure you have the correct network device driver installed on the VM. By default, if VMware Tools is not installed or running, the Vlance network adapter will be used. Vlance is a 10 Mbps NIC, which is fine for older 32-bit guest operating systems but not so useful on a 1 Gigabit Ethernet network.
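The dropped-packet checks above can be expressed as a small rule table. This is only an illustrative sketch: the function name and the wording of the hints are hypothetical, and the counter values are assumed to have been read from esxtop or the performance charts beforehand.

```python
def network_drop_hints(dropped_rx, dropped_tx):
    """Map dropped-packet counters to the follow-up checks described above."""
    hints = []
    if dropped_rx > 0:
        hints.append(
            "droppedRx > 0: check VM CPU utilization and overhead; "
            "consider a larger CPU reservation or more vCPUs"
        )
    if dropped_tx > 0:
        hints.append(
            "droppedTx > 0: likely physical-layer congestion; "
            "consider more NICs or 10 Gigabit Ethernet"
        )
    return hints


for hint in network_drop_hints(dropped_rx=3, dropped_tx=0):
    print(hint)
```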

Metrics to Check for a Possible Storage Problem

esxtop/resxtop, which comes with ESXi 5, is an excellent tool for measuring storage performance. Among the more significant statistics is the number of commands queued. To check these metrics, open a vSphere Management Assistant (vMA) console and start resxtop. Type d to switch to the Storage Adapter screen, then type f to select the fields to display: A (adapter name), F (queue stats), and K (error stats).

Other esxtop fields can also indicate a storage problem. To identify disk-related performance problems, look at throughput and latency.

  • Throughput fields in esxtop (READS/s + WRITES/s = I/O operations per second (IOPS)):
    READS/s – Number of disk reads per second
    WRITES/s – Number of disk writes per second
  • Latency fields in esxtop:
    DAVG – Average delay from the adapter to the target in ms; a value greater than 10-15 milliseconds indicates that the storage might be slow or overutilized
    KAVG – Average delay from the VMkernel to the adapter in ms; a value greater than 4 milliseconds indicates that the VMs are attempting to send more data to storage than the storage can handle
    GAVG – Average delay for the guest in ms, where GAVG = DAVG + KAVG
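As a worked example of these relationships, the sketch below computes IOPS and GAVG from sampled values and flags the thresholds mentioned above. The function and constant names are hypothetical, and the lower bound of the 10-15 ms DAVG range is used as the warning level; that choice is an assumption you may want to adjust.

```python
DAVG_WARN_MS = 10.0  # lower bound of the 10-15 ms range given above
KAVG_WARN_MS = 4.0


def assess_storage(reads_per_s, writes_per_s, davg_ms, kavg_ms):
    """Derive IOPS and GAVG, and list any latency thresholds exceeded."""
    findings = []
    if davg_ms > DAVG_WARN_MS:
        findings.append("DAVG high: storage might be slow or overutilized")
    if kavg_ms > KAVG_WARN_MS:
        findings.append("KAVG high: VMs are sending more than storage can handle")
    return {
        "iops": reads_per_s + writes_per_s,  # READS/s + WRITES/s
        "gavg_ms": davg_ms + kavg_ms,        # GAVG = DAVG + KAVG
        "findings": findings,
    }


print(assess_storage(reads_per_s=300, writes_per_s=200, davg_ms=12.0, kavg_ms=1.0))
```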

Log Files to View in vSphere 5

All log messages are now generated by syslog, and messages can be logged locally, on one or more remote log servers, or both. In addition, a given log server can log messages from more than one ESXi host.

To view ESXi system logs, in the vSphere Client menu bar, select View > Administration > System Logs.

/var/log/auth.log ESXi Shell authentication success and failure
/var/log/dhclient.log DHCP client service
/var/log/esxupdate.log ESXi patches and updates log
/var/log/hostd.log Host management service logs
/var/log/shell.log ESXi Shell usage, including enable/disable and every command entered
/var/log/sysboot.log VMkernel startup and module loading
/var/log/syslog.log Management service initialization, watchdogs, scheduled tasks, DCUI
/var/log/usb.log USB device arbitration events, such as discovery and pass-through to VM
/var/log/vob.log VMkernel Observation events, similar to vob.component.event
/var/log/vmkernel.log Core VMkernel logs (devices, storage/network device/driver events, and VM startup)
/var/log/vmkwarning.log VMkernel Warning and Alert log messages
/var/log/vmksummary.log ESXi startup/shutdown, uptime, VMs running, and service resource consumption

Logs from vCenter Server Components on ESXi 5

/var/log/vpxa.log vCenter vpxa agent logs
/var/log/fdm.log High Availability logs, produced by the Fault Domain Manager (FDM) service

Last-Level Cache (LLC) Performance Issue

By default, the ESXi CPU scheduler tries to spread the vCPUs of a symmetric multiprocessing (SMP) VM across as many last-level caches (LLCs) as possible, so that the VM gets as much LLC as possible and the scheduler can find room to run the workload. If you are running a very CPU-intensive workload, you might benefit from a different placement. To test this, set up a clone of the application VM, turn on the LLC setting below, and test the cloned application to see if performance improves. With the setting enabled, the CPU scheduler attempts to consolidate the vSMP VM into one CPU package, and therefore one shared LLC pool, so it runs the VM on the same package more often than it otherwise would.

Using the vSphere Client:

  • Power off the VM.
  • Right-click the VM and select Edit Settings.
  • Select the Options tab.
  • Under Advanced, click General, and then click the Configuration Parameters button on the right.
  • Click Add Row.
  • Add sched.cpu.vsmpConsolidate set to true.
  • Power on the VM.

From the command line interface:

  • Power off the VM.
  • Add the following line to the configuration file (.vmx) of the VM:
    sched.cpu.vsmpConsolidate = "true"
  • Power on the VM.
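If you need to make this change on several test clones, the edit to the .vmx file can be scripted. The following is a minimal sketch, assuming the file consists of simple `key = "value"` lines and that the VM is powered off. The helper name is hypothetical, and the sample file content is made up.

```python
def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with `key = "value"` set, replacing an existing entry."""
    new_line = f'{key} = "{value}"'
    lines = vmx_text.splitlines()
    for i, line in enumerate(lines):
        # Compare only the part before the first "=" with the requested key.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            break
    else:
        lines.append(new_line)  # key not present yet, so add it at the end
    return "\n".join(lines) + "\n"


# Hypothetical .vmx content for illustration only.
SAMPLE_VMX = 'numvcpus = "4"\nmemsize = "8192"\n'

print(set_vmx_option(SAMPLE_VMX, "sched.cpu.vsmpConsolidate", "true"))
```

Working on the text rather than the file directly makes it easy to review the result, and to keep a backup copy, before writing it back.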

Cannot Migrate a VM Using vMotion

Check the ESX(i) hosts to make sure that all of the vMotion requirements have been met, then check that the CPUs are compatible. If the VM is running a 64-bit operating system, the problem might be that the source host has Intel Virtualization Technology (VT) enabled in the BIOS and the destination host does not. If this is the case, change the BIOS settings so that both hosts match. Also, both hosts must have a VMkernel port on the same LAN. The IP address and subnet mask should match the network configuration for the VMkernel gateway.

  • To test the VMkernel TCP/IP stack, you could run the following from the command line:
    # vmkping <Destination_IP_address>
  • Any VLAN settings should match the VLAN configuration of the local LAN. VMkernel ports should have the vMotion check box enabled. There should be no router separating the hosts.
  • Check the VMs to make sure all of the requirements have been met.
  • Check that there are no local devices connected to the VM.
  • Check for CD-ROM mappings to an ISO file on local storage, floppy, SCSI, or USB devices, CPU affinity settings, and .vswp files stored on local storage.
  • Check that the VM has enough CPU and memory resources on the destination host.
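The requirement that no router separates the hosts can be sanity-checked before attempting a migration by confirming that both VMkernel addresses fall in the same subnet. This is a rough sketch using Python's standard ipaddress module; the function name and the addresses are hypothetical, and it complements rather than replaces a vmkping test.

```python
import ipaddress


def same_subnet(source_ip, dest_ip, netmask):
    """Return True if dest_ip lies in the subnet defined by source_ip/netmask."""
    network = ipaddress.ip_network(f"{source_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(dest_ip) in network


# Example VMkernel addresses, made up for illustration.
print(same_subnet("192.168.10.11", "192.168.10.12", "255.255.255.0"))
print(same_subnet("192.168.10.11", "192.168.20.12", "255.255.255.0"))
```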

Excerpted and available for download from Global Knowledge White Paper: Seven Tips for Troubleshooting VMware vSphere5

Join the Conversation

2 comments

  1. Emma Tameside

    This is some great reading. I recently enrolled on a VMware course and every piece of extra information I can get counts! I’m mainly taking the course for career progression, as the line of business I’m in really requires these types of skills.

    I’ll be reading over your blog a lot more from now, this was an extremely interesting read. Keep up the good work, and all the best!

  2. ravi

    Whenever I try to log in to the VMware vSphere Client with my credentials, I get the error below:
    “Interface not registered (Exception from HRESULT: 0x80040155)”
    Please tell me the solution.