VMware ESX/vSphere – How to Safely Kill an Unresponsive Virtual Machine (VM)

I was helping a customer whose VMware environment had become unresponsive, and we needed to shut down all the virtual machines. The typical methods (vSphere Client, RDP, etc.) did not work because the VMs themselves were hung.

I found a very helpful article on how to safely kill an unresponsive virtual machine, which worked quite well. It also kills the VM so that there is no corruption (on the VMware side). Here’s the link to the source article: http://sysadminhell.blogspot.com/2008/03/couple-of-vmware-issues.html

Here’s how you do it:
1. SSH into the ESX server that is currently running the affected VM (or you can use the console).
2. At the command prompt, enter: cat /proc/vmware/vm/*/names

This lists the running VMs on the host server you are logged in to. Note the vmid=## value for the affected VM.

vmid=1069 pid=-1 cfgFile="/vmfs/volumes/45…/server1/server1.vmx"
uuid="50…" displayName="server1"
vmid=1107 pid=-1 cfgFile="/vmfs/volumes/45…/server2/server2.vmx"
uuid="50…" displayName="server2"
vmid=1149 pid=-1 cfgFile="/vmfs/volumes/45…/server3/server3.vmx"
uuid="50…" displayName="server3"
vmid=1156 pid=-1 cfgFile="/vmfs/volumes/45…/server4/server4.vmx"
uuid="50…" displayName="server4"

3. At the command prompt, enter: less -S /proc/vmware/vm/1149/cpu/status

(Replace 1149 with the vmid of your VM from step 2.) The console screen clears and shows a wide table of numbers and stats. Press the right arrow key to scroll until you see the section about group. Example:

group
vm.1058

The number after "vm." (1058 in this example) is the group ID. With this ID number you can safely kill the VM without corrupting it.

4. At the command prompt, enter: /usr/lib/vmware/bin/vmkload_app -k 9 1058

(The number 1058 in the command is an example; your VM’s group number goes here.)

5. If you see “Warning: Apr 20 16:22:22.710: Sending signal ‘9’ to world 1058.” this means your VM has been closed successfully. You can now start your VM back up and run it.
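
If you have to do this for several VMs, the two lookups in steps 2 and 3 can be scripted. What follows is only a rough sketch of mine, not something from the source article. The helper names find_vmid and find_group are made up for illustration. The sketch assumes that display names contain no spaces and that the group shows up in cpu/status as a token like vm.1058, as in the example above. It only prints the kill command so you can check the ID before running it yourself.

```shell
# Print the vmid of the VM whose displayName matches $1.
# Reads the output of `cat /proc/vmware/vm/*/names` on stdin.
# Assumption: display names contain no spaces.
find_vmid() {
    awk -v name="$1" '
    {
        for (i = 1; i <= NF; i++) {
            # Remember the most recent vmid seen so far.
            if ($i ~ /^vmid=/) vmid = substr($i, 6)
            # When the display name matches, print that vmid and stop.
            if ($i == "displayName=\"" name "\"") { print vmid; exit }
        }
    }'
}

# Print the numeric part of the first vm.<number> token on stdin.
# Assumption: the group appears in cpu/status as a token like vm.1058.
find_group() {
    awk '
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^vm\.[0-9]+$/) { print substr($i, 4); exit }
        }
    }'
}

# Usage on the ESX host (hypothetical VM name "server3"):
#   vmid=$(cat /proc/vmware/vm/*/names | find_vmid server3)
#   group=$(find_group < /proc/vmware/vm/$vmid/cpu/status)
#   echo "/usr/lib/vmware/bin/vmkload_app -k 9 $group"
```

Printing the command instead of running it is deliberate: if either assumption does not hold on your host, you will see an empty or wrong ID before anything is killed.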