We’ve been experiencing some issues with file locks lately, caused by vMotions gone wrong and VMs that were somehow deleted while still running (?).
So here is a little manual on how to troubleshoot this.
First off, enable SSH on a host within the cluster. It doesn’t matter which one, but if you have a suspect that might be holding the lock, I would recommend picking that one.
Then log on to the host (yes, using SSH) and cd to the VM folder (cd /vmfs/volumes/<datastore name>/<vm name>).
Now let’s find out who is holding the lock:
vmkfstools -D <vmname>.vmx
The “-D” parameter is one of the hidden parameters; it dumps the file’s metadata, including the lock information.
The output will look something like this:
Lock [type 10c00001 offset 131880960 v 79, hb offset 3571712 gen 333, mode 1, owner 5873bf49-950f200a-23bc-0017a4776430 mtime 564987 num 0 gblnum 0 gblgen 0 gblbrk 0] Addr <4, 280, 11>, gen 12, links 1, type reg, flags 0, uid 0, gid 0, mode 100755 len 6017, nb 1 tbz 0, cow 0, newSinceEpoch 1, zla 2, bs 8192
Now the really interesting part is this section of the output:
gen 333, mode 1, owner 5873bf49-950f200a-23bc-0017a4776430 mtime 564987
The last segment of the owner UID (0017a4776430 in this example) is the MAC address of one of the vmnics on the vSphere host holding the lock. Use an RVTools export or check each host’s network adapters to find the culprit.
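To make the comparison easier, you can turn that last segment of the owner UID into the usual colon-separated MAC notation. A quick sketch, assuming a POSIX shell (the owner value is the example from the output above):

```shell
# Owner UID taken from the vmkfstools -D output (example value from above)
owner="5873bf49-950f200a-23bc-0017a4776430"

# The fourth dash-separated field is the MAC address of one of the
# locking host's NICs; insert a colon after every two characters.
mac=$(printf '%s\n' "$owner" | awk -F- '{print $4}' | sed 's/../&:/g; s/:$//')
printf '%s\n' "$mac"   # 00:17:a4:77:64:30
```

Searching your hosts (or the RVTools export) for that MAC is usually faster than comparing raw UIDs.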
Once you’ve found the host, enable SSH on it and log on as root.
Now run the following command to find the process locking the files:
esxcli vm process list
Scroll through the output until you find the VM, then note its World ID value.
Then kill that process:
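If you’d rather not eyeball the list, the World ID can be pulled out with awk. This is only a sketch: the VM name lockedvm and the IDs below are made up, and the here-doc stands in for the real command output. On a host you would capture esxcli vm process list into the variable instead:

```shell
vm="lockedvm"   # hypothetical VM name

# Sample output captured here for illustration; on a real host, replace
# the here-doc with: out=$(esxcli vm process list)
out=$(cat <<'EOF'
lockedvm
   World ID: 1226701
   Process ID: 0
   VMX Cartel ID: 1226700
   Display Name: lockedvm
   Config File: /vmfs/volumes/datastore1/lockedvm/lockedvm.vmx
EOF
)

# Remember the most recent "World ID" line; print it when the matching
# "Display Name" appears in the same block.
world_id=$(printf '%s\n' "$out" | awk -v vm="$vm" '
  /World ID:/ { wid = $3 }
  $0 ~ ("Display Name: " vm "$") { print wid; exit }')
printf '%s\n' "$world_id"   # 1226701
```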
esxcli vm process kill --type soft --world-id <worldID>
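If the soft kill doesn’t release the lock, esxcli also supports two harsher kill types: hard (immediate kill) and force (last resort). A dry-run sketch of the escalation, using a made-up World ID, that only prints the commands rather than running them:

```shell
WORLD_ID=1226701   # hypothetical; use the World ID you noted earlier

# Escalate from gentlest to harshest; run each printed command by hand
# and re-check with "esxcli vm process list" before moving to the next.
cmds=$(for t in soft hard force; do
  echo "esxcli vm process kill --type $t --world-id $WORLD_ID"
done)
printf '%s\n' "$cmds"
```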
And all is well again 🙂