We recently had a VMFS volume fill up due to over-provisioning, which caused the VMs on the datastore to stop responding. Typically the solution is easy: free up space on the volume by migrating VMs off the datastore, or increase the space on the underlying volume and expand the datastore. Since this was just a development environment, we did not have an enterprise-grade array with features such as volume autogrow, nor did we have the luxury of additional space to add to the volume. We realized we would have to move files off the datastore to free up space and allow the VMs to “breathe” again. We quickly discovered, however, that we could neither migrate VMs nor delete any files from the volume.
The vSphere client returned an error whenever we attempted a VM migration or a file deletion. We also tried removing files from the service console, which returned the following error:
rm: cannot remove <filename>: Input/output error
It appeared that the files were locked. Thankfully, we discovered a quick solution. One of the servers in the cluster held a lock on a file on the full volume, but the volume had no space left for the lock to be released. The only way to force the release manually was to attempt to remove a file from this volume from each host in the cluster in turn. The removal succeeds on whichever host is holding the lock and fails with the error above on the others.
VMware describes exactly this solution in the following KB article: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1011592
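The try-each-host approach described above can be sketched as a small shell function. This is only an illustration, not taken from the KB article: the host names, the file path, and the use of root SSH access to each host are all assumptions, so substitute your own values and pick a file you can afford to lose, such as an old log.

```shell
#!/bin/sh
# Hypothetical values: replace with your own hosts and an expendable file.
# The path is assumed to contain no spaces.
HOSTS="${HOSTS:-esx01 esx02 esx03}"
TARGET="${TARGET:-/vmfs/volumes/datastore1/old-vm/vmware-1.log}"
# Command used to reach each host (assumes SSH is enabled on the hosts).
RUN="${RUN:-ssh}"

try_remove() {
    # Attempt the delete from each host in turn and stop at the first
    # success, which should be the host holding the lock.
    for host in $HOSTS; do
        if $RUN "root@$host" rm "$TARGET" 2>/dev/null; then
            echo "removed on $host"
            return 0
        fi
        echo "failed on $host"
    done
    return 1
}
```

Calling try_remove prints one line per host tried and returns non-zero if the delete failed everywhere. If SSH is not available, the same rm can simply be run by hand from the console of each host.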
This worked for us: it freed up enough space to perform normal operations on the VMFS volume and get the stopped VMs running again.