VMware: Ultimate Guide: Powering on a virtual machine "Unable to access a file since it is locked" with RAW Device Mappings for Windows Clustering
Well, let me tell you about something weird. I ran into an incident on a server running Windows Clustering 2008. The Quorum disk showed up as failed, while it was still accessible and writable. While troubleshooting that issue I restarted the server. BUT, the server didn’t want to come back online and I received the error:
Unable to access a file <unspecified file name> since it is locked
Well, according to VMware there are a couple of options to consider:
- 1. The VM is locked by another server (thinks it’s still online)
- 2. The LUN connection is locked
- 3. The .lck (lock) file has not been removed.
- 4. The server has not migrated successfully
- 5. VM Data Recovery failed to back-up and locked the server Virtual Disk
- 6. A snapshot has not been successfully created/removed and hangs.
My options were:
- 7. RAW Device Mappings not configured correctly to the iSCSI controller.
- 8. An iSCSI controller was missing.
So, I reckoned there should be a guide out there that covers all of these options in one place. The first portion of this elaborate post has therefore been copied from others. You can find the links to their websites at the end of this post.
The second portion has been written by me and only needs to be considered if RAW Device Mappings are used.
The third portion covers how to open a service request with VMware.
Anyhow, if you think anything else needs to be in this blog post, don’t hesitate to contact me so we can keep it up-to-date, or make it even more elaborate.
Ah yes, at the moment I don’t have the time to write my own portion, so it’s empty and will be filled in this weekend (figures, right). I hope you’ll enjoy the guide, and please pay any respect due to the original posters (listed below).
I’ve created an index for easy-scrolling:
Index
Virtual machine does not power on because of missing or locked files.
Using the touch utility to determine if the file can be locked.
Identifying the ESX host that is locking the file.
Removing the .lck file (NFS Only).
Determining if the file is being used by a running virtual machine.
Determining if the .vmdk file is in use by other virtual machines.
Clearing the file lock by rebooting the ESX host.
How to remove failed dependencies / Snapshots of VMware Data Recovery.
Active snapshots but can’t see them in Snapshot Manager
We will first discuss how to correct the issues explained by VMware:
Virtual machine does not power on because of missing or locked files
Details
- Virtual machine cannot power on
- When powering on the virtual machine, you see one of these errors:
- Unable to open Swap File
- Unable to access a file since it is locked
- Unable to access Virtual machine configuration
- In the /var/log/vmkernel, you see entries similar to:
WARNING: World: VM xxxx: xxx: Failed to open swap file <path>: Lock was not free
WARNING: World: VM xxxx: xxx: Failed to initialize swap file <path>
- When opening a console within Lab Manager, you may receive the error:
Error connecting to <path><virtualmachine>.vmx because the VMX is not started
- Powering on the virtual machine results in the power on task remaining at 95% indefinitely
- Cannot power on the virtual machine after deploying it from a template
- The virtual machine reports conflicting power states between vCenter Server and the ESX host console
Solution
To prevent duplicate access to active virtual machine files, ESX hosts establish a lock on these files. In certain circumstances, these locks may not be released when the virtual machine is powered off. The files cannot be accessed if they are locked, and the virtual machine cannot power on.
These virtual machine files are commonly affected by lock issues:
- <VMNAME>.vswp
- <DISKNAME>-flat.vmdk
- <DISKNAME>-<ITERATION>-delta.vmdk
- <VMNAME>.vmx
- vmware.log
Identifying the locked file
To identify the locked file, try to power on the virtual machine. During power on, an error may display or be written to the virtual machine's log. The error and the log entry identify the virtual machine.
- Where applicable, open and connect the VMware Infrastructure (VI) or vSphere Client to the respective ESX host, VirtualCenter Server, or the vCenter Server hostname or IP address.
- Locate the affected virtual machine, and attempt to power it on.
- Open a remote console window for the virtual machine.
- If the virtual machine is unable to power on, an error on the remote console screen will display with the name of the affected file. If an error does not display, proceed to the following steps to review the vmware.log file of the virtual machine.
- Log in as root to the ESX host using an SSH client.
- To confirm that the virtual machine is registered on the server and obtain the full path to the virtual machine, run the command:
[root@esxhostname]# vmware-cmd -l
The output returns a list of the virtual machines registered to the ESX host. Each line contains the full path of a virtual machine's .vmx file. For example:
/vmfs/volumes/<UUID>/<VMDIR>/<VMNAME>.vmx
Note: Record this information as it will be required in the remainder of this process. This is the <path.vmx> referenced in the remainder of the article. It is also case-sensitive.
Verify that the affected virtual machine appears in this list. If it is not listed, the virtual machine is not registered on this ESX host. The host on which the virtual machine is registered typically holds the lock. Ensure that you are connected to the proper host before proceeding.
- To move to the virtual machine's directory, run the command:
[root@esxhostname]# cd /vmfs/volumes/<UUID>/<VMDIR>
- Use a text viewer to read the contents of the vmware.log file. At the end of the file, look for error messages that identify the affected file.
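Reading the end of vmware.log can also be scripted. The following is only a minimal sketch and not part of the VMware article: the function name find_lock_errors, the 100-line window and the search words lock and failed are assumptions chosen for illustration, so adjust them to the messages you actually see.

```shell
# Hypothetical helper: show lock-related lines near the end of a vmware.log.
# Usage: find_lock_errors /vmfs/volumes/<UUID>/<VMDIR>/vmware.log
find_lock_errors() {
    logfile=$1
    # Only the end of the log matters for the most recent power-on attempt.
    tail -n 100 "$logfile" | grep -i -E 'lock|failed'
}
```

The function prints nothing when no matching line is found, in which case reading the log by hand is still the safer option.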
Using the touch utility to determine if the file can be locked
The touch utility is designed to update the access and modification date and time stamp of a file or directory, and is bundled with VMware ESX. The touch command can be used to test the file and directory locking mechanism in the VMFS filesystem. Using touch is the preferred method because the changes to the resource are minimal and require a lock on the file.
To test the file or directory locking functionality, run the following command:
[root@esxhostname]# touch <filename>
Note: Performing a touch * command performs the operation on all files in the current directory.
The above command can result in the following outcomes:
- If the above command succeeds, it changed the date/time stamp and so verified that the file can be locked (it was locked, then unlocked). At this point, retry the operation to see if it succeeds.
- If the above command fails with a device or resource busy message, the file or directory locking mechanism is functioning, but something is maintaining a lock on the file or directory. If this message is reported, proceed to the next section.
- If another error message is reported, the data pertaining to file or directory locking may not be valid. If this is the case, collect diagnostic information from the VMware ESX host and submit a support request. For more information, see Collecting Diagnostic Information for VMware Products (1008524) and How to Submit a Support Request.
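The three outcomes above can be told apart in a small wrapper around touch. This is a sketch under assumptions: the function name check_lock and its output words are made up for illustration, and it assumes the busy case is reported with the text "Device or resource busy" on stderr.

```shell
# Hypothetical helper: classify the result of touching a file.
# Prints "ok", "busy" or "error: <message>" for the three outcomes above.
check_lock() {
    if err=$(touch "$1" 2>&1); then
        echo "ok"
        return 0
    fi
    case $err in
        # Lock mechanism works, but something else holds the lock.
        *"evice or resource busy"*) echo "busy"; return 1 ;;
        # Anything else may point at invalid locking data.
        *) echo "error: $err"; return 2 ;;
    esac
}
```

For example, check_lock <VMNAME>.vswp would print busy while another host still holds the lock on the swap file.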
Identifying the ESX host that is locking the file
Because a virtual machine can be moved between hosts, the host where the virtual machine is currently registered may not be the host maintaining the file lock. The lock must be released by the ESX host that owns the lock. This host is identified by the MAC address of the primary Service Console interface.
To identify the host and its Service Console interface by its MAC address:
- To report the MAC address of the lock holder, run the command:
[root@esxhostname]# vmkfstools -D /vmfs/volumes/<UUID>/<VMDIR>/<LOCKEDFILE.xxx>
This command writes the MAC address of any host that is locking the .vmdk file to the vmkernel log file. To locate this information, run the command:
[root@esxhostname]# tail /var/log/vmkernel
Look for lines similar to the following:
Apr 5 09:45:26 Hostname vmkernel: 17:00:38:46.977 cpu1:1033)Lock [type 10c00001 offset 13058048 v 20, hb offset 3499520
Apr 5 09:45:26 Hostname vmkernel: gen 532, mode 1, owner 45feb537-9c52009b-e812-00137266e200 mtime 1174669462]
Apr 5 09:45:26 Hostname vmkernel: 17:00:38:46.977 cpu1:1033)Addr <4, 136, 2>, gen 19, links 1, type reg, flags 0x0, uid 0, gid 0, mode 600
Apr 5 09:45:26 Hostname vmkernel: 17:00:38:46.977 cpu1:1033)len 297795584, nb 142 tbz 0, zla 1, bs 2097152
Apr 5 09:45:26 Hostname vmkernel: 17:00:38:46.977 cpu1:1033)FS3: 132: <END supp167-w2k3-VC-a3112729.vswp>
The second line displays the MAC address after the word owner, as the last segment of the owner value. In this example, the owner value ends in 00137266e200, so the MAC address is 00:13:72:66:E2:00.
- To determine if the MAC address corresponds to the host that you are currently logged into, see Identifying the ESX Service Console MAC address (1001167). If it does not correspond, you must establish a console or SSH connection to each host that can run this virtual machine. Once identified, use the host in the following procedures.
Note: If this process does not reveal the MAC address, it is possible that it is an NFS lock (for more information, see the section Removing the .lck file (NFS Only)), that the file is locked by a Console OS process, or that the file is locked by a VMkernel child world.
- If, due to power failure or other reasons, you are unable to obtain the MAC address, you must manually migrate the virtual machine to each host in the cluster. At each host, attempt to power on the virtual machine. When you reach the host that holds the lock, the virtual machine powers on successfully.
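Converting the owner value from the vmkernel log into a MAC address is easy to get wrong by hand. A minimal sketch follows, assuming the owner value has the dash-separated form shown in the example log above; the function name owner_to_mac is made up for illustration.

```shell
# Hypothetical helper: turn the owner value from the vmkernel log into a MAC address.
# Usage: owner_to_mac 45feb537-9c52009b-e812-00137266e200
owner_to_mac() {
    # The MAC address is the last dash-separated segment of the owner value:
    # upper-case it, put a colon after every two digits, drop the trailing colon.
    printf '%s\n' "$1" | awk -F- '{ print toupper($NF) }' | sed 's/../&:/g; s/:$//'
}
```

With the owner value from the example log, this prints 00:13:72:66:E2:00, which can then be compared with the Service Console MAC address of each host.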
Removing the .lck file (NFS Only)
The virtual machine's files may be locked via NFS storage. You can identify this by files denoted with .lck.#### (where #### refers to the World ID that has the file lock) at the end of the filename. This is an NFS file lock. These can be removed safely, as long as the virtual machine is not running on any other ESX host.
Note: VMFS volumes do not have .lck files. The locking mechanism for VMFS volumes is handled within VMFS metadata on the volume.
Determining if the file is being used by a running virtual machine
If the file is being accessed by a running virtual machine, the lock cannot be usurped or removed. It is possible that the lockholder host is running the virtual machine and has become unresponsive, or another running virtual machine has the disk incorrectly added to its configuration prior to your power-on attempts. See the next section, Determining if the .vmdk file is in use by other virtual machines.
To determine if the virtual machine processes are running:
- To determine if the virtual machine is registered on the server, run the command:
[root@esxhostname]# vmware-cmd -l
Note: If the virtual machine is registered on more than one ESX host, see Virtual machines appear to be running or registered on multiple ESX servers (1005051).
- To assess the virtual machine's current state, run the command:
[root@esxhostname]# vmware-cmd <path.vmx> getstate
If the output from this command is getstate() = on, the virtual machine has become unresponsive. To address this issue, see Troubleshooting a virtual machine that has stopped responding (1007819).
If the output from this command is getstate() = off, the ESX host may be unaware of the file lock.
- To stop the virtual machine process, see Powering off a virtual machine hosted on ESX host from the command line (1004340).
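When you have to check the state on several hosts, the two getstate results described above can be mapped to their meaning with a tiny helper. This is only a sketch: the function name explain_state and the wording of its messages are assumptions, and it expects the exact output strings quoted above.

```shell
# Hypothetical helper: explain the output of "vmware-cmd <path.vmx> getstate".
# Usage: explain_state "$(vmware-cmd <path.vmx> getstate)"
explain_state() {
    case $1 in
        "getstate() = on")  echo "on: the virtual machine has become unresponsive" ;;
        "getstate() = off") echo "off: the ESX host may be unaware of the file lock" ;;
        *)                  echo "unexpected output: $1" ;;
    esac
}
```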
Determining if the .vmdk file is in use by other virtual machines
A lock on the .vmdk file can prevent a virtual machine from starting. However, since virtual machine disk files can be configured for use with any virtual machine, the file may be locked by another virtual machine that is currently running.
To determine if the virtual machine's disk file is configured for use on more than one virtual machine, run:
[root@esxhostname]# egrep -i <DISKNAME>.vmdk /vmfs/volumes/*/*/*.vmx
Notes:
- This command attempts to locate the specified disk name among all .vmx configuration files for the virtual machines that are visible to the ESX host. A Device or resource busy message is printed for each virtual machine that is running but not registered to this ESX host. You must run this command on each ESX host in the infrastructure or specifically on ESX hosts that have access to the storage containing the virtual machine's files.
- If any additional virtual machines are configured to use the disk, determine if they are currently running. Powering off the other virtual machine using the disk file releases the lock. You must determine which virtual machine should have ownership of the file, then reconfigure your virtual machines to prevent this error from occurring again.
If the .vmdk file is not used by other virtual machines, proceed to the next section to remove the lock.
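The search for other virtual machines that use the same disk can be wrapped so that it prints only the matching .vmx files. This is a sketch, not the command from the VMware article: the function name find_disk_users is made up, and it uses grep -F so that the dot in the disk name is matched literally instead of as a wildcard.

```shell
# Hypothetical helper: list .vmx files under a volumes root that mention a disk.
# Usage: find_disk_users <DISKNAME>.vmdk /vmfs/volumes
find_disk_users() {
    disk=$1
    root=${2:-/vmfs/volumes}
    # -l prints only the names of matching files, -i ignores case,
    # -F treats the disk name as a fixed string.
    grep -l -i -F -- "$disk" "$root"/*/*/*.vmx
}
```

Errors are deliberately not suppressed, so any Device or resource busy messages for virtual machines running on other hosts still show up.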
Removing the lock
Caution: Follow these sections in order. If removing the .vswp file does not unlock the file, try clearing the lock with the touch command. If that does not resolve the issue, try rebooting the ESX host. Do not skip a section.
Removing the .vswp file
The .vswp file is used by running virtual machines as memory swap space. It is typically deleted when the virtual machine is powered off. If this file and its reference in the .vmx file still exist when powering back on, it can, but does not always, prevent the virtual machine from starting up again. If the virtual machine is not running, this file can safely be deleted.
To remove the .vswp file:
- To move to the virtual machine's directory, run the command:
[root@esxhostname]# cd /vmfs/volumes/<UUID>/<VMDIR>
- To delete the .vswp file, run the command:
[root@esxhostname]# rm /vmfs/volumes/<UUID>/<VMDIR>/<VMNAME>.vswp
Note: Depending on the lock held on this file, it may or may not be successfully removed.
- To back up the virtual machine's .vmx file, run the command:
[root@esxhostname]# cp <VMNAME>.vmx <VMNAME>.vmx.ba
- Use a text editor to open the .vmx file for the virtual machine. Locate and delete the line:
sched.swap.derivedName
- Save and exit the file.
- Try to power on the virtual machine. If the issue persists, proceed to the next section to remove the lock.
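The backup and edit of the .vmx file can be combined into one step if you prefer not to use a text editor. This is a minimal sketch under assumptions: the function name remove_swap_reference and the .bak backup suffix are made up for illustration, and the virtual machine must be powered off when you run it.

```shell
# Hypothetical helper: back up a .vmx file, then drop its sched.swap.derivedName line.
# Usage: remove_swap_reference /vmfs/volumes/<UUID>/<VMDIR>/<VMNAME>.vmx
remove_swap_reference() {
    vmx=$1
    # Keep an untouched copy next to the original; stop if the copy fails.
    cp "$vmx" "$vmx.bak" || return 1
    # Rewrite the file from the backup without the swap reference.
    grep -v '^sched\.swap\.derivedName' "$vmx.bak" > "$vmx"
}
```

If the result is not what you expected, copy the .bak file back over the .vmx file before trying anything else.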
Clearing the file lock by rebooting the ESX host
As a final troubleshooting step, try restarting the ESX host that holds the lock.
To restart the ESX host:
Note: Prior to restarting the entire VMware ESX host, restart the management agents, as they may have child worlds or processes that are maintaining locks or performing operations against the required files. For more information see Restarting the Management agents on an ESX or ESXi Server (1003490).
- Migrate all virtual machines from the host to new hosts.
- When the virtual machines have been moved, place the host in maintenance mode and reboot.
Warning: If you have only one ESX host or do not have the ability to migrate virtual machines, you must schedule downtime for all affected virtual machines prior to rebooting.
When the host has rebooted, start the affected virtual machine.
Powering on a virtual machine or trying to remove storage fails with errors "Unable to access a file since it is locked" or "Resource is in use"
Symptoms
You are experiencing one or more of these issues:
- Powering on a virtual machine fails
- If you try to power on a virtual machine, you see the error:
Unable to access a file since it is locked
- Removing an NFS datastore fails
- If you try to remove an NFS datastore, you see the error:
Resource is in use
Resolution
NFS locking on ESX uses its own locking protocol, not the Network Lock Manager (NLM) protocol. NFS locks are implemented by creating lock files on the NFS server.
Lock files are named .lck-<fileid>, where <fileid> is the value of the fileid field returned from a GETATTR request.
To power on a virtual machine and remove a locked NFS datastore:
- Log in to the ESX host using an SSH client. For more information, see Connecting to an ESX host using a SSH client (1019852).
- Navigate to the directory where the virtual machine resides.
- Determine the name of the NFS lock (.lck) file with the command:
ls -la
The output appears similar to:
total 47383432
drwxr-xr-x 1 root root 4096 Jul 17 07:28 .
drwxr-xr-x 1 root root 4096 Jul 14 08:37 ..
-rwxrwxrwx 1 root root 84 Jul 17 2009 .lck-1ab0140000000000
- Determine the ESX host that created the .lck file with the command:
strings <.lck filename>
The output appears similar to:
esxhost2.domain.local
- Connect to the ESX host that created the .lck file (esxhost2.domain.local, in this example) using SSH.
- Determine the ESX host's virtual machine ID (VMID) with the command:
vm-support -x
The output appears similar to:
VMware ESX Server Support Script 1.29
Available worlds to debug:
vmid=1077 VMNAME
- If the virtual machine is running, connect to the ESX host using vSphere Client and verify that it is powered on.
- If you want to remove an NFS datastore, power off the virtual machine and remove the datastore.
Note: If a virtual machine does not show as running in the vSphere Client or you are unable to stop it, terminate the virtual machine.
If you do not know which virtual machines are running on the NFS, you can find all the locked files inside the NFS datastore with the command:
# find /vmfs/volumes/<NFSDatastoreName>/ -iname ".lck*"
For example:
# find /vmfs/volumes/dc03-nfs/ -iname ".lck*"
The output appears similar to:
/vmfs/volumes/dc03-nfs/DebianVM/.lck-6604000000000000
/vmfs/volumes/dc03-nfs/DebianVM/.lck-7004000000000000
/vmfs/volumes/dc03-nfs/DebianVM/.lck-7204000000000000
/vmfs/volumes/dc03-nfs/New Virtual Machine/.lck-d71d000000000000
/vmfs/volumes/dc03-nfs/New Virtual Machine/.lck-da1d000000000000
Note: If you do not have any virtual machines on the NFS mount, the command # find /vmfs/volumes/<NFSDatastoreName>/ -iname ".lck*" may not help you.
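On a datastore with many virtual machines, the raw list of .lck files gets long. The following sketch reduces it to one line per directory that holds at least one lock file; the function name locked_dirs is made up for illustration, and directory names containing spaces (such as New Virtual Machine above) are handled.

```shell
# Hypothetical helper: print each directory on an NFS datastore that holds .lck files.
# Usage: locked_dirs /vmfs/volumes/<NFSDatastoreName>
locked_dirs() {
    # Strip the file name from every hit and de-duplicate the directories.
    find "$1" -iname '.lck*' | sed 's|/[^/]*$||' | sort -u
}
```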
If your virtual machines have ISO images mounted from the NFS, they have to be removed from the virtual machine settings to release the locks.
For example, run the command:
# find /vmfs/volumes/ISOs/ -iname ".lck*"
The output appears similar to:
/vmfs/volumes/ISOs/OS/redhat/.lck-1340d10000000000
/vmfs/volumes/ISOs/OS/redhat/.lck-1440d10000000000
/vmfs/volumes/ISOs/OS/redhat/.lck-1540d10000000000
/vmfs/volumes/ISOs/OS/redhat/.lck-0540d10000000000
/vmfs/volumes/ISOs/OS/redhat/.lck-0640d10000000000
/vmfs/volumes/ISOs/OS/redhat/.lck-0740d10000000000
Note: The output indicates that the ISOs are locked.
To find the virtual machines that have ISOs mounted, run the command:
# grep -i ".iso" /vmfs/volumes/*/*/*vmx
The output appears similar to:
/vmfs/volumes/4b73f167-c9f86304-50e6-001517ab928b/FreeNAS-RDM's/FreeNAS-RDM's.vmx:ide1:0.fileName = "/vmfs/volumes/0bd57674-c63b0c55/FreeNAS-i386-0.69RC1.3991.iso"
How to remove failed dependencies / Snapshots of VMware Data Recovery
The message "Unable to access file since it is locked" appears in the Recent Tasks panel.
Since the filename is unspecified, it makes it hard to figure out what the issue is.
Why did it happen? Well, first I have to tell you that something went wrong in the backup process of one of my VMs, specifically the Virtual Center Server (in my case, this is a VM). It could be ANY VM, but in this case it was this one. The last time, the same issue happened with a different VM, but the problem is the same.
I figured out that the job failed and couldn't remove the snapshot. (The VDR appliance creates a snapshot and copies the contents to the destination; after that, the snapshot is removed.) But something went wrong, and the snapshot was not removed. This caused all later jobs to fail as well. I tried manually creating a snapshot for the failed VM and then removing it with "Delete All" to force out the unused snapshots, but the procedure failed with the same error: "unable to access file since it is locked" ... mmmm, who is locking it, and which file?
I also tried moving the VM to another ESX host in the cluster, restarting the VMware management services, restarting the VM, and restarting the ESX host itself, but no luck.
OK, you got it, VDR is locking it ... but which file? I searched for hours on Google and in the VMware KB, but found nothing, so I opened a ticket with VMware. It took one day for them to call me back, just to acknowledge the ticket. It took another 3 days to get an email from them asking me to upload the log files (which I had already done when I opened the ticket!). It took another 3 days for them to call me, but I was busy and couldn't work with them. After three more days I called them again, but they told me they would call me back ... GRRR. I hate VMware support procedures and response times.
Don't worry, I solved it by myself, and let me tell you how:
1) Shut down the VDR appliance. This frees up the locked files of the VMs that were not successfully backed up.
2) Manually create a snapshot in every VM with the problem; "Delete All" snapshots will then work, and you won't get that error message again.
3) "Try" to power on the VDR (VMware Data Recovery) appliance... Oh, no, the same message again! And now I cannot power on the virtual machine!
4) I found that the VDR appliance "mounts" the hard disks of the VMs it is backing up. So, go to the VDR appliance and open "Edit Settings".
By default, the VDR appliance has only one hard disk, but mine showed three: the two extra hard disks correspond to the virtual machine whose backup failed!
Look at the first hard disk: check its disk file path, and note that its Independent disk mode checkbox is not checked.
Look at the extra hard disks added to the VDR appliance: their disk file path corresponds to the VM whose backup was in progress, and their Independent checkbox is checked.
Select the extra hard disks one by one and click "Remove". Be very careful here: select just remove, and DO NOT delete the files from disk.
Also, confirm that you selected the right hard disks! If you make a mistake, just hit Cancel and do it again.
5) Verify that only the single, correct hard disk is left in place.
6) Power on the VDR (VMware Data Recovery) appliance. Problem solved!
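Before removing anything in "Edit Settings", it can help to cross-check from the command line which virtual disks the appliance's .vmx file references. This is a read-only sketch and not part of the original procedure: the function name list_vmx_disks is made up, and it assumes disk entries look like the <device>.fileName = "..." lines seen in the grep output earlier in this post.

```shell
# Hypothetical helper: print the virtual disk paths referenced by a .vmx file.
# Usage: list_vmx_disks /vmfs/volumes/<UUID>/<VMDIR>/<VMNAME>.vmx
list_vmx_disks() {
    # Keep the quoted value of every "<device>.fileName" line that ends in .vmdk.
    sed -n 's/^[^ ]*\.fileName = "\(.*\.vmdk\)"$/\1/p' "$1"
}
```

Any path in the output that points into another virtual machine's directory is a candidate for one of the leftover disks described above.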
Active snapshots but can’t see them in Snapshot Manager
Today I had some problems backing up one of my VMs using VMware Data Recovery. After a few failures I looked into what was going wrong.. and yes.. hanging snapshots.
Trying to remove the snapshots was a little bit different. When I went into the “Snapshot Manager” there were no active snapshots and I couldn’t delete the files. But while browsing the Datastore I saw many active snapshots.
Current situation:
- You can “Revert to Current Snapshot” via menu..
- But.. in the “Snapshot Manager” I didn’t see the active snapshots and my options were grayed out.
- Browsing the Datastore there are many active snapshots..
Resolve the issue: Option 1
- Create a new snapshot via the menu, call it “Test”;
- Go to Snapshot Manager. You can now see the “Consolidate Helper –0” as active snapshot plus my own created snapshot: “Test”
- Select your “Test” snapshot and press “Delete All”
Hopefully your problem is solved!
Resolve the issue: Option 2
Of course, in my case it didn’t solve my problem. I received these errors:
Unable to access file <unspecified filename> since it is locked
A general system error occurred: Protocol error from VMX.
This problem occurred after creating a backup with VMware Data Recovery, so I decided to check my Data Recovery logs and VM settings. After checking my VM settings I saw a Hard disk 2, even though I didn’t think I had ever added a second disk to this VM.
- Hard Disk 2 = Disk file location [SAN OS] SDENERGIE01/SDENERGIE01_1.vmdk
- Hard Disk 2 = Independent
- Hard Disk 2 = Also mounted on “SDENERGIE01” VM
So the only explanation was that the disk was also mounted on a second VM as an Independent disk. That is why I received “Unable to access file <unspecified filename> since it is locked”.
- Power Off the VMware Data Recovery VM
- Select Hard Disk 2 (the extra hard disk) and click “Remove”. Be careful here: select just “Remove from Virtual Machine”, and don’t delete the files from disk!
- Power On “VMware Data Recovery” VM
- Repeat solution Option 1 (^ above)
Problem solved.
Configuring RAW Device Mapping for Fail-over Cluster
(Will be filled this weekend)
Opening a Service Request
If your problem still exists after trying the steps in this article:
- Gather diagnostic information. For more information, see Collecting diagnostic information for VMware Products (1008524).
- File a support request with VMware Support and note this KB Article ID in the problem description. For more information, see How to Submit a Support Request.
Note: For related information, see Cannot power on a virtual machine because the virtual disk cannot be opened (1004232).
Sources:
Tech Blogspot: unable to access file unspecified filename since it is locked
VMpros: Active snapshots but can’t see them in Snapshot Manager
VMware KB: Virtual machine does not power on because of missing or locked files