Take a deeper look at various aspects of Virtual Machines (VMs) in Runecast Academy’s elearning series on virtualization technologies.
Welcome back! So far in this series, we’ve covered the basics of what virtualization is and also what a hypervisor is. In part 4 we ask (and answer) two big questions and explain a few other things.
- A Virtual Machine… what’s that?
- Virtual disks, Swap Files, Host Logs, Memory Snapshots, and more!
- “What does it all mean?!” (AKA “This is not voodoo”)
Starting with this article, we'll focus on VMware vSphere (which lets us look at each topic more closely), but note that most of these principles apply to other virtualization platforms as well.
What is a virtual machine?
While the hypervisor takes the physical resources available to the host and presents them in slices to be consumed as required, virtual machines (VMs) are the objects that consume those resources. A VM is kind of like a regular machine – it has access to compute, storage, and networking resources. Also, like the hypervisor, the virtual machine takes the concepts that a physical computer is composed of and breaks them into some slices of CPU, some chunks of RAM, a virtual Network Interface Card (NIC)... and everything else is rendered as files.
While it is possible for a VM to directly access a block storage device (such as a SCSI disk), and historically there were some good reasons to do this, the vast majority of VM storage is consumed through the construct of the virtual disk.
A virtual disk is just a file on the host Operating System (OS). Actually, that’s not quite true: a virtual disk is usually a number of files, but these are presented as a single disk or disks. In the screenshot of the vSphere Client (more on management later in the series!) we can see a VM that has a single virtual disk:
This virtual disk is residing on the NFS datastore (more on storage later in the series, too!), and is called ‘Lab Jumpbox.vmdk’. It’s thin provisioned, which we’ll cover in more detail in our dedicated chapter on storage. Let’s take a look at that VMDK file in the command-line-interface (CLI).
Wow, so other than that being a big ol’ wall o’text… Didn’t we say that we only had 1 virtual disk? What’s the deal with the other files?
The ‘Lab Jumpbox.vmdk’ is actually a descriptor file, which is why it’s only 554 bytes in size. This file shows where the actual data is in the grand scheme of things. Let’s take a look inside.
OK, so that’s a lot of information. We can see that the actual disk data file is called ‘Lab Jumpbox-flat.vmdk’. This is where our data is written, and if we refer back to the previous screenshot we can see that this file is 40 GB in size. We can also determine some other things from the descriptor: the adapter type the disk is connected to, whether it’s thin provisioned, whether VMware Tools is installed in the VM (and, if so, which version), and the virtual hardware version (also known as the VM Compatibility level) of the VM. All of these things will be explained further in this series.
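To make the descriptor idea a little more concrete, here is a minimal Python sketch that pulls the same kinds of details out of descriptor-style text. The sample text is hand-written for illustration and is not the contents of the descriptor shown above: the sector count, adapter type, and hardware version are assumptions, and `parse_descriptor` is a hypothetical helper, not a VMware tool.

```python
import re

# Hand-written sample in the general shape of a VMDK descriptor.
# All values are illustrative assumptions, not copied from a real VM.
SAMPLE_DESCRIPTOR = '''\
# Disk DescriptorFile
version=1
createType="vmfs"

# Extent description
RW 83886080 VMFS "Lab Jumpbox-flat.vmdk"

# The Disk Data Base
ddb.adapterType = "lsilogic"
ddb.thinProvisioned = "1"
ddb.virtualHWVersion = "13"
'''

# Extent line: access mode, size in sectors, extent type, data file name.
EXTENT_RE = re.compile(r'^(RW|RDONLY|NOACCESS)\s+(\d+)\s+(\S+)\s+"([^"]+)"')
# Setting line: key = value, where the value may or may not be quoted.
SETTING_RE = re.compile(r'^([\w.]+)\s*=\s*"?([^"]*)"?\s*$')
SECTOR_BYTES = 512  # extent sizes are assumed to be in 512-byte sectors


def parse_descriptor(text):
    """Return (settings dict, list of extents) from descriptor text."""
    settings, extents = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        extent = EXTENT_RE.match(line)
        if extent:
            access, sectors, kind, filename = extent.groups()
            extents.append({
                "access": access,
                "size_bytes": int(sectors) * SECTOR_BYTES,
                "type": kind,
                "file": filename,
            })
            continue
        setting = SETTING_RE.match(line)
        if setting:
            settings[setting.group(1)] = setting.group(2)
    return settings, extents


settings, extents = parse_descriptor(SAMPLE_DESCRIPTOR)
print(extents[0]["file"], extents[0]["size_bytes"] // 2**30, "GiB")
print("thin provisioned:", settings.get("ddb.thinProvisioned") == "1")
```

The point is simply that the descriptor is small, human-readable text that points at the much larger data file. As the article says later about the VMX, reading these files is fine, but editing them by hand is best avoided.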
Hopefully, that makes sense? What was that at the back? A question about what all of those other files are? We’re glad you asked! The other files we see in the first terminal screenshot are the .vswp file, the .hlog file, the .vmsn file, the .nvram file, and the .vmx file. There are a few others in there, but they’re not core components of the VM.
The .vswp file is the VM’s swap file. Unless you’re reserving memory for the VM, it will be the same size as the amount of RAM allocated to the VM. It is required so that physical memory pages can be swapped out to disk in the event that physical memory becomes constrained. While undesirable (as memory swapped out to disk is orders of magnitude slower than physical RAM), this is a mechanism that can be used to allocate more memory to VMs than you actually have installed in your servers.
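As a rough illustration of the sizing rule, the sketch below treats the swap file as the VM's configured memory minus whatever memory is reserved for it, and also shows a simple overcommit ratio for a host. The function names and the example figures are hypothetical, and this is back-of-the-envelope arithmetic rather than anything the hypervisor exposes.

```python
def swap_file_size_mb(configured_mb, reserved_mb=0):
    """Estimate the .vswp size: configured memory minus any reservation."""
    if not 0 <= reserved_mb <= configured_mb:
        raise ValueError("reservation must be between 0 and configured memory")
    return configured_mb - reserved_mb


def overcommit_ratio(vm_configured_mb, host_physical_mb):
    """Ratio of memory promised to VMs versus RAM installed in the host."""
    return sum(vm_configured_mb) / host_physical_mb


print(swap_file_size_mb(8192))         # no reservation: swap matches RAM
print(swap_file_size_mb(8192, 8192))   # fully reserved: nothing to swap
print(overcommit_ratio([8192, 8192, 16384], 16384))
```

A ratio above 1.0 means the VMs have been promised more memory than the host physically has, which is exactly the situation where swapping to the .vswp file can come into play.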
The .hlog file is used for tracking migrations of the virtual machine from one datastore to another.
The .vmsn file is generated if you create a snapshot that includes the VM’s memory. If you capture a snapshot without capturing the memory, then when you restore the snapshot your VM will be in a powered-off state. While a memory snapshot is taken by default, it’s typically only used for forensics.
The .vmsd file is used to keep track of snapshot delta disks. When a snapshot is taken, the original base disk becomes read-only, and a new -delta.vmdk file is created that handles all writes. These files start small but can grow quickly if there is a large amount of disk I/O. For this reason, it is recommended that you keep snapshots only for as long as absolutely necessary. Snapshots are a great way to back out a change, but they absolutely are not a backup.
There are a few reasons why you shouldn’t use snapshots as backups or keep them for a long time.
Firstly, if you lose your VM you cannot restore from a snapshot (as the snapshot is a part of the VM construct).
Secondly, in order to claim back the disk space consumed by a snapshot, you may need free disk space equal to as much as 100% of the snapshot’s size while the blocks are committed into the base VMDK file. If you run out of disk space, you can’t just delete the snapshot.
Thirdly, there is a performance impact for using snapshots. As the chain gets longer (as you add more snapshots) the performance impact increases. If you want to keep your native performance, you really need to keep snapshots live for the smallest amount of time that you can.
There are other reasons to avoid long-lived snaps, or long snapshot chains, but these are the main ones.
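To see why long chains cost performance, here is a toy Python model of a base disk with a chain of delta layers. It is a deliberately simplified sketch: `ToyDisk` and its methods are hypothetical, blocks are just dictionary entries, and the real on-disk format and commit process are more involved than this.

```python
class ToyDisk:
    """Toy model: a base disk plus a chain of snapshot delta layers."""

    def __init__(self):
        self.base = {}      # block number -> data
        self.deltas = []    # oldest first; the last one receives writes

    def take_snapshot(self):
        # The current top layer is frozen; a new empty delta takes writes.
        self.deltas.append({})

    def write(self, block, data):
        target = self.deltas[-1] if self.deltas else self.base
        target[block] = data

    def read(self, block):
        """Return (data, layers_checked), searching newest layer first."""
        layers = list(reversed(self.deltas)) + [self.base]
        for checked, layer in enumerate(layers, start=1):
            if block in layer:
                return layer[block], checked
        return None, len(layers)

    def delete_all_snapshots(self):
        # Commit every delta into the base, oldest first, so newer data wins.
        for delta in self.deltas:
            self.base.update(delta)
        self.deltas.clear()


disk = ToyDisk()
disk.write(0, "original")
disk.take_snapshot()
disk.write(1, "after snap 1")
disk.take_snapshot()
disk.write(1, "after snap 2")

print(disk.read(0))   # only in the base, so every layer is checked first
print(disk.read(1))   # found in the newest delta straight away
disk.delete_all_snapshots()
print(disk.read(0), disk.read(1))
```

In this model, a block that was never rewritten after the snapshots has to be looked up through every layer in the chain, and deleting snapshots means copying delta contents back into the base, which is where the extra disk space and time go.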
The .nvram file holds the BIOS settings for the VM. If you’re using VM Encryption then this file also contains the encryption key to start the encrypted VM (and is in fact also encrypted, though with a different key).
The .vmx file and its companion .vmxf file hold the configuration of the VM in question. When you create a new VM in the vSphere Client, the selections that you make are written (among other data) into the VMX. It is recommended that you never edit these files by hand, as even a small syntax error can have catastrophic consequences. These files cannot be directly edited while the VM is powered on.
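As a quick recap of the file types covered above, the following Python sketch maps each extension to the role this article gives it. The `describe` helper, the `example.vswp` file name, and the short role descriptions are illustrative assumptions, and the table is not an exhaustive list of what you might find in a VM's directory.

```python
from pathlib import PurePosixPath

# Roles as described in this article; not an exhaustive list.
FILE_ROLES = {
    ".vmdk": "virtual disk (descriptor, flat data, or snapshot delta)",
    ".vswp": "swap file",
    ".hlog": "migration tracking",
    ".vmsn": "snapshot state, including memory if captured",
    ".vmsd": "snapshot tracking",
    ".nvram": "BIOS settings",
    ".vmx": "VM configuration",
}


def describe(filename):
    """Look up a file's role by its extension (case-insensitive)."""
    suffix = PurePosixPath(filename).suffix.lower()
    return FILE_ROLES.get(suffix, "not covered here")


for name in ["Lab Jumpbox.vmdk", "Lab Jumpbox-flat.vmdk",
             "example.vswp", "vmware.log"]:
    print(f"{name}: {describe(name)}")
```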
What does this all mean?
Hopefully, this has made things a little clearer, and you now understand that a Virtual Machine is not, in fact, voodoo. A Virtual Machine is a bunch of files, and runs on a Hypervisor of some sort (see the last chapter for more on what a Hypervisor is). This is where you run your applications, and the VM construct allows those applications to move from one physical server to another, so you can manage scheduled downtime for hardware upgrades without taking the application down.
It also means that you can leverage that same capability to move your existing workloads to a cloud service provider, such as VMware Cloud on AWS, Microsoft Azure VMware Solution, or Google Cloud VMware Engine, without needing to rebuild the application.
In our next Runecast Academy article, we dive into the world of storage.