Storage
In part 5 of this series, we take a look at various types of storage, the thinking behind them, and aspects that make each of them unique (or not).
Here’s a sample of some areas covered:
- What are Disk types, Filesystem, RAID & Cache (an elegant trick)
- Storage types including Local, Central, Block-level, and (2) more!
- A very wise Afterword
Runecast Academy Series 1 – Part 5. Storage
We know from Runecast Academy – 4. Virtual Machines that every virtual machine consists of a few files. There are several ways to store and access them. Some technologies will fit your datacenter and your workloads better than others, so let's fly through the storage world.
Disks
The path from the Virtual Machine (VM) virtual disk to the physical disk can be quite straightforward or very complex, but at the end of the path there will be either a Hard Disk Drive (HDD) with rotating magnetic platters or a Solid-State Drive (SSD) with chips storing the data. SSDs have been going down in price and the technology is proven, so they are gradually replacing HDDs in the server segment as well, not only because of their data access speed but also because they consume less electricity and generate less heat.
Filesystem
To organize the data on a disk, we format the disk with a filesystem. A filesystem can be designed for general data or to store something specific. VMware created the VMFS filesystem (currently at version 6), which is engineered to store VM disks and to be accessed from multiple servers at the same time.
RAID
You can have a single physical disk inside your server, but Murphy's Law says it will fail exactly when you need the data most. Also, you can't grow a single disk when you need to store more bytes, and its data transfer speed is limited. That is why we use RAID (Redundant Array of Independent Disks), a storage technology that combines multiple physical disks into one virtual disk. To create a RAID array, you need a RAID controller and at least two physical disks. Multiple schemes have been defined; in practice, you will most probably meet RAID 5 (at least three disks, survives the failure of one disk) or RAID 6 (at least four disks, survives the failure of two disks at the same time). When a disk fails, you replace it with a new one and the array rebuilds the missing data onto it.
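To make the single-disk protection of RAID 5 more concrete, here is a minimal sketch of the XOR parity idea behind it. It is an illustration only: the function names and sample blocks are made up for this example, real controllers rotate parity across all disks, and RAID 6 adds a second, differently computed parity that is not modelled here.

```python
from functools import reduce

def usable_disks(level: int, n_disks: int) -> int:
    """Disks' worth of usable capacity for RAID 5 or RAID 6 (illustrative helper)."""
    parity = {5: 1, 6: 2}[level]    # disks' worth of capacity spent on parity
    minimum = {5: 3, 6: 4}[level]   # minimum disk counts mentioned in the text
    if n_disks < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} disks")
    return n_disks - parity

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks (one per disk) plus one parity block on a fourth disk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second disk and rebuilding it from the survivors plus parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, combining the surviving blocks with the parity block yields the lost block again, which is why one failed disk can be rebuilt after replacement.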
Cache
We want to access our data as quickly as possible, but we don't want to spend a fortune on fast SSDs for the complete array. So we have an elegant trick in our pocket: we can cache the data using an SSD or a special memory chip. On a write, we quickly store the data in the cache layer and save it to the target disk later. We can also keep frequently accessed data in the read cache so it can be read without waiting for the slower disk.
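The caching trick can be sketched as a toy write-back cache with a least-recently-used (LRU) eviction rule. This is an assumption-laden illustration, not how any particular array implements it: the `CachedDisk` class, its methods, and the choice of LRU are all hypothetical, and the "slow disk" is just a dictionary.

```python
from collections import OrderedDict

class CachedDisk:
    """Toy write-back cache with LRU eviction in front of a slow 'disk' (a dict)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.cache = OrderedDict()  # block id -> data, least recently used first
        self.dirty = set()          # blocks written to cache but not yet to disk
        self.disk = {}              # stands in for the slow backing disk
        self.disk_reads = 0         # counts reads that had to go to the slow disk

    def write(self, block, data):
        # Fast path: acknowledge the write once it is in the cache.
        self.cache[block] = data
        self.cache.move_to_end(block)
        self.dirty.add(block)
        self._evict()

    def read(self, block):
        if block in self.cache:     # cache hit: no slow disk access
            self.cache.move_to_end(block)
            return self.cache[block]
        self.disk_reads += 1        # cache miss: fetch from the slow disk
        data = self.disk[block]
        self.cache[block] = data
        self._evict()
        return data

    def flush(self):
        # Save all pending writes to the target disk.
        for block in self.dirty:
            self.disk[block] = self.cache[block]
        self.dirty.clear()

    def _evict(self):
        # Drop least recently used blocks, writing dirty ones to disk first.
        while len(self.cache) > self.capacity:
            block, data = self.cache.popitem(last=False)
            if block in self.dirty:
                self.disk[block] = data
                self.dirty.discard(block)

store = CachedDisk(capacity=2)
store.write("a", b"1")
store.write("b", b"2")
store.write("c", b"3")  # cache is full, so "a" is written to disk and evicted
```

A real write cache also has to survive power loss (for example with battery or flash backing), which this sketch ignores entirely.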
Local storage
Local storage is a rather cheap option because you already have the server, usually with a RAID controller inside, so you just need to buy a few disks. Performance is typically not perfect but can be good enough. Configuration, management, and troubleshooting are very easy. The price you pay is the loss of high availability and flexibility.
Central storage
How about taking the RAID controller and disks out of the server, placing them in a specialized box, and connecting the server(s) to it via a network? That gives us central storage. It brings some more cables and boxes, but also more stability and flexibility to your datacenter. There are three ways to work with central storage: you can access the data as blocks, files, or objects.
Block-level storage
Block-level storage is the most universal one. From the storage box's point of view, all the data are simply blocks of ones and zeros. From the ESXi point of view, it looks like local storage that happens to live somewhere around the corner, and the ESXi server formats it with the VMFS filesystem.
We have multiple options for communicating with such storage:
The oldest, very stable, but slightly pricey method is a Fibre Channel (FC) Storage Area Network (SAN). You will need dedicated storage switches and host bus adapters (HBAs) in your servers. There is a good reason to use a SAN: it is designed to access the storage system quickly and safely. Fibre Channel is a lossless protocol, and a SAN lets you manage multipathing across all the paths between server and storage, as well as zoning (which ensures that each server accesses only the data designated for it).
In a smaller datacenter, you may want to use iSCSI. This protocol uses a standard TCP/IP network, so you can use your existing network infrastructure and network cards. You should consider using jumbo frames (Ethernet frames larger than usual); just make sure that they are enabled on the whole path between your ESXi host and the storage.
The newest option is FCoE (Fibre Channel over Ethernet). The idea behind this technology is to run Fibre Channel over a "normal" network infrastructure. The catch is that only some "normal" network switches and network cards actually support FCoE, and they need to be specially configured for it.
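The jumbo frames advice for iSCSI above boils down to one rule: the usable frame size on a path is set by the hop with the smallest MTU. The following sketch illustrates that rule under stated assumptions; the hop names and MTU values are hypothetical, and 9000 bytes is only a commonly used jumbo size, not a value taken from this article.

```python
JUMBO_MTU = 9000  # assumed jumbo frame size; check what your hardware supports

def effective_mtu(hop_mtus: dict) -> int:
    """The smallest MTU along the path limits the whole path."""
    return min(hop_mtus.values())

def hops_blocking_jumbo(hop_mtus: dict, wanted: int = JUMBO_MTU) -> list:
    """Names of hops whose MTU is too small for the wanted frame size."""
    return [name for name, mtu in hop_mtus.items() if mtu < wanted]

# Hypothetical path between an ESXi host and an iSCSI storage array.
path = {
    "esxi-vmkernel-port": 9000,
    "esxi-vswitch": 9000,
    "physical-switch": 1500,  # jumbo frames were forgotten here
    "storage-port": 9000,
}

limit = effective_mtu(path)
blockers = hops_blocking_jumbo(path)
```

In this made-up path, a single switch left at the default size is enough to prevent jumbo frames from working end to end, which is why every hop has to be checked.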
File storage
A second option is file access. In this case, the storage array presents a complete filesystem over NFS, and ESXi works with the files stored on it. NFS storage is connected via a standard TCP/IP network. vSphere supports NFS version 4.1, which lets you configure multipathing.
There is a never-ending fight between iSCSI and NFS fans. You will find tons of articles comparing these technologies in every way that they can be compared. The reality is that it is perfectly safe to use either of them; just set up your environment according to the vendor recommendations and VMware KBs.
Object-based storage
A somewhat special option is object-based storage. This storage understands that we are storing virtual machines, so it can work with the data accordingly. VMware gave us vSAN virtual storage, which is a typical object-based storage. Thanks to this solution, we don't need an external storage box: we can just place disks into our ESXi hosts, connect the hosts with a fast network, define global or per-VM policies, and we have "central storage" with very good performance and high availability. We will look a bit deeper into that in one of our later Runecast Academy articles.
Afterword
Whatever storage solution you choose, keep in mind that even the most bulletproof technology may surprise you. So it’s best to prepare a disaster recovery plan and back up everything. And don't forget that having a backup does not necessarily mean that you will be able to restore it to a consistent state if and when you need to, so test your restores.