Abstract
Here are some secrets, tips, and tricks for virtualizing your datacenter. We want to introduce some best practices for virtualization without being too biased toward one virtualization vendor or another. We'll use some common examples of products and tools that work with VMware's vSphere and Microsoft's Hyper-V, but with an eye toward virtualization in general rather than the specifics of any of the capable platforms that could be used. We will assume, however, that bare-metal hypervisors (virtualization platforms where the hypervisor is the OS) will be used, as opposed to running a hypervisor on top of an existing general-purpose operating system (which is great in a lab, but terrible for datacenter projects).
Sample
Plan
The dreaded word in IT seems to be Plan. In years of training and consulting, I've found that few people in IT like it, fewer do it, and even fewer like to do it. That causes big headaches, as without planning, anything will do. Simply buy whatever you have budget for, plug it all together, and go to work. The problem with this is that without proper planning and requirements gathering, there is a great chance that you will buy less than is really needed to get the job done, so performance will be poor, requiring more money to go in and fix it later. On the other hand, on the off chance that you buy too much, you end up lowering the ROI and increasing the TCO, just the opposite of the goals of virtualization.
So what do you need to know to properly plan for a datacenter virtualization project? First and foremost, you need to know what you are consuming today. For example, you'll need answers to questions like these (both on average and at peak periods):
How many GB of RAM are in use in each server? By this we mean actually used, not simply installed.
How many MHz (or GHz) are in use in each server? Again, it's not what is installed, but what is in use.
How many GBs (or TBs) of storage are used in each server?
How many MB/s or I/O operations per second (IOPS) are being consumed today?
What kind of storage is in use today? Is it Fibre Channel, Fibre Channel over Ethernet (FCoE), iSCSI, or NAS-based, or do you use local storage? If you use local storage today, will you use shared storage in the future? What local storage will be required, if any?
How much bandwidth is required on the network?
What kind of security is required for the data being stored?
We could go on and on, but you get the idea. Another important thing here is to figure out what is "normal" or average and when peak periods are, with a goal of not having VMs all peaking at the same time on the same server. Maybe they can be scheduled differently so the peaks are at different times (for example, when backups are run or antivirus scans are executed), or maybe they can be placed on different physical servers to spread the load and reduce the peak demand on any one server. If this is not possible, you'll need to buy more capacity to handle the larger peaks that the combined load will cause.
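To make the point about overlapping peaks concrete, here is a minimal sketch in Python. The VM names, the hourly CPU figures, and the helper functions are all hypothetical and invented for illustration; real numbers would come from whatever monitoring tool you already use. It shows how the average and peak for one server can be summarized, and how moving a scheduled job changes the combined peak on a host.

```python
# Hypothetical hourly CPU demand (MHz) for two VMs over a six-hour window.
# All numbers are invented for illustration only.
vm_demand_mhz = {
    "vm-a": [500, 500, 2000, 500, 500, 500],  # backup runs in hour 2
    "vm-b": [400, 400, 1800, 400, 400, 400],  # antivirus scan also in hour 2
}

def average_and_peak(samples):
    """Return (average, peak) for one server's usage samples."""
    return sum(samples) / len(samples), max(samples)

def combined_peak(profiles):
    """Return the highest total demand across all VMs in any single hour."""
    totals = [sum(hour) for hour in zip(*profiles.values())]
    return max(totals)

def shift(profile, hours):
    """Rotate a profile to model rescheduling a job `hours` later."""
    hours %= len(profile)
    return profile[-hours:] + profile[:-hours] if hours else list(profile)

avg_a, peak_a = average_and_peak(vm_demand_mhz["vm-a"])

# Both jobs run in the same hour, so the peaks stack on top of each other.
peak_together = combined_peak(vm_demand_mhz)

# Move the antivirus scan on vm-b two hours later and measure again.
staggered = dict(vm_demand_mhz)
staggered["vm-b"] = shift(vm_demand_mhz["vm-b"], 2)
peak_staggered = combined_peak(staggered)

print(f"vm-a average {avg_a:.0f} MHz, peak {peak_a} MHz")
print(f"combined peak, same schedule: {peak_together} MHz")
print(f"combined peak, staggered:     {peak_staggered} MHz")
```

With these made-up figures, staggering the two jobs lowers the combined peak that the host must be sized for, even though the total work done is unchanged.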
This section is closely related to the next three: planning so that appropriate hardware that works well together and is properly sized can be purchased; sizing storage to handle the load; and, once the system is in place and operational, planning for the future based on real-world conditions as they evolve (in other words, ongoing planning as opposed to the upfront planning discussed in this section). We'll discuss each in more detail in the next few sections.
Balance Hardware Components
It is very important to balance all of the hardware components properly; in other words, the goal is to keep all of the equipment roughly evenly loaded to minimize cost and maximize utilization (within reason).
For example, you don't want a 10 Gb Ethernet network to handle all of your iSCSI needs paired with a low-end iSCSI device that can't connect at 10 Gb, or an iSCSI array that can't push data effectively at 10 Gb. In this case, it would be better to save some money on the networking equipment and put it into better storage.
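The reasoning here is that a storage path is only as fast as its slowest part. The following Python sketch illustrates that idea under a simplifying assumption: the components sit in series, so end-to-end throughput is the minimum of their individual rates. The component names and the Gb/s figures are hypothetical.

```python
def find_bottleneck(components_gbps):
    """Return (name, rate) of the slowest component in a serial path.

    Assumes every byte passes through every component, so the
    slowest one caps the end-to-end throughput.
    """
    name = min(components_gbps, key=components_gbps.get)
    return name, components_gbps[name]

# Hypothetical path: a 10 Gb network feeding a low-end iSCSI array.
path = {
    "network_link": 10.0,     # 10 Gb Ethernet
    "array_port": 1.0,        # array can only connect at 1 Gb
    "array_sustained": 0.8,   # what the array can actually push
}

bottleneck, rate = find_bottleneck(path)
print(f"bottleneck: {bottleneck} at {rate} Gb/s")
```

In this invented case the network has far more capacity than the array can ever use, which is the imbalance described above.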
Likewise, on the CPU and memory side, the goal is to be balanced as well; in other words, enough RAM to run all of the applications and keep the CPUs fairly busy (averaging 60% to 75% is fairly normal). If there is a lot more RAM than the CPUs can effectively use (for example, CPU-intensive tasks that require modest amounts of memory), the extra RAM is wasted. On the other hand, if you have many machines that are not CPU-intensive and they all run on the same host, you may exhaust the available RAM, causing lots of swapping to disk and drastically reducing performance.
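A quick way to spot this kind of imbalance is to compare CPU and RAM utilization for a proposed host. The sketch below is one possible way to do that in Python; the VM class, the host size, and the workload numbers are all assumptions made up for the example, not figures from any particular platform.

```python
from dataclasses import dataclass

@dataclass
class VM:
    name: str
    cpu_mhz: float  # CPU actually in use, not installed
    ram_gb: float   # RAM actually in use, not installed

def host_utilization(vms, host_cpu_mhz, host_ram_gb):
    """Return (cpu_fraction, ram_fraction) of host capacity consumed."""
    cpu = sum(vm.cpu_mhz for vm in vms) / host_cpu_mhz
    ram = sum(vm.ram_gb for vm in vms) / host_ram_gb
    return cpu, ram

def limiting_resource(cpu_util, ram_util):
    """Name the resource that will run out first."""
    return "cpu" if cpu_util >= ram_util else "ram"

# Hypothetical host: 16 cores at 2500 MHz (40000 MHz total) and 128 GB RAM.
HOST_CPU_MHZ = 16 * 2500
HOST_RAM_GB = 128

# Ten hypothetical VMs that are light on CPU but heavy on memory.
vms = [VM(f"vm-{i}", cpu_mhz=400, ram_gb=12) for i in range(10)]

cpu_util, ram_util = host_utilization(vms, HOST_CPU_MHZ, HOST_RAM_GB)
print(f"CPU {cpu_util:.0%}, RAM {ram_util:.0%}, "
      f"limited by {limiting_resource(cpu_util, ram_util)}")
```

Here the host would run out of RAM while its CPUs sit nearly idle, so either more memory per host or a different mix of VMs on that host would give a better balance.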
The challenges of trying to balance everything well, while at the same time leaving some resources available to handle outages (both planned and unplanned) and future growth can be somewhat daunting. This is why planning is so important.
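As a rough illustration of how headroom for outages and growth feeds into sizing, the Python sketch below estimates a host count. It is only a sketch under stated assumptions: the 75% target utilization borrows the upper end of the range mentioned above, while the growth allowance, the single spare host, and all capacity figures are hypothetical choices that a real plan would replace with its own requirements.

```python
import math

def hosts_needed(total_demand, per_host_capacity,
                 target_utilization=0.75, growth=0.0, spare_hosts=1):
    """Estimate the host count for one resource (for example, GB of RAM).

    total_demand       -- measured demand today, summed over all workloads
    per_host_capacity  -- capacity of one host, in the same unit
    target_utilization -- fraction of each host you plan to actually use
    growth             -- expected growth as a fraction (0.2 means 20%)
    spare_hosts        -- extra hosts held back to absorb outages
    """
    usable_per_host = per_host_capacity * target_utilization
    future_demand = total_demand * (1 + growth)
    return math.ceil(future_demand / usable_per_host) + spare_hosts

# Hypothetical example: 900 GB of RAM in use today, 256 GB hosts,
# 20% growth expected, one spare host for planned or unplanned outages.
count = hosts_needed(900, 256, target_utilization=0.75,
                     growth=0.2, spare_hosts=1)
print(f"hosts to buy: {count}")
```

The same calculation would be repeated for CPU, storage, and network, and the largest result is the one that drives the purchase.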