Are Big Boxes Better For Server Virtualization?

Galen Schreck

From time to time, someone will ask me if it makes sense to purchase a large (32+ CPU) server as a big virtualization host running a VMware, Microsoft, or Citrix hypervisor.

In a word, I think the answer is "no".

Here are a few reasons why:

  • First, very few virtualization users that I’m aware of have selected anything larger than 4-socket systems. Massive servers with 32 processors simply cost too much on a per-VM basis, and since you end up dividing them up into smaller virtual machines anyway, there’s not much benefit in going with a massive system. Most hypervisors don’t support more than four virtual processors per virtual machine (including VMware -- you have to purchase the top-of-the-line Enterprise Plus edition to get 8-way virtual SMP).
  • Second, many of the HA features in large SMP systems can be provided by clustering or live migration capabilities in software.
  • Third, having all that capacity in one box used to make it easier to reallocate resources between partitions -- but live migration across physical boxes makes that unnecessary as well.
  • Lastly, some vendors have claimed that virtual environments built on big servers are easier to manage, since there are fewer physical endpoints to worry about. I think this is an older argument that dates back to when people purchased a Sun E10K to run a large number of websites. Ultimately, more automated management tools made it possible to run large numbers of scale-out servers at a similar cost.
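
To make the per-VM cost argument in the first point a bit more concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it (prices, VMs per host, the single spare host kept for failover) is a made-up placeholder rather than a vendor quote or a measured density, so treat it as a template for plugging in your own figures.

```python
from dataclasses import dataclass


@dataclass
class HostOption:
    name: str
    price: float       # hypothetical purchase price per host
    vms_per_host: int  # hypothetical number of VMs one host can carry


def cost_per_vm(option: HostOption, total_vms: int, spare_hosts: int = 1) -> float:
    """Hardware cost per VM, counting spare hosts kept for failover."""
    # Round up: a partially filled host still has to be bought.
    hosts_needed = -(-total_vms // option.vms_per_host)
    return (hosts_needed + spare_hosts) * option.price / total_vms


# Placeholder figures for illustration only.
big_box = HostOption("32-CPU server", price=400_000, vms_per_host=200)
small_box = HostOption("4-socket server", price=30_000, vms_per_host=25)

for option in (big_box, small_box):
    print(f"{option.name}: {cost_per_vm(option, total_vms=200):,.0f} per VM")
```

With these invented inputs the big box comes out far more expensive per VM, mostly because its spare doubles the hardware bill while the scale-out option adds only one small host. Different prices or densities could change the picture, which is the reason to run your own numbers.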

Even among modular servers, there is some debate as to how big a box you need. The recent release of the Intel Xeon 5500 allows you to manage a lot more memory with fewer physical CPUs. Until now, many companies purchased more processors than they needed -- simply so they could get enough RAM to give their VMs the maximum amount of memory. Those systems were still mostly memory bound, so now that the Xeon 5500 can address more DIMMs, we think that some firms will temporarily swing back towards systems with fewer CPUs. Plus, the CPUs have gotten much faster -- recent demonstrations by Intel show the new Xeon 5500-based servers running twice the workload of their prior models.
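
The "memory bound" point can be sketched as a simple sizing check: a host carries as many VMs as its scarcer resource allows. The function below and all of its inputs (core count, VMs per core, RAM sizes, RAM per VM) are illustrative assumptions, not figures from Intel or any hypervisor vendor.

```python
def vms_per_host(cores: int, vms_per_core: int, ram_gb: int, ram_per_vm_gb: int):
    """Return (VM count, limiting resource) for one host."""
    cpu_limit = cores * vms_per_core
    memory_limit = ram_gb // ram_per_vm_gb
    if memory_limit < cpu_limit:
        return memory_limit, "memory"
    return cpu_limit, "CPU"


# Hypothetical two-socket host, before and after a memory capacity bump.
print(vms_per_host(cores=8, vms_per_core=4, ram_gb=64, ram_per_vm_gb=4))
print(vms_per_host(cores=8, vms_per_core=4, ram_gb=144, ram_per_vm_gb=4))
```

In the first case the host runs out of RAM long before it runs out of CPU, which is the situation that pushed buyers towards extra sockets. In the second, the same CPU count is enough once the box can hold more memory.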

Excellent observations. Let me also add a fifth to your list: I/O bottlenecks. To get around them you need to add even more NICs, HBAs, and cables, which adds further cost and complexity and complicates migration, HA, and DR. For example, should the hardware crash, you'll need an equivalently provisioned (and expensive) box to fail over to.

Let me add a sixth point based on the fifth point above. Do you know how much virtualization saves on provisioning, development, and testing? I don't think that is considered in your fifth point. Coming to HA and DR for Tier 1 apps: can you do without two physical boxes? No. Hence virtualization is better for achieving this, as well as for resource utilization.

Savings on provisioning or ongoing management (for production or test systems) are a little tricky to calculate. The reason is that it depends on how advanced your capabilities were prior to virtualization. In some cases, I have spoken with clients who have had incredible increases in administrator productivity. In many cases, those IT shops didn't have much in the way of automation. Being able to provision a system from a virtual disk template was a huge improvement for them. Other companies (probably a minority) have automated configuration management tools like HP Opsware or BMC BladeLogic. For these companies, the increase in efficiency seems to be somewhat less -- since they already had the ability to rapidly build a system from a stored configuration.