I’ve recently had the opportunity to talk with a small sample of SLES 11 and RH 6 Linux users, all developing their own applications. All were long-time Linux users, and two of them, one in travel services and one in financial services, had applications that can be described as both large and mission-critical.
The overall message is encouraging for Linux advocates, both the calm rational type as well as those who approach it with near-religious fervor. The latest releases from SUSE and Red Hat, both based on the 2.6.32 Linux kernel, show significant improvements in scalability and modest improvements in iso-configuration performance. One user reported that an application that previously had maxed out at 24 cores with SLES 10 was now nearing production certification with 48 cores under SLES 11. Performance scalability was reported as “not linear, but worth doing the upgrade.”
Overall memory scalability under Linux is still a question mark, since the widely available x86 platforms do not exceed 3 TB of memory, but initial reports from a user familiar with HP’s DL 980 verify that the new Linux Kernel can reliably manage at least 2TB of RAM under heavy load.
File system options continue to expand as well. The older Linux FS standard, ETX4, which can scale to “only” 16 TB, has been joined by additional options such as XFS (contributed by SGI), which has been implemented in several installations with file systems in excess of 100 TB, relieving a limitation that may have been more psychological than practical for most users.
I just spent some time talking to ScaleMP, an interesting niche player that provides a server virtualization solution. What is interesting about ScaleMP is that rather than splitting a single physical server into multiple VMs, they are the only successful offering (to the best of my knowledge) that allows I&O groups to scale up a collection of smaller servers to work as a larger SMP.
Others have tried and failed to deliver this kind of solution, but ScaleMP seems to have actually succeeded, with a claimed 200 customers and expectations of somewhere between 250 and 300 next year.
Their vSMP product comes in two flavors, one that allows a cluster of machines to look like a single system for purposes of management and maintenance while still running as independent cluster nodes, and one that glues the member systems together to appear as a single monolithic SMP.
Does it work? I haven’t been able to verify their claims with actual customers, but they have been selling for about five years, claim over 200 accounts, with a couple of dozen publicly referenced. All in all, probably too elaborate a front to maintain if there was really nothing there. The background of the principals and the technical details they were willing to share convinced me that they have a deep understanding of the low-level memory management, prefectching, and caching that would be needed to make a collection of systems function effectively as a single system image. Their smaller scale benchmarks displayed good scalability in the range of 4 – 8 systems, well short of their theoretical limits.
My quick take is that the software works, and bears investigation if you have an application that:
Either is certified to run with ScaleMP (not many), or one where that you control the code.
You understand the memory reference patterns of the application, and