A Peek Behind The Wizard's Curtain

The world of hyperscale web properties has been shrouded in secrecy, with major players like Google and Amazon releasing only tantalizing dribbles of information about their infrastructure architecture and facilities, on the presumption that this information represents critical competitive IP. In one bold gesture, Facebook, which has certainly catapulted itself into the ranks of top-tier sites, has reversed that trend by simultaneously disclosing a wealth of information about the design of its new data center in rural Oregon and contributing much of the IP covering racks, servers, and power architecture to an open forum, in the hopes of generating an ecosystem of suppliers to provide future equipment to itself and other growing web companies.

The Data Center

By approaching the design of the data center as an integrated combination of servers for known workloads and the facility itself, Facebook has broken new ground in data center architecture.

At a high level, a traditional enterprise DC has a utility transformer that feeds power to a centralized UPS, which then distributes it through multiple levels of PDUs to the equipment racks. This is a reliable and flexible architecture, and one that has proven its worth in generations of commercial data centers. Unfortunately, in exchange for this flexibility and protection, the chain exacts a penalty of 6% to 7% of the power before it ever reaches the IT equipment.

The first major change Facebook made to the traditional architecture was to eliminate the centralized UPS (as Google did in a previous generation of its own servers) and the PDUs, electing instead to run power directly from the input 480V supply (delivered to the servers as 277V single-phase) to the servers. This required working with power supply vendors to design a custom power supply for its servers. For backup power, Facebook uses a modular 48V DC battery backup unit that supplies up to six racks, feeding a DC-DC converter in each server.
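
To put the distribution savings in perspective, here is a back-of-the-envelope sketch comparing the two power chains. The per-stage efficiencies are illustrative assumptions, not published Facebook figures; only the 6% to 7% aggregate loss for the traditional chain comes from the discussion above.

    # Back-of-the-envelope comparison of facility power chains (Python).
    # All per-stage efficiencies are illustrative assumptions, not Facebook data.

    def chain_efficiency(stages):
        """Multiply per-stage efficiencies to get end-to-end efficiency."""
        eff = 1.0
        for e in stages.values():
            eff *= e
        return eff

    # Traditional enterprise chain: transformer -> central UPS -> PDUs.
    traditional = {"utility transformer": 0.98, "central UPS": 0.97, "PDUs": 0.985}

    # Facebook-style chain: 480V fed almost directly to the server PSUs,
    # so most facility-side conversion stages (and their losses) disappear.
    direct_480v = {"direct 480V feed": 0.995}

    for name, stages in (("traditional", traditional), ("direct 480V", direct_480v)):
        loss = (1.0 - chain_efficiency(stages)) * 100
        print(f"{name}: ~{loss:.1f}% lost before reaching the IT equipment")

Run as written, the traditional chain loses roughly 6.4%, squarely in the 6% to 7% range cited above, while the direct feed loses well under 1%.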

Cooling

Facebook has taken advantage of the unique local environment in Prineville, Oregon in designing the cooling system for this new 150,000 square foot DC (with another similar-sized unit planned for next year). Cooling is purely evaporative: intake air is pressurized via baffled cold aisles, passed through the servers, and exhausted to the outside, with no chilled water and no heat exchangers.
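
Why this works in central Oregon: direct evaporative cooling can bring supply air down toward the local wet-bulb temperature, and Prineville's dry climate keeps the wet-bulb reading low even on hot days. A minimal sketch of the standard evaporative effectiveness formula follows; the temperatures and effectiveness are assumed example values, not Facebook data.

    # Direct evaporative cooling: supply air approaches the wet-bulb limit.
    #   T_supply = T_drybulb - effectiveness * (T_drybulb - T_wetbulb)
    # All numbers below are assumed example conditions, not Facebook data.

    def evap_supply_temp(t_drybulb_c, t_wetbulb_c, effectiveness=0.85):
        """Estimate supply-air temperature from a direct evaporative stage."""
        return t_drybulb_c - effectiveness * (t_drybulb_c - t_wetbulb_c)

    # A hot, dry high-desert afternoon (illustrative): 35 C dry bulb, 18 C wet bulb.
    print(evap_supply_temp(35.0, 18.0))  # ~20.6 C, within server intake range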

The Servers

Facebook has the luxury, not afforded to the typical enterprise data center, of knowing its workloads and being able to custom-configure servers for its exact requirements. Currently the primary workloads (a rough configuration sketch follows the list) consist of:

  •  Web tier – CPU intensive, needs high-powered CPUs (key metric: hits per server per watt). These servers use dual-socket, 6-core Intel Xeon X5650 CPUs with 12 GB of RAM.
  •  Memcache tier – This workload needs less CPU and more memory (key metric: $/GB of memory). These units use 8-core AMD Magny-Cours CPUs with only a single socket populated. Each server can have up to six local disks.
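
A hypothetical sketch of that workload-to-hardware mapping is below; the field values echo the bullets above, but the structure (and the tier names as dictionary keys) is purely illustrative, not a Facebook artifact.

    # Hypothetical mapping of Facebook's two primary workload tiers to hardware,
    # echoing the bullets above; the structure itself is illustrative only.
    SERVER_TIERS = {
        "web": {
            "cpu": "Intel Xeon X5650, dual socket, 6 cores per socket",
            "ram_gb": 12,
            "key_metric": "hits per server per watt",  # CPU-bound tier
        },
        "memcache": {
            "cpu": "AMD Magny-Cours, 8 cores, single socket populated",
            "local_disks_max": 6,
            "key_metric": "$/GB of memory",  # memory-bound tier
        },
    }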

Facebook developed its own motherboard specification and is currently having it built by Quanta. The specification tightly defines components, layout, PCB design, I/O and network controllers, RAM connectors, and other system components such as VRMs and power supplies (94.8% efficiency at full load). This tight specification is coupled with what Facebook describes as a "vanity free" mechanical design – lightweight cold-rolled sheet metal with an emphasis on easy in-rack serviceability.

The Results

The results speak for themselves – a high-volume data center with a claimed PUE of 1.07, certainly one of the most efficient large data centers in the world. While this design is certainly not transportable in its entirety to a commercial data center, the lessons learned from its power engineering, modularity, and draconian standardization of servers, racks, and switches are valuable.
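
For context, PUE (power usage effectiveness) is total facility power divided by the power delivered to the IT equipment, so a PUE of 1.07 means only about 7% overhead for cooling, power distribution, and everything else. The enterprise comparison figure in the snippet below is a commonly cited industry ballpark, not a number from this article.

    # PUE = total facility power / IT equipment power.
    def overhead_pct(pue: float) -> float:
        """Percent of IT power spent on cooling, distribution, and other overhead."""
        return (pue - 1.0) * 100.0

    print(overhead_pct(1.07))  # Facebook's claimed PUE: ~7% overhead
    print(overhead_pct(1.8))   # commonly cited enterprise ballpark: ~80% overhead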

The Open Compute Project

Even more valuable is Facebook’s decision to publish its server, rack, and power specifications as part of the “Open Compute Project,” an initiative to make these specifications available to users and vendors in the hopes of creating an ecosystem around these stripped-down, cost-optimized servers and associated infrastructure. Facebook claims its aim is to encourage the development of new web companies by making it easier for them to build world-class infrastructure. Even if its true motivations are also weighted by the less altruistic goal of further lowering its own costs by creating a community of multiple competing suppliers, Facebook deserves credit for sharing its IP with a wider, and in some cases potentially competitive, world. As an initial token of the program's appeal, it has attracted the attention of at least HP, which announced an HP-built auto-ranging, highly efficient 277-volt power supply that apparently conforms to Facebook's power specifications.

For some additional thoughts on the implications of this collaboration, check out this interesting blog by John Fruehe of AMD.