By Michael McNerney, Supermicro
Advances in computer technology have produced some of the most powerful computers in history – and also some of the hottest. The latest generations of CPUs from Intel and AMD deliver immense gains in processing performance, but at the cost of increased power draw and heat. As CPUs and GPUs keep improving, traditional cooling approaches are struggling to keep up with the heat produced by newer chips. In part one of this article series, we will explore the factors driving liquid cooling as the new way forward.
As some industry leaders have noted, this performance improvement has come at the cost of notably lower power efficiency. The most recent CPUs have a Thermal Design Power (TDP) of 270 to 280 Watts, while the latest GPUs draw almost twice that, at up to 500 Watts. This means that today's servers can easily require over 2,000 Watts for the computing components alone. Once modern storage and memory are factored in, the total power draw is considerable enough to pose a significant cooling problem for most data centers. Ambient air cooling has limits on how much heat it can dissipate and how quickly, and this isn't a problem that can be solved simply by spinning the fans faster.
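To make the arithmetic concrete, here is a minimal sketch of a server power budget. All component counts and wattages are illustrative assumptions chosen to match the article's figures, not measurements of any particular system:

```python
# Hypothetical power budget for a dual-CPU, quad-GPU server.
# Every wattage below is an illustrative assumption.
components = {
    "CPUs (2 x 280 W TDP)": 2 * 280,
    "GPUs (4 x 500 W)": 4 * 500,
    "Memory (16 DIMMs x 12 W)": 16 * 12,
    "Storage (8 NVMe x 10 W)": 8 * 10,
    "Fans, NICs, misc.": 150,
}

total_watts = sum(components.values())
for name, watts in components.items():
    print(f"{name:26s} {watts:5d} W")
print(f"{'Total':26s} {total_watts:5d} W")  # comfortably over 2,000 W
```

Even with conservative assumptions, the CPUs and GPUs alone account for more than 2,500 W before storage, memory, and overhead are counted.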
As explained by the Uptime Institute, "heat gain through a server is measured by the temperature difference between the server intake and server exhaust." If this difference grows too large, air-conditioning systems struggle to supply the volume of chilled air needed, and warmer air is eventually recirculated back to the server inlets, leading to gradual overheating. It's well known that processors running sustained workloads at temperatures around 80 degrees C (176 degrees F) will eventually suffer damage to their electronics.
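That intake-to-exhaust temperature rise follows directly from the steady-state heat balance Q = ρ · V̇ · c_p · ΔT. The sketch below rearranges it to estimate the rise for an air-cooled chassis; the power and airflow figures are illustrative assumptions, while the air properties are standard textbook values:

```python
# Rough intake-to-exhaust temperature rise for an air-cooled server,
# from the steady-state relation Q = rho * V_dot * c_p * dT.
RHO_AIR = 1.2             # kg/m^3, air density near sea level
CP_AIR = 1005.0           # J/(kg*K), specific heat of air
CFM_TO_M3S = 0.000471947  # 1 cubic foot per minute in m^3/s

def exhaust_delta_t(power_watts: float, airflow_cfm: float) -> float:
    """Temperature rise (K) of the cooling air across the chassis."""
    v_dot = airflow_cfm * CFM_TO_M3S  # volumetric flow in m^3/s
    return power_watts / (RHO_AIR * v_dot * CP_AIR)

# An assumed 2,500 W server moving 200 CFM of air:
print(round(exhaust_delta_t(2500, 200), 1))  # ~22 K rise
```

A rise of roughly 22 degrees means air entering at a typical 25 degrees C leaves near 47 degrees C, which illustrates why simply pushing more heat through the same airflow quickly overwhelms the room's cooling loop.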
Since future generations of CPUs and GPUs are only going to intensify this issue, enterprises and manufacturers are exploring new technologies, designs, and methodologies to prevent critical data center applications from failing. Alternative cooling systems are in high demand to effectively keep up with the heat produced by these new CPUs and GPUs.
Liquid Is Better Than Air
Liquid cooling in particular has been picked up by most of the major industry players, including Microsoft and Hewlett Packard Enterprise, which are investing in new liquid-cooled data centers. This is a fairly recent development, largely driven by the growing demand for better cooling options. Investment and development in liquid cooling hardware have expanded the supply chain dramatically, making liquid cooling less expensive and more available than ever before. But this isn't the most important aspect.
The most important factor is simply that liquids remove heat more efficiently than air, thanks to their higher thermal conductivity and heat capacity. In addition, liquid cooling systems are more contained by design: heat is absorbed into the liquid in one area and then expelled from the liquid (and the system) in another. This lets liquid cooling relocate heat outside the system entirely, rather than displacing it inside the server case as air cooling often does.
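The efficiency gap can be quantified with the volumetric heat capacity (ρ · c_p) of each coolant, which sets how much heat a given flow can carry per degree of temperature rise. The property values below are standard room-temperature figures:

```python
# Why liquid wins: volumetric heat capacity (rho * c_p) determines how
# much heat a unit volume of coolant carries per degree of warming.
# Property values are standard textbook figures near room temperature.
air_rho, air_cp = 1.2, 1005.0        # kg/m^3, J/(kg*K)
water_rho, water_cp = 998.0, 4186.0  # kg/m^3, J/(kg*K)

air_vhc = air_rho * air_cp          # ~1.2e3 J/(m^3*K)
water_vhc = water_rho * water_cp    # ~4.2e6 J/(m^3*K)

print(f"Water carries ~{water_vhc / air_vhc:,.0f}x more heat "
      "per unit volume per degree than air")
```

The ratio works out to roughly 3,500x, which is why a modest trickle of water through a cold plate can do the job of a wall of high-speed fans.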
While the capital expenditure is far higher for liquid cooling, given the need to build a custom, watertight cooling loop for each rack with the necessary reservoirs and pumps, the operating costs are far lower. Traditional air-conditioned setups spend an immense amount of power on cooling alone, making them very inefficient when measured by processing performance per watt.
In addition, liquid cooling can actually increase CPU performance compared to an air-cooled solution, because CPUs no longer have to throttle back when they hit thermal limits. This is especially important if you plan to overclock your hardware: liquid cooling allows you to reach clock speeds that are not achievable with fan cooling.
Going With The Flow
Ultimately, compared to traditional air-cooling approaches, newer liquid cooling setups are more power-efficient, sustainable, compact, and quiet, in addition to cooling more components faster. This can also translate into better compute performance. With thermal throttling less of a concern, data center operators can more safely overclock their hardware and reach levels of workload performance they couldn't with fan cooling.
For all of these reasons, the industry as a whole is quickly adopting liquid cooling designs for newer data centers. While air cooling has been satisfactory for the past generations of microprocessors, it’s clear that liquid cooling will be required for enterprises wanting to maximize performance. But even with liquid cooling, there are different approaches that each have their benefits and downsides. In part two of this article series, we will dive into the different liquid cooling options on the market along with some considerations of the pros and cons of each.
About The Author
Michael McNerney is VP of Marketing and Network Security at Supermicro. Michael has over two decades of experience working in the enterprise hardware industry, with a proven track record of leading product strategy and software design. Prior to Supermicro, he also held leadership roles at Sun Microsystems and Hewlett-Packard.