Case Study: German University

Sustainably Advancing HPC at German University with Liquid Cooled High-Density Solution

Industry: Research, academics, science and engineering

Customer: an elite German university

Products: K24 with liquid cooling

Introduction

The university client ranks among the world’s top institutions for science, technology, and engineering, and it holds a prominent position as one of Germany’s elite universities.

The university’s IT center houses one of the fastest supercomputers in the country. As part of the National HPC Centers (NHR) alliance, this state-of-the-art facility supports researchers not only at its own university but at universities across Germany. As demand for computing power has escalated in recent years, conventional platforms have struggled to keep up in terms of computational density, energy efficiency, scalability, and flexibility. The pressing challenge is to scale the system while keeping performance growth linear, and at the same time to balance capacity, performance, stability, resilience, and manageability. The introduction of a new architecture is therefore imminent.

Challenges

1. Growing compute power demands of leading HPC applications

The university’s existing infrastructure struggles to keep up with the escalating demand for computing power. The surge in computational requirements, particularly in fields such as materials science, life science, CAE simulation, fluid mechanics, and other academic programs, has outpaced the capabilities of the current HPC system. The customer aims to build a green, energy-efficient HPC platform that not only meets the growing demand for computational power but is also flexible and easy to manage.

2. Energy, cooling, and sustainability concerns of HPC

The main challenge is improving energy efficiency over the existing platform. HPC customers have been pursuing higher computing power and node density, but this pursuit brings higher power consumption, more heat generation, and escalating cooling expenses. It also increases carbon emissions. Together, these factors limit the growth of HPC systems and are a serious concern for the customer.

As energy prices in Europe continue to rise and liquid cooling technology has matured, customers are turning to more energy-efficient liquid cooling solutions to keep energy costs down.

Solution

To meet the customer’s need for high-performance, low-power, scalable, and stable computing nodes for the HPC system, Aivres used the liquid-cooled high-density K24 server. In extensive testing, the K24 exhibited superior energy efficiency, making it the optimal choice for building liquid-cooled HPC systems. The cluster comprises about 700 computing nodes and over 30,000 processor cores and delivers 5,000 TFLOPS of performance, which fully satisfies the rigorous computing demands of materials science, life science, CAE simulation, fluid mechanics, and other related applications.

1. High density and performance

The K24 supports four dual-socket compute nodes within a compact 2U space. Each node is powered by two Intel Xeon Scalable processors, so the space-efficient, rack-mountable form factor supports up to eight 350W CPUs.
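The density figure above follows from simple arithmetic, sketched here in Python. The node count, socket count, and 350W limit come from the text; the function names are illustrative, and the result covers CPU power only, not memory, storage, or other components.

```python
# Figures taken from the case study text.
NODES_PER_CHASSIS = 4     # four compute nodes per chassis
SOCKETS_PER_NODE = 2      # each node is dual-socket
MAX_CPU_POWER_W = 350     # maximum supported power per CPU
CHASSIS_HEIGHT_U = 2      # chassis occupies 2U of rack space


def cpus_per_chassis() -> int:
    """Number of CPU sockets in one 2U chassis."""
    return NODES_PER_CHASSIS * SOCKETS_PER_NODE


def max_cpu_power_per_chassis_w() -> int:
    """Upper bound on CPU power in one chassis (CPUs only)."""
    return cpus_per_chassis() * MAX_CPU_POWER_W


def max_cpu_power_per_rack_unit_w() -> float:
    """Upper bound on CPU power per rack unit (CPUs only)."""
    return max_cpu_power_per_chassis_w() / CHASSIS_HEIGHT_U


if __name__ == "__main__":
    print(cpus_per_chassis())                 # 8
    print(max_cpu_power_per_chassis_w())      # 2800
    print(max_cpu_power_per_rack_unit_w())    # 1400.0
```

With fully populated 350W sockets, the CPUs alone can dissipate up to 2.8 kW per 2U chassis, which illustrates why cooling becomes the limiting factor at this density.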

2. Energy efficiency

The server uses advanced direct warm-water cooling technology, with cold plates covering the processors, memory, and voltage regulators (VR) within each node. Integrating liquid cooling improves cooling efficiency by up to 80%.

3. High reliability

The nodes of the liquid-cooled high-density K24 server incorporate liquid leak detection technology to ensure the reliability of the overall system. In the event of a leak, the nodes can be powered off automatically to avoid accidents.
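The leak response described above can be pictured with a small Python sketch. This is not the K24 firmware or any vendor API: the `Node` class, the `apply_leak_policy` function, and the dictionary of sensor flags are all hypothetical, and the sketch assumes that only the nodes reporting a leak are shut down.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical model of one compute node."""
    name: str
    powered_on: bool = True


def apply_leak_policy(nodes: List[Node], leak_flags: Dict[str, bool]) -> List[str]:
    """Power off every running node whose (hypothetical) leak sensor reports a leak.

    Returns the names of the nodes that were shut down.
    """
    shut_down = []
    for node in nodes:
        if leak_flags.get(node.name, False) and node.powered_on:
            node.powered_on = False  # assumed behavior: automatic power-off
            shut_down.append(node.name)
    return shut_down


if __name__ == "__main__":
    chassis = [Node(f"node-{i}") for i in range(1, 5)]
    # Assume the sensor on node-2 reports a leak.
    print(apply_leak_policy(chassis, {"node-2": True}))  # ['node-2']
    print([n.powered_on for n in chassis])  # [True, False, True, True]
```

Isolating only the affected node keeps the rest of the chassis in service; a real system might instead power off the whole chassis, depending on how the cooling loop is shared.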

Results and Impact

The customer’s HPC system is outfitted with K24 high-density, high-performance servers operating with advanced direct warm-water cooling technology. The liquid-cooled servers reduce data center cooling costs by 30% to 40% compared to traditional air-cooled technology and achieve PUE ratings below 1.1. This allows higher data center density without adding air conditioning to the computer room and supports sustainability. Liquid cooling also lets servers operate at lower temperatures, which helps prolong the lifespan of components and servers. Moreover, water-cooled servers are less mechanically intensive than traditional air-cooled systems with their high-speed fans and air conditioners, which reduces both noise pollution and greenhouse gas emissions.
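For readers unfamiliar with the metric, PUE (power usage effectiveness) is the ratio of total facility power to the power drawn by the IT equipment alone, so a value below 1.1 means cooling and other overhead add less than 10% on top of the IT load. The Python sketch below illustrates the calculation; the 500 kW and 540 kW figures are hypothetical and are not measurements from the customer’s data center.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power divided by IT power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def overhead_fraction(pue_value: float) -> float:
    """Share of extra power (cooling, power delivery, etc.) relative to the IT load."""
    return pue_value - 1.0


if __name__ == "__main__":
    # Hypothetical example: 500 kW of IT load, 540 kW drawn by the whole facility.
    value = pue(total_facility_kw=540.0, it_equipment_kw=500.0)
    print(f"PUE = {value:.2f}")                           # PUE = 1.08
    print(f"Overhead = {overhead_fraction(value):.0%}")   # Overhead = 8%
```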

After going live, the system greatly reduced the variety of equipment and the complexity of managing it, improved operation and maintenance efficiency, and saved the customer 30% in operation and maintenance costs. Its building-block construction, coupled with a linear, on-demand rapid expansion approach, enabled the customer to substantially improve deployment efficiency.

The resulting solution is highly integrated, reducing operating costs while ensuring the same data reliability and availability as the traditional architecture. Comprehensive management of the entire platform addresses both the management complexity and the IT issues that arise from rapid business growth. It also enables parallel computing for a range of multidisciplinary tasks such as materials science, life science, CAE simulation, and fluid mechanics, enhancing the depth and breadth of comprehensive applications.