The Arrival of HPC Networking at Petascale

Posted on Jun 3, 2024

High-Performance Computing (HPC) is pivotal in advancing scientific research and industrial applications, pushing the boundaries of what is computationally possible. At the forefront of these advancements is Petascale computing, defined by the ability to perform at least a quadrillion (10^15) calculations per second. This article looks at the arrival of HPC networking at Petascale, covering its foundation, its current state, and the network equipment that supports it.

The Foundation of Petascale Computing

Defining Petascale

Petascale computing refers to systems capable of performing at least one petaFLOPS, that is, one quadrillion (10^15) floating-point operations per second. This leap in computational power is essential for tackling the most complex and data-intensive problems across various domains.
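To make the threshold concrete, here is a minimal sketch that estimates a cluster's theoretical peak and compares it with one petaFLOPS. The function names and the cluster figures (node count, cores, clock rate, operations per cycle) are hypothetical, chosen only for illustration. Theoretical peak is an upper bound, and sustained performance on real workloads is lower.

```python
PETAFLOPS = 1e15  # one quadrillion floating-point operations per second


def peak_flops(nodes: int, cores_per_node: int, clock_hz: float, flops_per_cycle: int) -> float:
    """Theoretical peak, assuming every core retires flops_per_cycle operations each cycle."""
    return nodes * cores_per_node * clock_hz * flops_per_cycle


def is_petascale(flops: float) -> bool:
    """True when the given rate meets or exceeds one petaFLOPS."""
    return flops >= PETAFLOPS


if __name__ == "__main__":
    # Hypothetical cluster: 10,000 nodes, 64 cores per node, 2 GHz, 16 FLOPs per cycle per core.
    peak = peak_flops(10_000, 64, 2.0e9, 16)
    print(f"{peak / PETAFLOPS:.2f} PFLOPS, petascale: {is_petascale(peak)}")
```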

Fundamentals of HPC Networking

HPC networking connects the many computing nodes within an HPC environment to enable rapid data transfer and communication. It ensures that data moves quickly and efficiently between different parts of a system, which is crucial for achieving Petascale performance. For more background on HPC, see the post "What Is High-Performance Computing (HPC)?".

Historical Background

The journey to Petascale computing has been marked by significant milestones and technological breakthroughs. From the early days of cluster computing to the sophisticated interconnect technologies of today, HPC networking has evolved to meet the escalating demands of computational power and data throughput.

The Current State of HPC Networking

Data and Compute Intensive HPC Workloads

HPC applications, especially those using large language models, require significant computational power. These models operate on large, sparse matrices distributed across numerous processors, so they depend on efficient data exchange and are sensitive to network slowdowns. A subpar network leaves processors in unproductive wait states, which can cost 30% or more of processor performance and waste costly GPU capacity. A state-of-the-art, scalable HPC network is therefore essential.
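The effect of wait states can be sketched with a simple model in which each processing step consists of compute time plus time spent waiting on the network. This is an illustrative simplification with hypothetical function names and timings, not a measurement: it ignores any overlap of communication with computation.

```python
def compute_fraction(compute_time: float, network_wait: float) -> float:
    """Fraction of a step spent computing rather than waiting on the network."""
    total = compute_time + network_wait
    if total <= 0:
        raise ValueError("total step time must be positive")
    return compute_time / total


def lost_fraction(compute_time: float, network_wait: float) -> float:
    """Fraction of processor time lost to network wait states."""
    return 1.0 - compute_fraction(compute_time, network_wait)


if __name__ == "__main__":
    # Hypothetical step: 7 ms of computation followed by 3 ms waiting for data.
    loss = lost_fraction(7.0, 3.0)
    print(f"processor time lost to waiting: {loss:.0%}")
```

In this model, a step that waits 3 ms for every 7 ms of computation loses 30% of its processor time, which matches the scale of loss described above.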

The Role of Ethernet in HPC Networking

To avoid these idle periods at high processor density, HPC requires a dedicated network that can carry huge, coordinated bursts of data at wire rate, at speeds of 400/800G. This kind of performance was previously exclusive to InfiniBand and other specialist HPC networks. Thanks to RDMA-capable Ethernet NICs and RoCE (RDMA over Converged Ethernet), Ethernet and IP can now serve as the transport fabric with minimal overhead. As the HPC network design guidelines below illustrate, Ethernet brings several benefits to HPC networking, including industry-wide compatibility, a large installed base, standard economics, and merchant silicon support.
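As a rough illustration of what wire rate means at these speeds, the sketch below computes the ideal time to move a burst of data over a single link. The function name and the 1 GB burst size are assumptions for the example. The result is a lower bound only, since it ignores protocol headers, congestion, and latency.

```python
def wire_time_seconds(payload_bytes: int, link_gbps: float) -> float:
    """Ideal serialization time for a payload on a link of the given speed (Gb/s)."""
    if link_gbps <= 0:
        raise ValueError("link speed must be positive")
    return payload_bytes * 8 / (link_gbps * 1e9)


if __name__ == "__main__":
    burst = 1_000_000_000  # hypothetical 1 GB burst (10^9 bytes)
    for speed in (400, 800):
        ms = wire_time_seconds(burst, speed) * 1000
        print(f"{speed}G link: {ms:.1f} ms")
```

Under these assumptions, doubling the link speed from 400G to 800G halves the ideal transfer time for the same burst.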

Figure: HPC Network Design Guidelines

Network Equipment for HPC Networking

FS N9550 Series HPC Spine

The data center N9550 series switches are designed as HPC spine switches for demanding HPC workloads. They combine high radix with a lossless, high-bandwidth fabric that can link thousands of GPUs at 400Gbps. Key features include:

  • Broadcom BCM56990 Chip, QSFP-DD 400G Speeds (N9550-32D with 32 x 400Gb; N9550-64D with 64 x 400Gb)

  • PicOS® Delivers Full Backward Compatibility

  • Improve Deployment with AmpCon™'s GUI-based Automation

  • Support MLAG, LLDP, Voice VLAN

  • 3+1 Hot-swappable Power Supplies, 5+1 Smart Fans (Except N9550-32D with 1+1 Hot-swappable Power Supplies)
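The port counts listed above translate directly into aggregate port capacity. The following sketch multiplies ports by port speed for the two models. The dictionary and function names are illustrative, and the result counts one direction of traffic only, so it should not be read as a vendor-quoted switching capacity.

```python
PORT_SPEED_GBPS = 400  # per-port speed from the feature list above

# Port counts for the two models named in the feature list.
N9550_PORTS = {"N9550-32D": 32, "N9550-64D": 64}


def aggregate_tbps(ports: int, port_gbps: int = PORT_SPEED_GBPS) -> float:
    """Sum of port speeds in Tb/s, counting one direction of traffic."""
    return ports * port_gbps / 1000


if __name__ == "__main__":
    for model, ports in N9550_PORTS.items():
        print(f"{model}: {ports} x {PORT_SPEED_GBPS}G = {aggregate_tbps(ports)} Tb/s")
```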

FS AmpCon™ Management Platform

AmpCon™ (short for "amplified control") is a management platform for FS PicOS® that automates Zero Touch Provisioning (ZTP), deployment, setup, and lifecycle management. It is installed as a software appliance with a web user interface (UI) and can run in a virtual machine (VM) in the cloud or in a data center. AmpCon™ scales to thousands of PicOS® software switches and supports remote deployment, both of which are critical to handling enormous workloads.

Conclusion

The arrival of HPC networking at petascale marks a transformative moment in computational science and technology. By providing high speed, scalability, and reliability, HPC networks are enabling breakthroughs across a wide range of industries. Looking ahead, continued innovation will be key to unlocking the full potential of HPC networking. Businesses and research institutions should stay informed and invest in these technologies to maintain a competitive edge in a rapidly evolving digital landscape. Welcome to the new wave of petascale HPC networking!
