SuperNIC: the Network Accelerator for AI

Posted on Feb 2, 2024

As AI models grow in complexity and scale, traditional networking solutions often cannot meet their data-intensive requirements. SuperNIC was created to address the networking issues that AI workloads face. This article looks at what a SuperNIC is, why AI workloads drove its development, and how it compares with a DPU.

What Is a SuperNIC?

SuperNIC is an emerging category of network accelerator designed to improve the performance of hyperscale AI workloads in Ethernet-based cloud environments. It provides network connectivity tailored for GPU-to-GPU communication, reaching speeds of up to 400Gb/s using remote direct memory access (RDMA) over Converged Ethernet (RoCE).

SuperNICs are built to run AI workloads efficiently and quickly, which makes them a foundational element of AI computing infrastructure. This strength comes from the following attributes:

  • Advanced congestion control uses real-time telemetry data and network-aware algorithms to manage and prevent congestion in AI networks.

  • High-speed packet reordering ensures that data packets are received and processed in the order in which they were originally transmitted, preserving the sequential integrity of the data flow.

  • A power-efficient, low-profile design lets SuperNIC accommodate AI workloads within constrained power budgets.

  • Programmable compute on the input/output (I/O) path allows network infrastructure in AI cloud data centers to be customized and extended.

  • Full-stack AI optimization spans compute, networking, storage, system software, communication libraries, and application frameworks.
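
To make the packet-reordering idea concrete, here is a minimal software sketch of a reorder buffer. It is an illustration only, not how SuperNIC hardware implements the feature: the `reorder` function, the `(sequence number, payload)` packet format, and the sample data are all assumptions made for this example.

```python
from typing import Dict, Iterable, Iterator, Tuple


def reorder(packets: Iterable[Tuple[int, bytes]], first_seq: int = 0) -> Iterator[bytes]:
    """Yield payloads in sequence order from packets that may arrive out of order.

    Hypothetical helper: each packet is assumed to be a (sequence_number, payload) pair.
    """
    pending: Dict[int, bytes] = {}  # early arrivals, keyed by sequence number
    next_seq = first_seq            # next sequence number the consumer expects
    for seq, payload in packets:
        if seq < next_seq or seq in pending:
            continue                # duplicate of something already seen; drop it
        pending[seq] = payload
        # Release every buffered packet that is now contiguous with the delivered stream.
        while next_seq in pending:
            yield pending.pop(next_seq)
            next_seq += 1


# Packets arrive as 1, 0, 3, 2 but are delivered as 0, 1, 2, 3.
arrived = [(1, b"B"), (0, b"A"), (3, b"D"), (2, b"C")]
in_order = list(reorder(arrived))
print(in_order)  # [b'A', b'B', b'C', b'D']
```

The buffer holds a packet only until the gap before it is filled, so the receiver sees the original transmission order even when the network delivers packets over different paths.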

AI Promotes the Development of SuperNIC

The success of artificial intelligence is intricately tied to GPU-accelerated computing, essential for processing vast datasets, training expansive AI models, and facilitating real-time inference. While this enhanced computing power has introduced novel possibilities, it has simultaneously posed challenges to conventional networks.

Traditional networking, the foundational technology supporting Internet infrastructure, was initially developed to provide broad compatibility and connect loosely coupled applications. Its design did not anticipate the demands of contemporary AI workloads, which are characterized by tightly coupled parallel processing, rapid data transfers, and distinct communication patterns. Traditional network interface cards (NICs) were designed for general-purpose computing, universal data transmission, and interoperability; they lack the features needed for efficient data transfer, low latency, and the deterministic performance that AI tasks require. SuperNICs emerged in response to these demands.

SuperNIC Is More Suitable for AI Computing Environments than DPU

Data processing units (DPUs) deliver many advanced features, including high-throughput, low-latency network connectivity. Since their introduction in 2020, DPUs have gained popularity in cloud computing, primarily because they can offload, accelerate, and isolate data center infrastructure processing. Although DPUs and SuperNICs share some capabilities, SuperNICs are specifically designed to accelerate AI networks. Their main advantages are as follows:

  • The 1:1 ratio of GPUs to SuperNICs in a system can considerably improve AI workload efficiency, resulting in increased productivity and better results for businesses.

  • SuperNICs provide 400Gb/s of network capacity per GPU, outperforming DPUs for distributed AI training and inference communication flows.

  • To accelerate networking for AI cloud computing, SuperNICs use less computing power than DPUs, which require a significant amount of computing resources to offload applications from the host CPU.

  • The lower computing requirements also result in lower power consumption, which is especially useful in systems with multiple SuperNICs.

  • SuperNIC's dedicated AI networking capabilities include adaptive routing, out-of-order packet handling, and optimized congestion control, all of which help accelerate Ethernet-based AI cloud environments.

 
BlueField-3 DPU vs. BlueField-3 SuperNIC

Mission

  BlueField-3 DPU:
  • Cloud infrastructure processor
  • Offload, accelerate, and isolate data center infrastructure
  • Optimized for N-S in GPU-class systems

  BlueField-3 SuperNIC:
  • Accelerated networking for AI computing
  • Best-in-class RoCE networking
  • Optimized for E-W in GPU-class systems

Shared Capabilities

  • VPC network acceleration
  • Network encryption acceleration
  • Programmable network pipeline
  • Precision timing
  • Platform security

Unique Capabilities

  BlueField-3 DPU:
  • Powerful computing
  • Secure, zero-trust management
  • Data storage acceleration
  • Elastic infrastructure provisioning
  • 1-2 DPUs per system

  BlueField-3 SuperNIC:
  • Powerful networking
  • AI networking feature set
  • Full-stack NVIDIA AI optimization
  • Power-efficient, low-profile design
  • Up to 8 SuperNICs per system
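
The figures quoted in this section (400Gb/s per GPU, a 1:1 GPU-to-SuperNIC ratio, and up to 8 SuperNICs per system) allow a quick back-of-the-envelope calculation of aggregate bandwidth. The sketch below is illustrative arithmetic only: the function names and the 10 GB payload are assumptions made for this example, and the result is an ideal line-rate figure that ignores protocol overhead and congestion.

```python
GBPS_PER_SUPERNIC = 400    # per-GPU network capacity quoted in the article, in Gb/s
SUPERNICS_PER_SYSTEM = 8   # upper bound listed in the comparison


def aggregate_gbps(nics: int, gbps_per_nic: float = GBPS_PER_SUPERNIC) -> float:
    """Total network capacity of a system with one SuperNIC per GPU (hypothetical helper)."""
    return nics * gbps_per_nic


def transfer_seconds(size_gigabytes: float, gbps: float) -> float:
    """Ideal time to move size_gigabytes (decimal GB) at gbps, ignoring all overhead."""
    return size_gigabytes * 8 / gbps  # 8 bits per byte


total = aggregate_gbps(SUPERNICS_PER_SYSTEM)
print(total)                                    # 3200 Gb/s, i.e. 3.2 Tb/s per system
print(transfer_seconds(10, GBPS_PER_SUPERNIC))  # 0.2 s for an assumed 10 GB payload on one link
```

Real transfers will be slower than these ideal numbers, but the calculation shows why per-GPU capacity matters: each additional GPU brings its own 400Gb/s link rather than sharing one.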

 

Conclusion

The SuperNIC is a network accelerator for AI data centers that provides reliable, smooth connectivity among GPU servers, creating a cohesive environment for running advanced AI workloads and contributing to the continued advancement of AI computing.
