
Revolutionizing AI Servers: Unraveling the Innovations in Interface Interconnection Chip Technology

Posted on Jan 19, 2024

According to TrendForce data, AI server shipments reached approximately 130,000 units, accounting for about 1% of total global server shipments. With major manufacturers such as Microsoft, Meta, Baidu, and ByteDance successively launching products and services based on generative AI, order volumes have risen significantly. TrendForce forecasts that, driven by sustained demand from applications like ChatGPT, the AI server market will maintain a compound annual growth rate of 12.2% from 2023 to 2027. Against this backdrop, the development of AI servers is particularly eye-catching.

DGX H100: Pioneering AI Advancement

Launched in 2022, the DGX H100 is the latest version of NVIDIA's DGX system and forms the core of the NVIDIA DGX SuperPOD. Featuring 8 H100 GPUs with a combined 640 billion transistors, it delivers six times the AI performance of the previous generation, excelling particularly at the new FP8 precision. The system also provides 900 GB/s of bidirectional NVLink bandwidth per GPU, a significant leap forward in AI capability.

DGX H100 Server Architecture

The DGX H100 server incorporates network cards that function both as NICs and as PCIe Switch expansions, using the PCIe 5.0 standard. The server also includes CX7 hardware in the form of 2 cards, each carrying 4 ConnectX-7 chips and providing 2 800G OSFP optical module ports. For H100 GPU interconnection, NVSwitch chips play a crucial role: each GPU extends 18 NVLinks outward, each with 50 GB/s of bidirectional bandwidth, for a total of 900 GB/s per GPU. This traffic is distributed across 4 onboard NVSwitches, with each NVSwitch corresponding to 4-5 OSFP optical modules. Each OSFP module uses 8 optical channels at a transmission rate of 100 Gbps per channel, so the total rate reaches 800 Gbps. This configuration enables high-speed data transmission.
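The bandwidth figures above follow from simple multiplication; a minimal sketch of the arithmetic (using only the link and channel counts cited in this article; the constant names are illustrative):

```python
# DGX H100 interconnect arithmetic, per the figures cited above.

NVLINK_LINKS_PER_GPU = 18   # NVLink 4.0 links extended by each H100
GBS_PER_LINK = 50           # bidirectional GB/s per NVLink

gpu_bw = NVLINK_LINKS_PER_GPU * GBS_PER_LINK
print(f"Per-GPU bidirectional NVLink bandwidth: {gpu_bw} GB/s")  # 900 GB/s

OSFP_CHANNELS = 8           # optical channels per OSFP module
GBPS_PER_CHANNEL = 100      # Gbps per optical channel

osfp_rate = OSFP_CHANNELS * GBPS_PER_CHANNEL
print(f"Per-OSFP module rate: {osfp_rate} Gbps")  # 800 Gbps
```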

Interconnecting Components such as the CPU and GPU: PCIe Switch and Retimer Chips

Evolution of PCIe Switch Technology: Overcoming Channel Limitations

The PCIe Switch (PCIe hub) is a crucial component that connects PCIe devices using the PCIe communication protocol. It provides expansion and aggregation capabilities, allowing multiple devices to share a PCIe port, which is particularly beneficial where PCIe channel availability is limited. PCIe Switches are widely used in traditional storage systems and are gaining popularity across server platforms, improving data transmission speeds within these systems.

PCIe bus technology has steadily increased PCIe Switch speeds across successive generations. First presented by Intel in 2001 as the third-generation I/O technology under the moniker "3GIO," it was renamed "PCI Express" in 2002 after assessment by the PCI Special Interest Group (PCI-SIG). A notable milestone came with the formal introduction of PCIe 1.0 in 2003, which supported a signaling rate of 2.5 GT/s per lane, equivalent to roughly 250 MB/s of throughput per lane. In 2022, PCI-SIG officially released the PCIe 6.0 specification, raising the per-lane data rate to 64 GT/s.
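The relationship between signaling rate (GT/s) and usable throughput depends on each generation's line coding; a rough sketch, assuming the nominal encoding overheads (8b/10b for PCIe 1.0–2.0, 128b/130b for 3.0–5.0, and treating PCIe 6.0's PAM4/FLIT overhead as negligible for simplicity):

```python
# Approximate per-lane throughput by PCIe generation, derived from
# signaling rate and line-code efficiency (values are nominal).

GENS = {
    # generation: (GT/s per lane, encoding efficiency)
    "1.0": (2.5, 8 / 10),     # 8b/10b
    "2.0": (5.0, 8 / 10),
    "3.0": (8.0, 128 / 130),  # 128b/130b
    "4.0": (16.0, 128 / 130),
    "5.0": (32.0, 128 / 130),
    "6.0": (64.0, 1.0),       # PAM4 + FLIT; overhead simplified here
}

for gen, (rate, eff) in GENS.items():
    lane_gbs = rate * eff / 8   # GB/s per lane, one direction
    x16_gbs = lane_gbs * 16     # x16 link, one direction
    print(f"PCIe {gen}: ~{lane_gbs:.2f} GB/s/lane, ~{x16_gbs:.0f} GB/s x16")
```

This reproduces the ~250 MB/s per lane cited for PCIe 1.0 and shows why each generation roughly doubles throughput.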

Evolution of PCIe Switch Technology

Dominating Trends in the PCIe Retimer Industry

Within AI servers, maintaining signal quality on the links between GPUs and CPUs requires at least one Retimer chip. Several AI servers use multiple Retimer chips; Astera Labs, for example, integrates four Retimer chips in its AI accelerator configuration.

Retimer Chip Configuration

The PCIe Retimer market currently shows great potential, with three leading manufacturers and many prospective entrants. Parade Technologies, Astera Labs, and Montage Technology stand out as the primary players, holding significant positions in this thriving market. Notably, Montage Technology, an early adopter of PCIe, is the only mainland China supplier capable of mass-producing PCIe 4.0 Retimers, and it is making steady progress on PCIe 5.0 Retimers.

PCIe retimer vendors

Moreover, chip manufacturers such as Renesas, TI, and Microchip Technology are actively developing PCIe Retimer products. According to their official websites, Renesas offers two PCIe 3.0 Retimer products, the 89HT0816AP and 89HT0832P; TI provides a 16 Gbps 8-channel PCIe 4.0 Retimer, the DS160PT801; and Microchip Technology introduced the XpressConnect series of Retimer chips in November 2020, designed to support the 32 GT/s rate of PCIe 5.0.

Interconnecting GPUs: NVLink and NVSwitch

Major chip manufacturers around the world attach great importance to promoting high-speed interface technology. Among them, NVIDIA’s NVLink, AMD’s Infinity Fabric and Intel’s CXL have made significant contributions.

NVLink is a high-speed interconnect technology developed by NVIDIA, designed to accelerate data transmission between CPU and GPU and between GPUs, improving system performance. From 2016 to 2022, NVLink advanced through four generations. In 2016, NVIDIA introduced the first generation of NVLink with the release of the Pascal GP100 GPU. NVLink employs NVIDIA's High-Speed Signaling interconnect (NVHS) technology, primarily carrying signals between GPUs and between GPUs and CPUs. GPUs transmit differential electrical signals encoded in NRZ (Non-Return-to-Zero) form. A first-generation NVLink link achieves 40 GB/s of bidirectional bandwidth, and a single chip supports four links, for a total bidirectional bandwidth of 160 GB/s.

NVLink Comparison

NVLink Advancements: Development Through Different Periods

NVLink technology has seen multiple iterations, driving innovation in high-speed interconnection. In 2017, the second generation of NVLink, based on the Volta architecture, was introduced. It achieves a bidirectional bandwidth of 50 GB/s per link and supports six links per chip, totaling 300 GB/s of bidirectional bandwidth. In 2020, the third generation, based on the Ampere architecture, was released; it kept 50 GB/s per link but doubled the link count to 12, reaching a total bidirectional bandwidth of 600 GB/s. The latest development came in 2022 with the fourth generation, based on the Hopper architecture. This iteration shifted to PAM4-modulated electrical signals, maintaining 50 GB/s of bidirectional bandwidth per link and supporting 18 links per chip, for a total bidirectional bandwidth of 900 GB/s.
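The generation-over-generation totals above can be checked with a quick calculation (figures as cited in this article):

```python
# Total bidirectional NVLink bandwidth per chip across generations,
# using the per-link bandwidths and link counts cited above.

generations = [
    # (generation, year, GB/s per link bidirectional, links per chip)
    (1, 2016, 40, 4),
    (2, 2017, 50, 6),
    (3, 2020, 50, 12),
    (4, 2022, 50, 18),
]

for gen, year, per_link, links in generations:
    total = per_link * links
    print(f"NVLink {gen} ({year}): {links} x {per_link} GB/s = {total} GB/s")
```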

NVSwitch Development Enabling High-Performance GPU Interconnects

In 2018, NVIDIA introduced the first generation of NVSwitch, a solution to increase bandwidth, reduce latency, and enable communication among multiple GPUs within a server. Fabricated on TSMC's 12nm FinFET process, the first-generation NVSwitch offered 18 NVLink 2.0 interfaces. With 12 NVSwitches deployed, a server could interconnect 16 V100 GPUs at full speed.

NVSwitch Comparison

At present, NVSwitch has progressed to its third generation, built on TSMC's 4N process. Each NVSwitch chip provides 64 NVLink 4.0 ports, supporting 900 GB/s of communication bandwidth between GPUs. GPUs interconnected via the NVLink Switch can collectively function as a single high-performance accelerator with deep learning capabilities.
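A rough sketch of the third-generation switch's aggregate capacity, assuming each of the 64 NVLink 4.0 ports carries the 50 GB/s bidirectional bandwidth cited earlier for an NVLink 4.0 link:

```python
# Aggregate capacity of a third-generation NVSwitch (assumption:
# 64 ports x 50 GB/s bidirectional per NVLink 4.0 port).

PORTS = 64
GBS_PER_PORT = 50  # bidirectional GB/s per port

total_tbs = PORTS * GBS_PER_PORT / 1000
print(f"Aggregate switch bandwidth: {total_tbs:.1f} TB/s")  # 3.2 TB/s
```

This aggregate figure is what lets a single switch carry the full 900 GB/s per-GPU NVLink traffic for multiple attached GPUs.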

Summary

The development of interface interconnect chip technologies such as PCIe chips, Retimer chips, and NVSwitch has significantly enhanced the interaction capabilities between CPUs and GPUs, as well as among GPUs. The interplay of these technologies highlights the dynamic landscape of artificial intelligence servers, contributing to the advancement of high-performance computing.
