
InfiniBand Insights: Powering High-Performance Computing in the Digital Age

Posted on Dec 22, 2023

Since the onset of the 21st century, propelled by the surging popularity of cloud computing and big data, the rapid evolution of data centers has become increasingly evident. In this dynamic landscape, InfiniBand has emerged as a pivotal technology, playing a crucial role in the heart of data centers. Notably, as of 2023, the ascendancy of large AI models, exemplified by innovations like ChatGPT, has propelled InfiniBand into an even more prominent position. This heightened attention is attributed to the fact that the network underpinning GPT models is constructed on the foundation of InfiniBand.

But what precisely is InfiniBand technology, and what attributes contribute to its widespread adoption? Moreover, why is there an ongoing discourse surrounding the "InfiniBand vs. Ethernet" debate? This comprehensive article aims to address each of these questions, offering valuable insights into the intricacies of InfiniBand technology and its significance in the ever-evolving landscape of high-performance computing.

infiniband-vs-ethernet

The Evolutionary Journey of InfiniBand Technology

InfiniBand (IB), a robust communication protocol, has roots intertwined with the evolution of computer architecture. Modern digital computers rest on the von Neumann architecture, a structure built around a CPU (containing the arithmetic logic unit and control unit), memory such as RAM alongside storage like hard disks, and I/O devices.

Venturing into the early 1990s, the computing landscape witnessed a surge in the demand for supporting an expanding array of external devices. Responding to this need, Intel emerged as a trailblazer by introducing the Peripheral Component Interconnect (PCI) bus design into the standard PC architecture. This innovative step marked a pivotal moment in the trajectory of computer evolution, laying the groundwork for the eventual emergence of the potent communication protocol we now recognize as InfiniBand.

Peripheral Component Interconnect

Subsequently, the Internet underwent a rapid development phase, accompanied by the flourishing growth of online businesses and user bases, which in turn imposed substantial challenges on the capacity of IT systems.

During this period, despite the remarkable advancements in components like CPUs, memory, and hard drives, propelled by the momentum of Moore's Law, the PCI bus faced a lag in upgrades. This slower pace of development significantly constrained I/O performance, emerging as a bottleneck for the entire system.

In response to this bottleneck, a collaborative effort led by industry giants such as Intel, Microsoft, and Sun gave rise to the "Next Generation I/O (NGIO)" technology standard. Simultaneously, IBM, Compaq, and Hewlett-Packard took charge of developing "Future I/O (FIO)." Notably, the latter three companies had also jointly created the PCI-X standard in 1998.

In a pivotal turn of events, the FIO Developers Forum and the NGIO Forum merged, laying the foundation for the establishment of the InfiniBand Trade Association (IBTA). This collaborative effort paved the way for the official release of version 1.0 of the InfiniBand Architecture Specification in 2000. In essence, InfiniBand was conceived to supplant the PCI bus. By introducing the RDMA protocol, InfiniBand offered lower latency, higher bandwidth, and enhanced reliability, thereby empowering more potent I/O performance.

In May 1999, a group of former Intel and Galileo Technology employees came together to establish Mellanox, a chip company based in Israel. Following its founding, Mellanox aligned itself with NGIO, and when NGIO and FIO merged, Mellanox seamlessly transitioned into the InfiniBand ecosystem. The year 2001 marked a milestone as Mellanox unveiled its inaugural InfiniBand product.

However, the landscape of the InfiniBand community underwent a notable transformation in 2002. Intel, a key player, abruptly redirected its attention toward developing PCI Express (PCIe), officially launched in 2004. Simultaneously, another major contributor, Microsoft, withdrew from active involvement in InfiniBand development. While some entities like Sun and Hitachi persevered, the departure of industry giants cast shadows over the trajectory of InfiniBand's development.

A turning point came in 2003 when InfiniBand found a new application domain – computer cluster interconnectivity. In the same year, Virginia Tech constructed a cluster based on InfiniBand technology, securing the third position in the TOP500 list, a global ranking of supercomputers.

In 2004, another noteworthy InfiniBand non-profit organization emerged – the OpenFabrics Alliance (OFA). The OFA and the IBTA maintain a collaborative relationship: the IBTA focuses on developing, maintaining, and enhancing the InfiniBand protocol standard, while the OFA develops and maintains the open-source InfiniBand software stack and higher-level application APIs.

OFA

In 2005, InfiniBand found another application scenario—connecting storage devices. This period also witnessed the popularity of InfiniBand and Fibre Channel (FC) as Storage Area Network (SAN) technologies, bringing increased awareness to InfiniBand technology.

As InfiniBand gained traction, its user base grew, and by 2009, 181 systems in the TOP500 list were utilizing InfiniBand technology, although Gigabit Ethernet remained mainstream with 259 systems.

Post-2012, driven by the escalating demands of high-performance computing (HPC), InfiniBand technology continued to progress, steadily increasing its market share. In 2015, it surpassed 50% share in the TOP500 list for the first time, marking a significant milestone. InfiniBand became the preferred internal interconnect technology for supercomputers.

In response to InfiniBand's progress, Ethernet underwent developments. In April 2010, IBTA introduced RoCE (RDMA over Converged Ethernet), "porting" RDMA technology from InfiniBand to Ethernet. By 2014, a more mature version, RoCE v2, was proposed. With RoCE v2, Ethernet significantly closed the technological performance gap with InfiniBand, leveraging its cost and compatibility advantages.

The chart below illustrates the technology shares in the TOP500 list from 2007 to 2021, showcasing the dynamic landscape of high-performance computing technologies.

2007-2021-networking-top500

As evident in the graph, the ascent of 25G and higher-speed Ethernet (illustrated by the dark green line) commenced in 2015, swiftly gaining industry favor and momentarily overshadowing InfiniBand. More recently, the rise of large AI language models, exemplified by GPT-3, has triggered an exponential surge in societal demand for high-performance and intelligent computing.

To meet the staggering computational demands imposed by large AI language models like GPT-3, the indispensable backbone is high-performance computing clusters. When it comes to performance, InfiniBand stands out as the preferred choice for such clusters.

In the realm of high-performance networking, the battleground is primarily between InfiniBand and high-speed Ethernet, with both sides demonstrating comparable prowess. Manufacturers endowed with ample resources often opt for InfiniBand, while those prioritizing cost-effectiveness tend to gravitate towards high-speed Ethernet.

Other technologies, such as IBM's BlueGene, Cray's proprietary interconnects, and Intel's Omni-Path, linger as alternatives in the second tier of options. The intricate interplay of these technologies reflects the dynamic landscape of high-performance computing.

The Technical Principles of InfiniBand

After tracing the development history of InfiniBand, a deeper exploration into its working principles unveils why it surpasses traditional Ethernet in terms of performance and latency. How does InfiniBand achieve such low latency and high performance?

Pioneering Advancement: RDMA

As highlighted earlier, a standout feature of InfiniBand is its early integration of the Remote Direct Memory Access (RDMA) protocol.

In the conventional TCP/IP framework, incoming data travels from the network card into kernel buffers in main memory and then undergoes an additional copy into the application's memory space. Outgoing data follows the same path in reverse: it is copied from the application's memory into main memory before being transmitted through the network card onto the network.

This intricate I/O operation necessitates intermediary copying in the main memory, elongating the data transfer path, imposing a load on the CPU, and introducing transmission latency.

ethernet-vs-rdma

RDMA serves as a technology that effectively "eliminates intermediaries." Operating with a kernel bypass mechanism, RDMA facilitates direct data reads and writes between applications and the network card, minimizing data transmission latency within servers to nearly 1 microsecond.

In addition, RDMA's zero-copy mechanism allows the remote end to read from and write to an application's registered memory directly, without staging the data through intermediate buffers. This substantially reduces the CPU's involvement in data movement, significantly enhancing overall CPU efficiency.
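To make kernel bypass and zero-copy concrete, here is a minimal, illustrative sketch using the libibverbs C API, the standard userspace verbs interface. It only registers a buffer and describes a one-sided RDMA write; queue pair creation and connection setup are omitted, and remote_addr / remote_rkey are placeholder values that a real application would exchange with its peer out of band.

```c
#include <infiniband/verbs.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Open the first RDMA-capable device found on the host. */
    int num = 0;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs || num == 0) {
        fprintf(stderr, "no RDMA devices found\n");
        return 1;
    }
    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);      /* protection domain */

    /* Register an application buffer so the HCA can DMA directly to and
     * from it -- this is the zero-copy, kernel-bypass part of RDMA. */
    size_t len = 4096;
    void *buf = malloc(len);
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_WRITE);

    /* Describe a one-sided RDMA write: data moves from our registered
     * buffer straight into the peer's memory without involving the remote
     * CPU. remote_addr and remote_rkey are placeholders obtained from the
     * peer during connection setup in a real application. */
    uint64_t remote_addr = 0;                   /* placeholder */
    uint32_t remote_rkey = 0;                   /* placeholder */

    struct ibv_sge sge = {
        .addr   = (uintptr_t)buf,
        .length = (uint32_t)len,
        .lkey   = mr->lkey,
    };
    struct ibv_send_wr wr = {
        .opcode              = IBV_WR_RDMA_WRITE,
        .sg_list             = &sge,
        .num_sge             = 1,
        .send_flags          = IBV_SEND_SIGNALED,
        .wr.rdma.remote_addr = remote_addr,
        .wr.rdma.rkey        = remote_rkey,
    };
    /* With a connected queue pair "qp" (creation and the INIT->RTR->RTS
     * transitions are omitted here), the write would be submitted with:
     *     struct ibv_send_wr *bad_wr = NULL;
     *     ibv_post_send(qp, &wr, &bad_wr);
     */
    (void)wr;

    ibv_dereg_mr(mr);
    free(buf);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```

Because the HCA reads the registered buffer directly, no kernel socket-buffer copy sits on the data path, which is precisely where the latency and CPU savings come from.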

As emphasized earlier, the widespread adoption of InfiniBand can be largely credited to the transformative impact of RDMA on data transfer efficiency.

InfiniBand Network Architecture

The network topology structure of InfiniBand is visually represented in the diagram below:

infiniband-network-topology

InfiniBand is built on a channel-based architecture, featuring four primary components:

  • HCA (Host Channel Adapter)

  • TCA (Target Channel Adapter)

  • InfiniBand links (connecting channels, ranging from cables to fibers, and even on-board links)

  • InfiniBand switches and routers (integral for networking)

Channel adapters, specifically HCA and TCA, play a crucial role in establishing InfiniBand channels, ensuring both security and adherence to specified Quality of Service (QoS) levels for all transmissions.

Systems leveraging InfiniBand can be structured into multiple subnets, with each subnet capable of supporting over 60,000 nodes. Within a subnet, InfiniBand switches handle layer 2 processing, while routers or bridges facilitate connectivity between subnets.

infiniband-networking-example

The Layer 2 processing in InfiniBand is streamlined. Each InfiniBand subnet has a subnet manager that assigns a 16-bit Local Identifier (LID) to every node. InfiniBand switches, comprising multiple ports, forward data packets from one port to another based on the LID carried in the Layer 2 Local Route Header (LRH). The switches only forward packets; they do not generate or consume data packets themselves.
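As a conceptual illustration (not vendor firmware), the LID-based forwarding described above boils down to a single table lookup per packet: the subnet manager programs a forwarding table mapping each destination LID to an output port, and the switch simply indexes into it.

```c
#include <stdint.h>

/* Conceptual sketch of LID-based forwarding: the subnet manager (SM)
 * programs a forwarding table that maps each 16-bit destination LID to an
 * egress port; the switch performs one lookup per forwarded packet. */
static uint8_t lft[1u << 16];           /* dest LID -> egress port, filled by the SM */

static inline uint8_t egress_port(uint16_t dest_lid)
{
    return lft[dest_lid];               /* a single lookup, no route computation */
}
```

The absence of any per-packet route computation is one reason the Layer 2 forwarding path stays so fast.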

Leveraging this uncomplicated processing together with Cut-Through switching, InfiniBand achieves a significant reduction in forwarding latency, reaching levels below 100 ns, notably faster than what traditional Ethernet switches can offer.

Within an InfiniBand network, data is transmitted serially in packets, each with a maximum size of 4 KB.
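As a small, hedged example, an application can check which MTU a port supports and has negotiated through libibverbs; ctx is assumed to be an opened device context as in the earlier sketch.

```c
#include <infiniband/verbs.h>
#include <stdio.h>

/* Query a port's supported and negotiated MTU. libibverbs encodes the MTU
 * as an enum where 128 << value gives the size in bytes (e.g. IBV_MTU_4096). */
void print_port_mtu(struct ibv_context *ctx, uint8_t port_num)
{
    struct ibv_port_attr attr;

    if (ibv_query_port(ctx, port_num, &attr) != 0)
        return;
    printf("port %u: max MTU %d bytes, active MTU %d bytes\n",
           port_num, 128 << attr.max_mtu, 128 << attr.active_mtu);
}
```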

InfiniBand Protocol Stack

The InfiniBand protocol embraces a structured layering approach, with each layer functioning independently and delivering services to the layer positioned above it. Please refer to the diagram below for a visual representation:

infiniband-protocol-stack

The InfiniBand protocol stack begins with the physical layer, which defines how bits are encoded into symbols on the wire, how packets are framed and delimited, and how the gaps between packets are filled. It provides precise signaling specifications, enabling the construction of efficient packets.

Moving up the stack, the link layer defines the format of data packets and outlines protocols for essential packet operations like flow control, routing selection, encoding, and decoding.

The network layer takes charge of routing selection by appending a 40-byte Global Route Header (GRH) to the data packet, facilitating effective data forwarding.

During forwarding, each hop recalculates the variant CRC, while the invariant CRC covers the fields that do not change in transit, ensuring the integrity of end-to-end data transmission.

infiniband-packet-encapsulation-format
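For orientation, below is an illustrative C sketch of the two routing headers discussed above, with sizes taken from the IBTA specification. The field packing is simplified (sub-byte fields are folded together, with comments noting the real bit widths), and real parsers operate on raw big-endian bytes rather than structs like these.

```c
#include <stdint.h>

struct ib_lrh {                 /* Local Route Header: 8 bytes, used by switches */
    uint8_t  vl_lver;           /* 4-bit virtual lane | 4-bit link version       */
    uint8_t  sl_lnh;            /* 4-bit service level | 2 rsvd | 2-bit next hdr */
    uint16_t dlid;              /* destination LID -- the switch forwarding key  */
    uint16_t pktlen;            /* 5 rsvd bits | 11-bit packet length            */
    uint16_t slid;              /* source LID                                    */
};

struct ib_grh {                 /* Global Route Header: 40 bytes, used by routers */
    uint32_t ver_tclass_flow;   /* 4-bit IP version | 8-bit traffic class | 20-bit flow label */
    uint16_t paylen;            /* payload length */
    uint8_t  nxthdr;            /* next header    */
    uint8_t  hoplmt;            /* hop limit      */
    uint8_t  sgid[16];          /* 128-bit source GID      */
    uint8_t  dgid[16];          /* 128-bit destination GID */
};
```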

Moving up the protocol stack, the transport layer takes charge of delivering the data packet to a designated Queue Pair (QP) and provides instructions to the QP on how to process the packet effectively.

InfiniBand's well-defined layers 1-4 collectively constitute a comprehensive network protocol, and its end-to-end flow control forms the bedrock of the network's packet transmission and reception, ensuring lossless networks.

Queue Pairs (QPs) play a pivotal role in RDMA technology. Comprising two queues – the Send Queue (SQ) and the Receive Queue (RQ) – QPs serve as the fundamental communication units. When users invoke API calls to send or receive data, they are essentially placing the data into the QP. The requests within the QP are then processed sequentially using a polling mechanism.

infiniband-qp
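A minimal sketch of what this looks like through the verbs API follows: creating a QP (the SQ/RQ pair) bound to a completion queue, and polling that completion queue for finished work requests. Error handling and the QP state transitions required before traffic can flow are omitted, and ctx / pd are assumed to come from a setup like the earlier RDMA example.

```c
#include <infiniband/verbs.h>

/* Create a reliable-connected QP whose Send Queue and Receive Queue share
 * one completion queue. */
struct ibv_qp *make_qp(struct ibv_context *ctx, struct ibv_pd *pd)
{
    struct ibv_cq *cq = ibv_create_cq(ctx, 256, NULL, NULL, 0);
    if (!cq)
        return NULL;

    struct ibv_qp_init_attr attr = {
        .send_cq = cq,
        .recv_cq = cq,
        .qp_type = IBV_QPT_RC,            /* reliable connected service */
        .cap = {
            .max_send_wr  = 128,          /* depth of the Send Queue    */
            .max_recv_wr  = 128,          /* depth of the Receive Queue */
            .max_send_sge = 1,
            .max_recv_sge = 1,
        },
    };
    return ibv_create_qp(pd, &attr);      /* the QP = SQ + RQ pair */
}

/* Work requests posted to the QP complete asynchronously; the application
 * learns about them by polling the completion queue. */
int drain_completions(struct ibv_cq *cq)
{
    struct ibv_wc wc;
    int n = 0;

    while (ibv_poll_cq(cq, 1, &wc) > 0)   /* non-blocking poll */
        n += (wc.status == IBV_WC_SUCCESS);
    return n;
}
```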

InfiniBand Link Rate

InfiniBand links can be established using either copper cables or fiber optic cables, with dedicated InfiniBand cables chosen based on specific connection requirements.

At the physical layer, InfiniBand defines multiple link widths, such as 1X, 4X, and 12X. Each 1X lane is a four-wire serial differential connection (two wires in each direction), and the wider links simply aggregate 4 or 12 such lanes.

For instance, the early SDR (Single Data Rate) specification had a 2.5 Gbps bandwidth for a 1X link, 10 Gbps for a 4X link, and 30 Gbps for a 12X link. However, due to the utilization of 8b/10b encoding, the actual data bandwidth for a 1X link was 2.0 Gbps. Considering the bidirectional nature of the link, the total bandwidth relative to the bus was 4 Gbps.
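The arithmetic above generalizes to every later generation: effective data rate = per-lane signaling rate × lane count × encoding efficiency. The short sketch below reproduces the SDR numbers from the text; other generations' line rates and encodings can be plugged in the same way.

```c
#include <stdio.h>

/* Effective data rate = per-lane signaling rate x lanes x encoding efficiency. */
static double effective_gbps(double lane_gbps, int lanes, double coding)
{
    return lane_gbps * lanes * coding;
}

int main(void)
{
    /* SDR with 8b/10b coding: 2.5 Gbps per lane carries 2.0 Gbps of data. */
    printf("SDR 1X:  %.1f Gbps\n", effective_gbps(2.5, 1, 8.0 / 10.0));
    printf("SDR 4X:  %.1f Gbps\n", effective_gbps(2.5, 4, 8.0 / 10.0));
    printf("SDR 12X: %.1f Gbps\n", effective_gbps(2.5, 12, 8.0 / 10.0));
    return 0;
}
```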

Over time, InfiniBand's network bandwidth has seen continuous upgrades, progressing from SDR, DDR, QDR, FDR, EDR, and HDR to NDR, XDR, and GDR, as depicted in the diagram below:

infiniband-roadmap

infiniband-specific-rate-encoding-method

InfiniBand's Commercial Offerings

FS.com offers a diverse InfiniBand product portfolio covering speeds from 40G to 800G, including NDR, HDR, EDR, and FDR, to meet customers' varying speed requirements. Our product line includes InfiniBand Quantum/Quantum-2 switches, InfiniBand modules, InfiniBand adapters, as well as AOC/DAC cables supporting distances from 0.5 meters to 100 meters. These products not only support high-speed interconnects and extremely low latency but also provide scalable solutions, accelerating research, innovation, and product development for AI developers and scientific researchers.

FS-infiniband-product

In addition, we have 7 local warehouses around the world to ensure fast delivery. FS conducts rigorous performance, reliability, scenario, and compatibility testing to ensure product excellence. FS.com has a professional technical team and rich experience in deploying solutions across application scenarios. We actively provide solutions for high-performance computing, data centers, education, research, biomedicine, finance, energy, autonomous driving, the Internet, manufacturing, and telecommunications, and we offer professional services to customers in other fields as well.

Conclusion

In summary, the trajectory of InfiniBand appears promising, propelled by the surging demands of high-performance computing and artificial intelligence.

Widely embraced in expansive computing clusters and supercomputers, InfiniBand stands out for its high performance and low-latency interconnect technology. It seamlessly addresses the requirements of extensive data transfers and concurrent computing by offering elevated bandwidth and reduced latency. Its adaptability to diverse topologies and intricate communication patterns positions InfiniBand uniquely, making it a formidable choice in the realms of high-performance computing and AI.

Nonetheless, Ethernet, a pervasive networking technology, remains on a trajectory of evolution. Marked by escalating speeds and technological breakthroughs, Ethernet has solidified its standing in data centers and bridged certain gaps with InfiniBand. Boasting a comprehensive ecosystem and mature standardization support, Ethernet emerges as an accessible and manageable solution in typical data center environments.

As technology advances and demands shift, both InfiniBand and Ethernet are poised to leverage their respective strengths in varied application scenarios. The ultimate winner between InfiniBand and Ethernet remains uncertain, and only time will unravel the unfolding narrative. Undoubtedly, they will persist in steering the course of information technology development, addressing escalating bandwidth needs, and furnishing adept capabilities for efficient data transmission and processing.
