
Advancements in DPU Technology: Empowering Future Innovations

Posted on Jan 19, 2024

With the evolution of cloud computing and virtualization technologies, network cards have gone through four stages of development in functionality and hardware structure: the basic NIC, the SmartNIC, the FPGA-based DPU, and the DPU SoC NIC. In this article, we explain these different types of network cards and processors used in data centers, focusing on their hardware, programmability, development, and applications.

The Evolution and Application of Network Interface Controllers (NICs)

The traditional basic network card, also known as a NIC or network adapter, plays a vital role in computer networks. Its main function is to convert data between the host and the network medium so that it can be transmitted efficiently among network devices. Over time, advancements have expanded its capabilities: it now incorporates basic hardware offloading features such as CRC check, TSO/UFO, LSO/LRO, and VLAN support, among others. It also supports SR-IOV for virtualization and QoS for improved network performance. In terms of network interface bandwidth, it has evolved from 100M and 1000M speeds to 10G, 25G, and even 100G.
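
To make the offload story concrete, the sketch below queries which offloads a NIC currently has enabled by parsing the output of ethtool -k on Linux. This is a minimal illustration, not vendor code; the interface name "eth0" is a placeholder for whatever adapter is present.

```python
import subprocess

def get_offload_features(iface: str) -> dict:
    """Parse `ethtool -k <iface>` output into a {feature: state} mapping."""
    out = subprocess.run(
        ["ethtool", "-k", iface], capture_output=True, text=True, check=True
    ).stdout
    features = {}
    for line in out.splitlines()[1:]:  # first line is just "Features for <iface>:"
        if ":" in line:
            name, _, state = line.partition(":")
            features[name.strip()] = state.strip()
    return features

if __name__ == "__main__":
    feats = get_offload_features("eth0")  # placeholder interface name
    # TSO and LRO are two of the hardware offloads mentioned above.
    for key in ("tcp-segmentation-offload", "large-receive-offload"):
        print(f"{key}: {feats.get(key, 'not reported')}")
```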

Network Interface Controllers (NICs)

In cloud computing virtualization networks, the traditional basic network card offers three primary methods for providing network access to virtual machines.

1. Via the operating system kernel protocol stack, the network card forwards incoming traffic to virtual machines.

2. The DPDK user-mode driver bypasses the kernel protocol stack, directly copying data packets to the virtual machine's memory for improved performance.

3. SR-IOV technology virtualizes the physical network card into multiple virtual functions (VFs) that are assigned directly to virtual machines (a minimal configuration sketch follows this list).
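
On Linux, the VFs in method 3 are typically created through the kernel's standard sysfs interface. The following is a minimal sketch assuming root privileges and an SR-IOV-capable adapter; "eth0" is a placeholder name, and driver-specific steps (such as binding each VF to vfio-pci for VM passthrough) are omitted.

```python
from pathlib import Path

def create_vfs(iface: str, num_vfs: int) -> None:
    """Create SR-IOV virtual functions via the standard sysfs knob."""
    dev = Path(f"/sys/class/net/{iface}/device")
    total = int((dev / "sriov_totalvfs").read_text())  # hardware limit
    if num_vfs > total:
        raise ValueError(f"{iface} supports at most {total} VFs")
    # Many drivers require resetting the count to 0 before changing it.
    (dev / "sriov_numvfs").write_text("0")
    (dev / "sriov_numvfs").write_text(str(num_vfs))

# Usage (as root): create_vfs("eth0", 4)
```

Each VF then appears as its own PCI function that can be passed through to a virtual machine, bypassing the host's software switch entirely.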

As network complexity grows with tunnel protocols such as VxLAN and with virtual switching technologies, packet processing places an ever heavier demand on host CPU resources. SmartNICs address this challenge by offloading network processing tasks from the CPU, enhancing overall network performance.
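
The CPU cost of tunneling is visible in the packet format alone. The sketch below builds the 8-byte VxLAN header defined in RFC 7348; the roughly 50-byte total overhead cited in the comment assumes IPv4 outer headers without VLAN tags.

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VxLAN header defined in RFC 7348."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI is a 24-bit value")
    flags = 0x08 << 24          # "I" bit set: the VNI field is valid
    return struct.pack("!II", flags, vni << 8)  # 3-byte VNI, then 1 reserved byte

# Every tunneled frame also gains outer Ethernet (14 B), IPv4 (20 B),
# and UDP (8 B) headers: roughly 50 bytes that the host CPU must build,
# checksum, and parse for each packet unless the NIC offloads the work.
assert len(vxlan_header(5001)) == 8
```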

The Evolution and Application of SmartNIC

SmartNICs offer more than the network transmission capabilities found in traditional basic network cards. They incorporate data-plane hardware offloading capabilities, such as OVS/vRouter hardware offloading, implemented with an FPGA or with an FPGA integrated with a processor core. These SmartNICs raise the forwarding rate of cloud computing networks and relieve the computing burden on the host CPU.

SmartNICs, however, do not include a general-purpose onboard CPU, so they rely on the host CPU to manage the control plane. The primary focus of SmartNIC offloading is the data plane, encompassing tasks such as fast-path offloading for virtual switches like OVS/vRouter (see the sketch below), RDMA network offloading, NVMe-oF storage offloading, and IPsec/TLS data-plane security offloading.
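
As a concrete example of data-plane offloading, Open vSwitch can push its flow rules down to capable SmartNIC hardware through the kernel's TC flower interface. The sketch below shows the standard switches involved; it assumes root privileges, an offload-capable NIC, and a Debian-style service name ("openvswitch-switch"), all of which vary by platform.

```python
import subprocess

def enable_ovs_hw_offload(nic: str) -> None:
    """Turn on OVS hardware offload and TC offload on the physical port."""
    # Tell Open vSwitch to install flow rules in hardware via TC flower.
    subprocess.run(
        ["ovs-vsctl", "set", "Open_vSwitch", ".", "other_config:hw-offload=true"],
        check=True,
    )
    # Enable TC hardware offload on the NIC itself.
    subprocess.run(["ethtool", "-K", nic, "hw-tc-offload", "on"], check=True)
    # OVS reads the setting at startup, so restart the daemon.
    subprocess.run(["systemctl", "restart", "openvswitch-switch"], check=True)

# Usage (as root, with a placeholder port name): enable_ovs_hw_offload("eth0")
```

With offload enabled, only the first packet of a flow traverses the software slow path; subsequent packets of that flow are matched and forwarded by the SmartNIC itself.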

SmartNIC

However, despite these advancements, as network speeds continue to rise in cloud computing applications, the host CPU still dedicates considerable resources to traffic classification, tracking, and control. Achieving "zero consumption" of the host CPU has become the next research direction for cloud vendors, aiming to minimize the host CPU's involvement in these tasks.

The Evolution and Application of FPGA-Based DPU

The FPGA-based DPU is a smart network card that can offload both data-plane and control-plane functions, and it is partially programmable on both planes. In terms of hardware, it pairs the FPGA with a general-purpose CPU, such as an Intel processor.

In comparison to SmartNICs, FPGA-Based DPUs enhance the hardware architecture by incorporating a general-purpose CPU processing unit, resulting in an FPGA+CPU architecture. This configuration facilitates the acceleration and offloading of various infrastructure components, including network, storage, security, and management. Currently, the predominant form of DPUs is the FPGA+CPU configuration. DPUs based on this architecture offer excellent software and hardware programmability.

FPGA-Based DPU

During the early stages of DPU development, most manufacturers opted for this approach. It offered shorter development times, rapid iteration, and swift customization of functions, allowing DPU manufacturers to introduce products quickly and seize market opportunities. However, as network bandwidth transitioned from 25G to 100G, the FPGA+CPU DPU architecture encountered limitations imposed by chip processes and FPGA structures, which made it difficult to control chip area and power consumption while pursuing higher throughput. These constraints ultimately hindered the continued development of this DPU architecture.

The Evolution and Application of DPU SoC NIC

DPU SoC, based on ASIC (Application-Specific Integrated Circuit), combines the performance of dedicated accelerators with the programmability of general-purpose processors. Unlike FPGA-based architectures, DPU SoCs address challenges in cost, power consumption, and functionality, especially for next-generation 100G servers. They offer advantages in cost, power consumption, high throughput, and flexible programming capabilities. DPU SoCs support application management, virtual machines, containers, and bare metal applications.

DPU SoC NIC

DPU technology is advancing, and general-purpose programmable DPU SoCs are now crucial in cloud vendors' data center construction. They enable efficient management of computing and network resources, support diverse cloud computing scenarios, and optimize data center resource utilization. Chip giants and leading cloud service providers have made significant investments in the research, development, and utilization of DPUs, achieving notable cost-effectiveness through continuous exploration and practical implementation.

DPU in AWS (Amazon Cloud)

AWS (Amazon Web Services), a top cloud computing service provider, relies on the Nitro DPU system as a crucial technical foundation. The Nitro DPU system efficiently offloads network, storage, security, and monitoring functions to dedicated hardware and software. This enables service instances to access nearly all server resources, leading to significant cost reductions and increased annual revenue. The Nitro DPU system comprises multiple components:

1. Nitro card: Dedicated hardware for network, storage, and control to enhance overall system performance.

2. Nitro security chip: Transfers virtualization and security functions to dedicated hardware and software, reducing the attack surface and ensuring a secure cloud platform.

3. Nitro hypervisor: A lightweight hypervisor that efficiently manages memory and CPU allocation, delivering performance comparable to bare metal.

DPU in AWS (Amazon Cloud)

By providing key network, security, server, and monitoring functions, the Nitro DPU system frees up underlying service resources for customer virtual machines. It enables AWS to offer more bare metal instance types and even achieve network performance of up to 100Gbps for specific instances.

NVIDIA DPU

NVIDIA, a prominent semiconductor company renowned for its graphics processing units (GPUs) in AI and high-performance computing (HPC), acquired Mellanox, a network chip and device company, in April 2020 for $6.9 billion. Following the acquisition, NVIDIA introduced the BlueField series of DPUs.

The NVIDIA BlueField-3 DPU, designed specifically for AI and accelerated computing, inherits the advanced features of the BlueField-2 DPU. It provides up to 400G network connectivity and offers offloading, acceleration, and isolation capabilities for software-defined networking, storage, security, and management functions.

Intel IPU

Intel IPU (Infrastructure Processing Unit) is an advanced network device equipped with hardened accelerators and Ethernet connections. It utilizes tightly coupled dedicated programmable cores to accelerate and manage infrastructure functions. IPU enables complete infrastructure offload and acts as a host control point for running infrastructure applications, providing an additional layer of security. Offloading all infrastructure services from the server to the IPU frees up server CPU resources and offers cloud service providers an independent and secure control point.

Intel IPU

Intel's roadmap includes the Oak Springs Canyon and Mount Evans IPU products. Oak Springs Canyon is an FPGA-based IPU featuring an Intel Agilex FPGA and a Xeon-D CPU. Mount Evans, an ASIC-based IPU jointly designed by Intel and Google, incorporates an ASIC for packet processing and 16 ARM Neoverse N1 cores for powerful computing capabilities.

DPU in Alibaba Cloud

Alibaba Cloud is at the forefront of DPU technology exploration. At the Alibaba Cloud Summit in 2022, it unveiled the cloud infrastructure processor CIPU, developed on the Shenlong architecture. The CIPU inherits the functionality and positioning of its predecessor, the MoC card (Micro Server on a Card), which aligns with the DPU definition. The MoC card has independent I/O, storage, and processing units and handles network, storage, and device virtualization tasks. MoC cards have gone through four stages of development:

- The first and second generations of MoC cards addressed the challenge of computing virtualization with zero overhead, with networking and storage virtualization implemented in software.

- The third generation of MoC cards introduced enhanced network forwarding functions, significantly improving network performance.

- The fourth generation of MoC cards achieved complete hardware offloading of networking and storage operations and added support for RDMA.

Alibaba Cloud's CIPU, designed for the Feitian system, is crucial for constructing a new generation of comprehensive software and hardware cloud computing architecture systems.

DPU in Volcano Engine

Volcano Engine is dedicated to advancing self-developed DPU technology, using an approach that integrates software and hardware virtualization to deliver elastic, scalable, high-performance computing services. Its second-generation elastic bare metal server and third-generation cloud server both feature self-developed DPUs, which have undergone extensive testing to verify their capabilities and suitability for a wide range of applications.

The second-generation EBM instance, launched in 2022, combines the stability and security of physical machines with the flexibility of virtual machines, representing a new generation of high-performance cloud servers. The third-generation ECS instance, released in the first half of 2023, integrates Volcano Engine's latest DPU architecture with its proprietary virtual switch and virtualization technology, significantly improving network and storage I/O performance. By combining its self-developed DPU, virtual switch, and virtualization technology, Volcano Engine aims to deliver scalable, efficient high-performance computing solutions that meet the evolving demands of cloud computing.
