A Quick Look at the Differences: RDMA vs TCP/IP
A network protocol is a set of rules that governs data transmission. Remote Direct Memory Access (RDMA) and TCP/IP are both widely used in distributed storage networks. RoCE and InfiniBand are both RDMA technologies, so how do they differ from TCP/IP, and how do they differ from each other? This article compares RDMA and TCP/IP in detail.
RDMA vs TCP/IP: What Are They?
What Is RDMA?
Remote Direct Memory Access is a technology that enables two networked computers to exchange data in main memory without relying on the processor, cache, or operating system of either computer. Like locally-based Direct Memory Access (DMA), RDMA improves throughput and performance because it frees up resources, resulting in faster data transfer rates and lower latency between RDMA-enabled systems. RDMA can benefit both networking and storage applications.
There are three RDMA options: InfiniBand, RDMA over Converged Ethernet (RoCE), and iWARP. InfiniBand (IB) is a network designed specifically for RDMA, with extremely high throughput and extremely low latency. iWARP is a TCP-based RDMA network that relies on TCP for reliable transmission. RoCE is a network protocol that carries RDMA traffic over Ethernet, allowing data to move from one machine to another while reducing the load on the CPU.
What Is TCP/IP?
TCP/IP, or Transmission Control Protocol/Internet Protocol, is the protocol suite used to interconnect network devices over the Internet. It specifies how data should be packetized, addressed, transmitted, routed, and received. TCP/IP places heavy emphasis on accurate data transmission between two computers: TCP splits a message into segments, and any segment that is lost or corrupted in transit is sent again until the receiver acknowledges it.
In addition, the functionality of TCP/IP is divided into four layers: the data link layer, the internet layer, the transport layer, and the application layer. Data passes down through these four layers before it is sent to the other end. There, TCP/IP reassembles the data by passing it through the layers in the opposite order and presents it to the receiver. Because each layer is separate, you can improve the performance or security of a data center by upgrading certain layers rather than the whole system.
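The layered flow described above can be sketched in a few lines of Python. This is a toy model, not how a real network stack is implemented: the JSON "headers" and the `encapsulate` and `decapsulate` helpers are hypothetical names chosen for illustration, and only the wrap-then-unwrap-in-reverse ordering reflects the text.

```python
import json

# Layer names follow the four-layer model in the text; the header format is made up.
LAYERS = ["application", "transport", "internet", "datalink"]


def encapsulate(payload: str) -> str:
    """Wrap the payload in one header per layer, from application down to datalink."""
    frame = payload
    for layer in LAYERS:
        frame = json.dumps({"layer": layer, "data": frame})
    return frame


def decapsulate(frame: str) -> str:
    """Strip the headers in the opposite order, from datalink up to application."""
    for layer in reversed(LAYERS):
        header = json.loads(frame)
        if header["layer"] != layer:
            raise ValueError(f"expected {layer} header, got {header['layer']}")
        frame = header["data"]
    return frame


if __name__ == "__main__":
    wire = encapsulate("hello")
    print(decapsulate(wire))
```

The outermost header on the wire belongs to the data link layer, which is why the receiver has to start there and work its way back up to the application.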
Network Protocol Evolution: From TCP/IP to RDMA
For applications that require high I/O concurrency and low latency, such as high-performance computing and big data analysis, the existing TCP/IP software and hardware architecture cannot meet the requirements. Traditional TCP/IP communication sends messages through the kernel, which incurs high data movement and data copy overheads. RDMA was developed to reduce the server-side data processing latency during network transmission. As shown in the figure below, RDMA can access memory data through a network port without involving the operating system kernel. This allows high-throughput, low-latency network communication, especially in large-scale parallel computer clusters.
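To make the kernel-mediated path concrete, here is a minimal sketch of a TCP round trip over the loopback interface using Python's standard `socket` module. The helper names (`recv_exact`, `echo_once`, `tcp_round_trip`) are hypothetical and exist only for this illustration. Every `sendall` and `recv` call below is a system call that copies data between a user-space buffer and a kernel socket buffer, which is the overhead RDMA is designed to avoid. An RDMA equivalent is not shown because it requires RDMA-capable hardware and a verbs library.

```python
import socket
import threading


def recv_exact(conn: socket.socket, nbytes: int) -> bytes:
    """Read exactly nbytes; each recv() copies from the kernel buffer to user space."""
    buf = bytearray()
    while len(buf) < nbytes:
        chunk = conn.recv(nbytes - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf += chunk
    return bytes(buf)


def echo_once(server: socket.socket, nbytes: int) -> None:
    """Accept one connection and echo nbytes back to the sender."""
    conn, _ = server.accept()
    with conn:
        conn.sendall(recv_exact(conn, nbytes))


def tcp_round_trip(message: bytes) -> bytes:
    """Send message to a local echo server over TCP and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        worker = threading.Thread(target=echo_once, args=(server, len(message)))
        worker.start()
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)  # user buffer -> kernel socket buffer
            reply = recv_exact(client, len(message))
        worker.join()
    return reply


if __name__ == "__main__":
    print(tcp_round_trip(b"hello over TCP"))
```

Even on a single machine, the message crosses the user/kernel boundary four times in this round trip (two sends and two receives), and the CPU performs each copy.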
Finding the Difference Between RDMA and TCP/IP
As mentioned above, network protocols have evolved in many ways. Each of the four networks has its own advantages and disadvantages, and the right choice depends on the application scenario. The following table lists the main differences between RoCE, InfiniBand, iWARP, and TCP/IP.
|Feature|RoCE|InfiniBand|iWARP|TCP/IP|
|---|---|---|---|---|
|High scalability|Good|Excellent|Worse than InfiniBand and RoCE|Poor|
|High performance|Equivalent to InfiniBand|Excellent|Slightly worse than InfiniBand (affected by TCP)|Poor|
|Easy management|Hard|Hard|Harder than RoCE|Easy|
|Network device|Ethernet switch|IB switch|Ethernet switch|Ethernet switch|
High scalability: All three RDMA network protocols are highly scalable and flexible, with InfiniBand being the most scalable. A single InfiniBand subnet can support tens of thousands of nodes, and InfiniBand routers can join subnets to create almost unlimited cluster sizes with a relatively simple architecture. TCP/IP, by contrast, is designed for reliability rather than low latency and high throughput, which limits its scalability in scenarios where high performance is required.
High performance: Because TCP/IP consumes CPU processing resources and adds latency, it performs worst among these network protocols. RoCE improves performance in enterprise data centers while reducing the total cost of ownership, because the existing Ethernet infrastructure does not need to be replaced. InfiniBand uses serial links and buses to send data one bit at a time, enabling faster, more efficient communication. iWARP provides low-latency, high-throughput data transfer similar to RoCE, but its performance is slightly inferior to that of InfiniBand and RoCE.
High stability: TCP/IP is a stable protocol widely used on the Internet, benefiting from extensive testing and validation, with strong fault tolerance and network recovery mechanisms. InfiniBand networks, commonly found in large-scale supercomputers, have proven stability. RoCE is relatively new, lacking the extensive adoption and validation of InfiniBand and TCP/IP, potentially affecting its stability. iWARP may have stability challenges due to additional equipment requirements and compatibility issues.
Easy management: Although RoCE, InfiniBand, and iWARP offer lower latency and higher performance than TCP/IP, TCP/IP is easier to deploy and manage. Network administrators who use TCP/IP to connect devices and networks need very little centralized management.
Cost-efficiency: For enterprise data centers with a limited budget, InfiniBand is probably not a good choice. It relies on expensive IB switch ports to carry a large number of applications, which increases computing, maintenance, and management costs. In contrast, RoCE and TCP/IP run on Ethernet switches and are more cost-effective. As a result, InfiniBand switches are found mainly in HPC data centers, where performance requirements justify the cost.
Network devices: As the table above shows, RoCE, iWARP, and TCP/IP all transmit data through Ethernet switches, while InfiniBand uses IB switches with an independent architecture to carry applications. IB switches typically must be interconnected with devices that support the IB protocol, so the ecosystem is relatively closed and the equipment is difficult to replace.
Which One Is the Best for Data Center?
Nowadays, data center networks require maximum bandwidth and extremely low latency from the underlying interconnect. Under these circumstances, the traditional TCP/IP network protocol cannot keep pace with data center requirements, because it consumes CPU processing resources and adds latency. Enterprises deciding between RoCE and InfiniBand should weigh their own requirements and costs. If they want the highest-performance network connection, InfiniBand is the better option. For those looking for high performance together with easier management and limited cost, RoCE is a good choice.