
DCQCN

Updated on Jan 18, 2025 by

What is DCQCN?

DCQCN (Data Center Quantized Congestion Notification) is a widely used congestion control protocol, especially in environments built on Mellanox network cards. It draws on two earlier schemes, Quantized Congestion Notification (QCN) and Data Center TCP (DCTCP), and combines ECN (Explicit Congestion Notification) with PFC (Priority Flow Control) to support end-to-end lossless Ethernet transmission.
DCQCN is designed for RDMA over Converged Ethernet version 2 (RoCEv2) networks. It manages congestion in high-performance data center environments by slowing senders before switch buffers overflow, which minimizes packet loss and keeps traffic flowing smoothly. This proactive approach makes DCQCN suitable for latency-sensitive applications such as artificial intelligence (AI), machine learning (ML), and large-scale data analytics.

Key Components in DCQCN

DCQCN operates through three fundamental components: the CP, the NP, and the RP. Together they provide congestion control and traffic management for efficient and reliable data transmission in the data center network.
Congestion Point (CP): The congestion point is the network element, such as a switch or router, where congestion occurs. The CP monitors its queue depth and sets the ECN Congestion Experienced mark on packets once the queue exceeds configured thresholds. The marked packets continue downstream and tell the receiver that congestion is present on the path.
Notification Point (NP): The notification point is the data receiver. Upon receiving ECN-marked packets, the NP generates a Congestion Notification Packet (CNP) and sends it back to the sender. The NP limits how often it sends CNPs for each flow, and the sender judges how severe the congestion is from how frequently CNPs arrive.
Reaction Point (RP): The reaction point is the data sender. It processes each CNP from the NP and dynamically adjusts its transmission rate: it cuts the rate when CNPs arrive and raises it again when they stop. This reduces the traffic entering congested areas, preventing severe build-up and improving overall efficiency.
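
The CP and NP roles above can be sketched in a few lines of Python. This is a minimal illustration, not a vendor implementation: it assumes the RED-style marking curve commonly described for DCQCN (no marking below a lower threshold, a linear ramp up to a maximum probability, and marking every packet above an upper threshold), and it assumes the receiver's notification is a Congestion Notification Packet (CNP) sent at most once per flow per interval. All names and default values (ecn_mark_probability, k_min_kb, k_max_kb, p_max, cnp_interval_us) are hypothetical and chosen only for the example.

```python
import random


def ecn_mark_probability(queue_kb, k_min_kb=5.0, k_max_kb=200.0, p_max=0.01):
    """Probability that the CP marks a packet, given its queue depth in KB.

    Threshold values are illustrative assumptions, not recommended settings.
    """
    if queue_kb <= k_min_kb:
        return 0.0                      # queue is short: never mark
    if queue_kb > k_max_kb:
        return 1.0                      # queue is very deep: mark everything
    # Linear ramp from 0 at k_min_kb up to p_max at k_max_kb.
    return p_max * (queue_kb - k_min_kb) / (k_max_kb - k_min_kb)


def should_mark(queue_kb, rng=random.random):
    """Randomly decide whether the CP marks this packet."""
    return rng() < ecn_mark_probability(queue_kb)


class NotificationPoint:
    """Receiver side: turns ECN-marked packets into rate-limited CNPs."""

    def __init__(self, cnp_interval_us=50.0):
        self.cnp_interval_us = cnp_interval_us   # assumed pacing interval
        self._last_cnp_us = {}                   # flow id -> time of last CNP

    def on_packet(self, flow_id, ecn_marked, now_us):
        """Return True if a CNP should be sent to this flow's sender."""
        if not ecn_marked:
            return False
        last = self._last_cnp_us.get(flow_id)
        if last is not None and now_us - last < self.cnp_interval_us:
            return False                # a CNP was already sent recently
        self._last_cnp_us[flow_id] = now_us
        return True
```

Pacing CNPs per flow keeps the feedback traffic small: the sender learns how bad congestion is from how regularly CNPs keep arriving, not from a count of marked packets.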

How Does DCQCN Work?

DCQCN is designed to overcome the limitations of relying on PFC alone to implement lossless Ethernet. It combines the explicit congestion notification of ECN with the flow control mechanism of PFC. When the network becomes congested, ECN feedback prompts senders to reduce their transmission rate early, which brings traffic under control and minimizes the number of PFC triggers, so that links are not paused completely. The process includes:
Congestion Detection: When a queue builds up, switches or routers mark packets with ECN instead of dropping them, signaling congestion on the path.
Feedback Generation: The receiver responds to ECN-marked packets by sending a Congestion Notification Packet (CNP) back to the sender, at a limited rate per flow.
Transmission Rate Reduction: Upon receiving a CNP, the sender reduces its transmission rate, cutting more sharply when CNPs have been arriving frequently.
Traffic Control: This adjustment drains switch queues before they reach the PFC threshold, minimizing the probability of Priority Flow Control (PFC) triggers, which can cause traffic to halt completely.
Continuous Monitoring: The cycle repeats continuously. When CNPs stop arriving, the sender gradually raises its rate again, so flows settle at a steady rate that the network can sustain.
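
The sender side of the steps above can be sketched as follows. This is a simplified model that loosely follows the rate-control rules in the published DCQCN description: on each congestion notification (a Congestion Notification Packet, CNP, in DCQCN) the sender remembers its current rate as a target and cuts the rate by a factor that depends on a congestion estimate alpha; when notifications stop, alpha decays and the rate climbs back toward the target. The class name, the parameter values (g, rate_ai_gbps, fast_recovery_rounds), and the single timer-driven increase path are assumptions for illustration. Real NICs also use a byte counter and a faster "hyper increase" stage, which are omitted here.

```python
from dataclasses import dataclass


@dataclass
class ReactionPoint:
    """Simplified sender-side rate controller (illustrative only)."""

    line_rate_gbps: float = 100.0
    g: float = 1.0 / 256.0          # weight for updating alpha (assumed)
    rate_ai_gbps: float = 0.04      # additive-increase step (assumed)
    fast_recovery_rounds: int = 5   # rounds before additive increase (assumed)

    def __post_init__(self):
        self.current_rate = self.line_rate_gbps
        self.target_rate = self.line_rate_gbps
        self.alpha = 1.0            # congestion estimate in [0, 1]
        self.rounds_since_cut = 0

    def on_cnp(self):
        """Congestion notification received: cut the rate, raise alpha."""
        self.target_rate = self.current_rate
        self.current_rate *= 1.0 - self.alpha / 2.0
        self.alpha = (1.0 - self.g) * self.alpha + self.g
        self.rounds_since_cut = 0

    def on_alpha_timer(self):
        """No notification during the last interval: let alpha decay."""
        self.alpha *= 1.0 - self.g

    def on_increase_timer(self):
        """Periodic rate increase while no notifications arrive."""
        self.rounds_since_cut += 1
        if self.rounds_since_cut > self.fast_recovery_rounds:
            # After fast recovery, probe for bandwidth by raising the target.
            self.target_rate = min(self.target_rate + self.rate_ai_gbps,
                                   self.line_rate_gbps)
        # Move the current rate halfway toward the target.
        self.current_rate = (self.target_rate + self.current_rate) / 2.0


rp = ReactionPoint()
rp.on_cnp()             # congestion: rate is cut
rp.on_increase_timer()  # quiet period: rate recovers halfway to the target
print(rp.current_rate)
```

Because alpha decays while no notifications arrive, a single isolated notification after a long quiet period causes only a small cut, while a steady stream of them keeps alpha high and the cuts deep.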

Benefits of DCQCN

Improved Network Stability: By dynamically adjusting transmission rates, DCQCN prevents severe congestion and ensures steady traffic flow, even during peak loads.
Reduced Packet Loss: The use of ECN and timely congestion control minimizes packet loss, enhancing data transmission reliability.
Minimized PFC Triggers: DCQCN reduces the need for PFC, avoiding traffic pauses and ensuring continuous data movement across the network.
Low Latency and High Throughput: By reacting before queues grow deep, DCQCN keeps queuing delay low while links stay well utilized, which helps latency-sensitive applications maintain high performance.
Compatibility: DCQCN is compatible with existing Ethernet and ECN protocols, allowing seamless integration into current network infrastructures.
DCQCN is a key technology for optimizing network traffic in data centers. Its intelligent congestion control mechanisms improve reliability, reduce latency, and ensure efficient resource utilization, making it a cornerstone for modern high-performance computing environments.