400G Data Center Deployment Challenges and Solutions

Updated on Jun 16, 2022

As technology advances, applications such as video streaming, AI, and data analytics are demanding ever higher data speeds and far more bandwidth. 400G technology, with its next-gen optical transceivers, enables new services by moving and processing more data in the same amount of time.

Large data centers and enterprises struggling with data traffic issues embrace 400G solutions to improve operational workflows and ensure better economics. Below is a quick overview of the rise of 400G, the challenges of deploying this technology, and the possible solutions.

The Rise of 400G Data Centers

The rapid transition to 400G in several data centers is changing how networks are designed and built. Some of the key drivers of this next-gen technology are cloud computing, video streaming, AI, and 5G, which have driven the demand for high-speed, high-bandwidth, and highly scalable solutions. The large amount of data generated by smart devices, the Internet of Things, social media, and other As-a-Service models is also accelerating this 400G transformation.

The major benefits of upgrading to a 400G data center are the increased data capacity and network capabilities required for high-end deployments. This technology also delivers better power efficiency, higher speed, and cost savings. A single 400G port is considerably cheaper than four individual 100G ports. Similarly, the increased data speeds allow for convenient scale-up and scale-out by providing high-density, reliable, and low-cost-per-bit deployments.

How 400G Works

Before we look at the deployment challenges and solutions, let’s first understand how 400G works. First, the actual line rate, or data transmission speed, of a 400G Ethernet link is 425 Gbps. The extra 25 Gbps carries forward error correction (FEC) overhead, which lets the receiver detect and correct transmission errors.
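The 425 Gbps figure can be reproduced with a little arithmetic. The sketch below assumes the encoding commonly used for 400GbE, which the article does not spell out: 256b/257b transcoding followed by Reed-Solomon RS(544,514) FEC. The function names are illustrative only.

```python
from fractions import Fraction

# Assumed encoding parameters (not stated in the article itself):
MAC_RATE_GBPS = Fraction(400)    # nominal 400GbE data rate
TRANSCODE = Fraction(257, 256)   # 256b/257b transcoding expands 256 bits to 257
RS_FEC = Fraction(544, 514)      # RS(544,514): 514 data symbols -> 544 coded symbols


def line_rate_gbps(mac_rate=MAC_RATE_GBPS):
    """Rate on the wire after transcoding and FEC parity are added."""
    return mac_rate * TRANSCODE * RS_FEC


def per_lane_gbps(lanes, mac_rate=MAC_RATE_GBPS):
    """Line rate carried by each lane when the link is split evenly."""
    return line_rate_gbps(mac_rate) / lanes


if __name__ == "__main__":
    print(float(line_rate_gbps()))                   # 425.0
    print(float(line_rate_gbps() - MAC_RATE_GBPS))   # 25.0 Gbps of overhead
    print(float(per_lane_gbps(8)))                   # 53.125 per "50G" lane
    print(float(per_lane_gbps(4)))                   # 106.25 per "100G" lane
```

Exact fractions are used so that the result comes out as exactly 425 rather than a rounded float.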

400G adopts 4-level pulse amplitude modulation (PAM4), which encodes two bits per symbol instead of the single bit carried by Non-Return to Zero (NRZ) signaling. This doubles the data rate at a given baud rate and, combined with a higher baud rate, raises per-lane speeds up to four-fold, from 25G NRZ to 100G PAM4. With PAM4, operators can implement four lanes of 100G or eight lanes of 50G for different form factors (e.g., OSFP and QSFP-DD). This optical transceiver architecture supports transmission of up to 400 Gbit/s over either parallel fibers or multiple wavelengths.
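To make the difference between NRZ and PAM4 concrete, here is a minimal sketch that maps the same bit stream to symbols under both schemes. It assumes the Gray-coded level mapping commonly used for PAM4 (00, 01, 11, 10 to levels 0 through 3) and per-lane line rates derived from the 425 Gbps figure above. The helper names are hypothetical.

```python
# Assumed Gray-coded mapping of bit pairs to the four PAM4 amplitude levels.
GRAY_PAM4 = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}


def nrz_symbols(bits):
    """NRZ carries one bit per symbol, so symbols equal bits."""
    return list(bits)


def pam4_symbols(bits):
    """PAM4 carries two bits per symbol, halving the symbol count."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [GRAY_PAM4[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def baud_rate_gbd(lane_rate_gbps, bits_per_symbol):
    """Symbol rate needed to carry a given lane bit rate."""
    return lane_rate_gbps / bits_per_symbol


if __name__ == "__main__":
    bits = [0, 0, 0, 1, 1, 1, 1, 0]
    print(nrz_symbols(bits))        # 8 symbols
    print(pam4_symbols(bits))       # [0, 1, 2, 3] -> 4 symbols
    # 425 Gbps split over 8 or 4 lanes, two bits per PAM4 symbol:
    print(baud_rate_gbd(425 / 8, 2))   # 26.5625 GBd
    print(baud_rate_gbd(425 / 4, 2))   # 53.125 GBd
```

The same eight bits need eight NRZ symbols but only four PAM4 symbols, which is why PAM4 doubles throughput without raising the baud rate.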

Deployment Challenges & Solutions

Interoperability Between Devices

The PAM4 signaling introduced with 400G deployments creates interoperability issues between the 400G ports and legacy networking gear. That is, the existing NRZ switch ports and transceivers aren’t interoperable with PAM4. This challenge is widely experienced when deploying network breakout connections between servers, storage, and other appliances in the network.

A 400G transceiver transmits and receives over 4 lanes of 100G or 8 lanes of 50G, using PAM4 signaling on both the electrical and optical interfaces. Legacy 100G transceivers, however, are built on 4 lanes of 25G NRZ signaling on both the electrical and optical sides. The two are simply not interoperable, which calls for a transceiver-based solution.

One such solution is a 100G transceiver that supports 100G PAM4 on the optical side and 4x25G NRZ on the electrical side. A gearbox inside the transceiver performs the re-timing and conversion between NRZ and PAM4 modulation. Examples are the QSFP28 DR and FR, which are fully interoperable with legacy 100G network gear, and the QSFP-DD DR4 and DR4+ breakout transceivers. The latter are parallel-fiber modules that accept an MPO-12 connector and break out to LC connectors to interface with FR or DR transceivers.
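The interoperability rules described above can be expressed as a simple compatibility check. This is only an illustrative model under simplifying assumptions: it compares lane count, nominal lane rate, and modulation, and ignores wavelength, reach, FEC mode, and connector type. The class names, function names, and example module definitions are hypothetical, and the QSFP-DD DR4 is modeled with the 8x50G electrical option.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    lanes: int
    lane_rate_gbps: int   # nominal rate per lane
    modulation: str       # "NRZ" or "PAM4"


@dataclass(frozen=True)
class Transceiver:
    name: str
    electrical: Interface  # side facing the switch or NIC port
    optical: Interface     # side facing the fiber


def host_compatible(port, module):
    """A host port can drive a module only if the electrical sides match."""
    return port == module.electrical


def lane_compatible(a, b):
    """Two optical lanes can link if rate and modulation match (breakout case)."""
    return a.lane_rate_gbps == b.lane_rate_gbps and a.modulation == b.modulation


# Illustrative definitions based on the lane layouts described in the text.
legacy_port = Interface(4, 25, "NRZ")
legacy_qsfp28 = Transceiver("legacy 100G", Interface(4, 25, "NRZ"), Interface(4, 25, "NRZ"))
qsfp28_dr = Transceiver("QSFP28 DR", Interface(4, 25, "NRZ"), Interface(1, 100, "PAM4"))
qsfp_dd_dr4 = Transceiver("QSFP-DD DR4", Interface(8, 50, "PAM4"), Interface(4, 100, "PAM4"))

if __name__ == "__main__":
    print(host_compatible(legacy_port, qsfp28_dr))                     # True
    print(lane_compatible(qsfp_dd_dr4.optical, qsfp28_dr.optical))     # True
    print(lane_compatible(qsfp_dd_dr4.optical, legacy_qsfp28.optical)) # False
```

In this model the QSFP28 DR works as the bridge: its electrical side matches the legacy NRZ port, while its optical side matches one 100G PAM4 lane of the DR4 breakout.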

Excessive Link Flaps

Link flaps are faults that occur during data transmission due to a series of errors or failures on the optical connection. When this occurs, both transceivers must perform auto-negotiation and link training (AN-LT) before data can flow again. If link flaps frequently occur, i.e., several times per minute, it can negatively affect throughput.

While link flaps are rare with mature optical technologies, they still occur and are often caused by configuration errors, a bad cable, or defective transceivers. With 400GbE, link flaps may also occur due to heat and design issues with transceiver modules or switches. Properly selecting transceivers, switches, and cables can help prevent excessive link flaps.
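One way to spot excessive flapping is to count link-down events inside a sliding time window. The sketch below is a minimal, vendor-neutral illustration. The `FlapMonitor` class, the 60-second window, and the threshold of three flaps are assumptions chosen to match "several times per minute", not values from any standard or switch OS.

```python
from collections import deque


class FlapMonitor:
    """Counts link flaps in a sliding window and flags excessive rates."""

    def __init__(self, window_s=60.0, threshold=3):
        self.window_s = window_s    # assumed window length in seconds
        self.threshold = threshold  # assumed flaps per window considered excessive
        self._events = deque()      # timestamps of recent flaps, oldest first

    def _expire(self, now_s):
        # Drop flaps that have fallen out of the window.
        while self._events and now_s - self._events[0] > self.window_s:
            self._events.popleft()

    def record_flap(self, timestamp_s):
        self._events.append(timestamp_s)
        self._expire(timestamp_s)

    def flaps_in_window(self, now_s):
        self._expire(now_s)
        return len(self._events)

    def is_excessive(self, now_s):
        return self.flaps_in_window(now_s) >= self.threshold


if __name__ == "__main__":
    monitor = FlapMonitor()
    for t in (0.0, 10.0, 20.0):      # three flaps within 20 seconds
        monitor.record_flap(t)
    print(monitor.is_excessive(20.0))  # True
    print(monitor.is_excessive(75.0))  # False: only the flap at t=20 remains
```

In practice the timestamps would come from switch logs or interface counters, and the threshold would be tuned to the environment.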

Transceiver Reliability

Some optical transceiver manufacturers struggle to stay within the devices’ power budget. The result is excess heat, which can cause fiber alignment problems, packet loss, and optical distortion. Transceiver reliability problems often occur when old QSFP transceiver form factors designed for 40GbE are used at 400GbE.

Similar challenges are also witnessed with newer modules used in 400GbE systems, such as the QSFP-DD and CFP8 form factors. A solution is to stress test transceivers before deploying them in highly demanding environments. It’s also advisable to prioritize transceiver design during the selection process.

Deploying 400G in Your Data Center

Keeping pace with the ever-increasing number of devices, users, and applications in a network calls for a faster, high-capacity, and more scalable data infrastructure. 400G meets these demands and is the optimal solution for data centers and large enterprises facing network capacity and efficiency issues. The successful deployment of 400G technology in your data center or organization depends on how well you have articulated your data and networking needs.

Upgrading your network infrastructure can help relieve bottlenecks ranging from speed and bandwidth limits to cost constraints. However, making the most of a network upgrade depends on the deployment procedures and processes. This could mean solving the common challenges described above and seeking help whenever necessary.

A rule of thumb is to enlist the professional help of an IT expert who will guide you through the 400G upgrade process. The IT expert will help you choose the best transceivers, cables, routers, and switches to use and even conduct a thorough risk analysis on your entire network. That way, you’ll upgrade appropriately based on your network needs and client demands.