Innovative Solutions for Enterprise: Designing High-Performance Data Center Networks

Posted on Dec 21, 2023 by

 674

As businesses transform and information systems expand, demand for big data and cloud resources is driving a significant increase in data center network traffic. Global data center traffic grew at an annual rate of about 23% from 2016 to 2021, and roughly 85% of that traffic flows between data centers or within a data center. This surge requires data center networks to evolve toward higher speed, higher capacity, and lower latency.

Evolving Data Center Network Architecture

The architecture of data center networks has progressed from the traditional core-aggregation-access model to the more modern Spine-Leaf design. This approach makes fuller use of interconnection bandwidth, reduces multi-layer convergence ratios (the ratio of downstream to upstream bandwidth at each layer, often called the oversubscription ratio), and scales easily. In the Spine-Leaf architecture, each interconnection link provides 100G of bandwidth, and the network convergence ratio is set according to business needs to handle traffic within and between Points of Delivery (PODs) in the data center.

The three-layer underlay network in the Spine-Leaf architecture separates core and access switches. If a bottleneck arises between the core and aggregation switches, or between the aggregation and access switches, the network can be scaled horizontally by adding uplinks, which lowers the convergence ratio with minimal disruption during bandwidth expansion. The overlay network uses distributed gateways based on EVPN-VXLAN, enabling flexible, elastic network deployment and resource allocation tailored to business requirements.
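The effect of adding uplinks on the convergence ratio can be illustrated with a little arithmetic. The sketch below is only an illustration: the port counts and the 25G server-facing speed are hypothetical values that the article does not give, and `oversubscription_ratio` is a helper name made up for this example. Only the 100G uplink speed comes from the text.

```python
from fractions import Fraction


def oversubscription_ratio(downlinks: int, downlink_gbps: int,
                           uplinks: int, uplink_gbps: int) -> Fraction:
    """Downstream bandwidth divided by upstream bandwidth for one switch."""
    if uplinks <= 0 or uplink_gbps <= 0:
        raise ValueError("a switch needs at least one uplink")
    return Fraction(downlinks * downlink_gbps, uplinks * uplink_gbps)


# Hypothetical leaf switch: 48 x 25G server-facing ports, 4 x 100G uplinks.
before = oversubscription_ratio(48, 25, 4, 100)  # 1200G down / 400G up

# Adding two more 100G uplinks lowers the ratio; the downlinks are untouched.
after = oversubscription_ratio(48, 25, 6, 100)   # 1200G down / 600G up

print(f"before: {before}:1, after: {after}:1")
```

With these assumed numbers the ratio drops from 3:1 to 2:1, which is the kind of horizontal scaling the paragraph describes.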

Drawing on the design and deployment expertise from Internet-scale data center networks, this solution embraces the spine-leaf network architecture and leverages EVPN-VXLAN technology to realize network virtualization. This approach provides a versatile and scalable network infrastructure for upper-layer services. The data center network is categorized into production networks and office networks, segregated and safeguarded by domain firewalls. These networks connect to office buildings, laboratories, and regional center exits through network firewalls.

The core switches of the production network and the office network interconnect the PODs and link to the firewall devices, offering up to 1.6 Tb/s of inter-POD communication bandwidth and a high-speed network egress capacity of 160G. The internal horizontal network capacity within each POD reaches 24 Tb/s, providing robust support for high-performance computing clusters (CPU/GPU) and storage clusters. This keeps packet loss caused by network performance bottlenecks to a minimum.
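One way to read these capacity figures is as aggregates of the 100G links mentioned earlier. The short sketch below works out how many such links each figure would correspond to. This is an assumption made for illustration: the article does not say how the totals are composed, and the egress link speed is not stated, so the 160G egress is left out. `links_needed` is a hypothetical helper name.

```python
LINK_GBPS = 100  # per-link speed stated in the article


def links_needed(capacity_gbps: int, link_gbps: int = LINK_GBPS) -> int:
    """Number of links required to carry capacity_gbps, rounded up."""
    if capacity_gbps < 0 or link_gbps <= 0:
        raise ValueError("capacity must be >= 0 and link speed > 0")
    return -(-capacity_gbps // link_gbps)  # ceiling division


inter_pod_links = links_needed(1_600)    # 1.6 Tb/s between PODs
intra_pod_links = links_needed(24_000)   # 24 Tb/s inside one POD

print(inter_pod_links, intra_pod_links)
```

Under that assumption, the inter-POD figure corresponds to 16 links of 100G and the intra-POD figure to 240.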

Building cabling is planned around the Spine-Leaf architecture. The switches within each POD are interconnected using 100G links and deployed in Top of Rack (TOR) mode. TOR groups, each consisting of 2-3 cabinets, connect to Leafs through 100G links. Each POD's Leafs are divided into two groups and deployed in separate network cabinets, which provides cross-cabinet redundancy within the POD. The overall network structure is streamlined, and cables are easy to deploy and manage.

Future-Proof Equipment Selection

When envisioning and building a data center network, careful consideration of technological advancements, industrial trends, and operational costs for the next five years is crucial. This forward-looking approach aims to optimize the utilization of existing data center resources to effectively support the core business operations of the enterprise.

The choice of network switches plays a pivotal role in the overall design of the data center network. Traditional large-scale network designs often rely on chassis-based devices to raise the overall capacity of the network system, but these offer only limited scalability. This approach comes with inherent limitations and risks, including:

Limited overall capacity of chassis-based devices, falling short of meeting the escalating network scale requirements of modern data centers.
Deployment of core chassis-based devices as a dual-connected pair, so that a single device failure has a fault radius of up to 50%, which fails to effectively guarantee business security.
Multi-chip architecture in chassis-based devices leading to significant bottlenecks in traffic processing capacity and network latency.
Complex deployment of chassis-based devices and prolonged cycles for diagnosing and troubleshooting failures, resulting in extended business interruption times during upgrades and maintenance.
Requirement for reserved slots in chassis-based devices to ensure future business expansion, contributing to increased upfront investment costs.
Constraints on later expansion, including vendor lock-in and weakened bargaining power, which substantially raise the cost of future scalability.
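The 50% fault radius of a dual-connected core follows from simple arithmetic: if traffic is spread evenly over N devices of equal capacity, losing one removes 1/N of the capacity. The sketch below makes that explicit. It assumes equal-capacity, evenly loaded devices, and the eight-spine case is a hypothetical comparison rather than a figure from the article.

```python
def fault_radius(device_count: int, failed: int = 1) -> float:
    """Fraction of aggregate capacity lost when `failed` out of
    `device_count` equal-capacity, evenly loaded devices go offline."""
    if device_count <= 0 or not 0 <= failed <= device_count:
        raise ValueError("need device_count > 0 and 0 <= failed <= device_count")
    return failed / device_count


chassis_pair = fault_radius(2)   # two core chassis, one fails
eight_spines = fault_radius(8)   # hypothetical layer of eight spine switches

print(f"{chassis_pair:.1%} {eight_spines:.1%}")
```

Spreading the same role across more, smaller devices shrinks the share of capacity that any single failure can take out.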

In light of these considerations, for the network equipment selection in this project, NVIDIA strongly advocates adopting a modular switch network architecture. This strategic approach unifies switches of different hierarchy levels under a single model, facilitating quick familiarization for the maintenance team. Furthermore, it provides operational flexibility for future adjustments to the network architecture, device reuse, and repair replacements.

By combining the Spine-Leaf (CLOS) architecture with modular switch networking, the initial network investment, and with it the Total Cost of Ownership (TCO), is significantly reduced. The Spine-Leaf architecture scales horizontally, so the impact on business operations stays small even if a spine switch goes offline. For future expansion, additional switches and hierarchy levels can be added as the data center grows, expanding both access capacity and backbone switching capacity. The entire network can therefore be procured and deployed on demand, in step with service, application, and business requirements.
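How far a network built from a single switch model can grow by adding hierarchy levels can be estimated with the standard folded-Clos formula: with identical switches of radix k (ports per switch) and a non-blocking 1:1 design, n tiers provide 2 × (k/2)^n host-facing ports. The sketch below applies it. The 32-port radix is a hypothetical value, and the non-blocking assumption is a simplification, since a design with a deliberately chosen convergence ratio would expose more host-facing ports.

```python
def max_host_ports(radix: int, tiers: int) -> int:
    """Host-facing ports in a non-blocking folded Clos network built from
    identical `radix`-port switches: 2 * (radix / 2) ** tiers."""
    if radix < 2 or radix % 2 or tiers < 1:
        raise ValueError("radix must be even and >= 2, tiers must be >= 1")
    return 2 * (radix // 2) ** tiers


# Hypothetical 32-port switch model used at every level of the hierarchy.
capacity_by_tiers = {tiers: max_host_ports(32, tiers) for tiers in (1, 2, 3)}

for tiers, ports in capacity_by_tiers.items():
    print(f"{tiers} tier(s): {ports} host ports")
```

Under these assumptions a single switch offers 32 ports, a two-tier leaf-spine 512, and a three-tier design 8192, all using the same device model.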

Conclusion

In response to ongoing business transformation and the surging demand for big data, a majority of data center network designs are adopting the Spine-Leaf architecture and using EVPN-VXLAN technology for efficient network virtualization. This architecture supports high-bandwidth, low-latency network traffic and provides a foundation for scalability and flexibility.

Looking ahead, as the manufacturing costs of high-speed optical modules (ranging from 200G to 800G) and AOCs/DACs decrease, data center interconnect technologies are poised to keep evolving. FS is a professional provider of communication and high-speed networking solutions for networking, data center, and telecom customers. It delivers efficient, reliable products, solutions, and services tailored for data centers, high-performance computing, edge computing, artificial intelligence, and other application scenarios, combining low cost with strong performance to help customers accelerate their business. FS's extensive product portfolio includes NVIDIA® InfiniBand Switches, 100G/200G/400G/800G InfiniBand transceivers, and NVIDIA® InfiniBand Adapters, serving the diverse needs of data centers across the globe.