Arista offers answers to the pressure AI puts on networks

If networks are to realize the full power of AI, they need to combine high-performance connectivity with zero packet loss

Martin Hull, vice president of product management for Cloud Titans and Platforms at Arista Networks, said the concern is that today’s traditional network interconnects cannot provide the scale and bandwidth needed to satisfy AI requests. Historically, the only options for connecting processor cores and memory have been proprietary interconnects such as InfiniBand, PCI Express, and other protocols that connect and offload compute clusters, but in most cases they are not suited to AI workload requirements.

Arista AI Spine

To address these issues, Arista is developing a technology called AI Spine, which requires data center switches with deep packet buffers and network software that provides real-time monitoring to help manage buffers and control traffic efficiently.

“What we’re starting to see is a wave of AI, natural-language, machine-learning-based applications that involve huge amounts of data distributed across hundreds or thousands of processors (CPUs, GPUs), all of which are given a compute task, slice it into pieces, each process their own piece, and then send it back,” Hull said.

“If your network errs on the side of dropping traffic, that means the AI workload is delayed in starting because you have to retransmit it. And if traffic is dropped while those AI workloads are being processed, that slows down the speed at which they work, and they may actually fail.”

AI Spine Architecture

Arista’s AI Spine is based on its 7800R3 series of data center switches, which at the high end support 460Tbps of switching capacity, hundreds of 40Gbps, 50Gbps, 100Gbps or 400Gbps interfaces, and 384GB of deep packet buffers.

“The deep buffers are key to keeping traffic flowing and not losing anything,” Hull said. “Some people are concerned about latency in large buffers, but our analysis doesn’t show that happening here.”
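To give a rough sense of scale for those buffer numbers, the short Python sketch below estimates how long a single congested 400Gbps port could absorb a line-rate incast burst without dropping. The fan-in of eight senders and the 1% per-port buffer share are illustrative assumptions, not Arista figures; only the 384GB buffer and 400Gbps port speed come from the article.

```python
# Back-of-the-envelope: how long a deep shared buffer can absorb an incast
# burst before dropping, using the figures cited in the article.
# The fan-in ratio and per-port buffer share below are hypothetical.

BUFFER_BYTES = 384 * 10**9       # 384 GB shared packet buffer (per article)
PORT_RATE_BPS = 400 * 10**9      # one 400Gbps egress port (per article)

def absorb_time_ms(fan_in: int, buffer_share: float = 0.01) -> float:
    """Milliseconds a burst can be absorbed when `fan_in` senders converge on
    one egress port and that port is allowed `buffer_share` of the buffer."""
    overload_bps = (fan_in - 1) * PORT_RATE_BPS    # excess arrival rate
    usable_bits = BUFFER_BYTES * buffer_share * 8  # per-port buffer share, in bits
    return usable_bits / overload_bps * 1_000

# Example: 8 GPUs sending results to one port at line rate, 1% buffer share.
print(f"{absorb_time_ms(8):.2f} ms of line-rate incast absorbed")
```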

AI Spine systems will be controlled by Arista’s core networking software, the Extensible Operating System (EOS), which supports high-bandwidth, lossless, low-latency, Ethernet-based networks that can interconnect thousands of GPUs at 100Gbps, 400Gbps and 800Gbps speeds, along with buffer-allocation schemes, according to Arista’s AI Spine white paper.

To help support this, the switches and EOS create a fabric that breaks packets down and reformats them into uniformly sized cells, “spraying” them evenly across the fabric, according to Arista. The purpose is to ensure equal access to all available paths within the fabric and zero packet loss.

“A cell-based architecture doesn’t care about front-panel connection speeds, so mixing and matching 100G, 200G, and 400G is of little concern,” Arista wrote. “In addition, the cell structure makes it immune to the ‘flow collision’ problem of Ethernet fabrics. A distributed scheduling mechanism is used within the switch to ensure fairness for traffic competing for access to congested output ports.”

Because each flow uses any available path to reach its destination, the fabric is well suited to handling the high-bandwidth “elephant flows” common to AI/ML applications, so “there are no internal hotspots in the network,” Arista writes.
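To make the cell-spraying idea concrete, here is a minimal Python sketch of slicing a packet into fixed-size cells, distributing them round-robin across every fabric path, and reassembling them at the egress. The 256-byte cell size, the four-path fabric, and the (sequence, count, payload) header are illustrative assumptions, not Arista’s actual cell format.

```python
# Minimal sketch of cell spraying: packets are cut into fixed-size cells,
# sprayed round-robin across every fabric path, and reassembled in order at
# the egress, so no single path becomes a hotspot for an elephant flow.

from itertools import cycle

CELL_SIZE = 256  # bytes per cell (assumed)

def spray(packet: bytes, paths: list[list]) -> None:
    """Slice one packet into cells and distribute them evenly over all paths."""
    next_path = cycle(paths)
    cells = [packet[i:i + CELL_SIZE] for i in range(0, len(packet), CELL_SIZE)]
    for seq, cell in enumerate(cells):
        # Each cell carries (sequence number, total cells, payload) so the
        # egress can reassemble regardless of which path delivered it.
        next(next_path).append((seq, len(cells), cell))

def reassemble(paths: list[list]) -> bytes:
    """Collect cells from every path and restore the original packet."""
    cells = sorted(c for path in paths for c in path)
    return b"".join(payload for _, _, payload in cells)

fabric_paths = [[], [], [], []]           # four equal-cost fabric paths
spray(b"x" * 1500, fabric_paths)          # one 1500-byte packet from an elephant flow
assert reassemble(fabric_paths) == b"x" * 1500
```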

AI Spine Models

To explain how AI Spine works, Arista’s white paper provides two examples.

In the first, a dedicated leaf-and-spine design built on the Arista 7800 connects to hundreds of server racks, and EOS’s intelligent load-balancing capabilities control traffic between servers to avoid collisions.

QoS classification, Explicit Congestion Notification (ECN), and Priority Flow Control (PFC) thresholds are configured on all switches to avoid packet loss. Arista EOS’s Latency Analyzer (LANZ) determines the appropriate thresholds for avoiding drops while maintaining high throughput, and allows the network to scale while keeping latency predictable and low.
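To illustrate how those two thresholds might interact on a lossless queue, the sketch below models a single egress queue that ECN-marks traffic once occupancy passes a lower threshold and sends PFC pause frames at a higher one. The byte values and the simple additive model are invented for the example; in Arista’s design, LANZ measurements drive the actual threshold choices.

```python
# Illustrative model of ECN and PFC thresholds cooperating on a lossless queue.
# Threshold values are made up for the example; LANZ telemetry would tune them.

from dataclasses import dataclass

@dataclass
class LosslessQueue:
    ecn_threshold: int   # bytes queued before packets are ECN-marked
    pfc_threshold: int   # bytes queued before PFC pause frames are sent
    depth: int = 0

    def enqueue(self, size: int) -> str:
        self.depth += size
        if self.depth >= self.pfc_threshold:
            return "send PFC pause upstream"    # stop senders rather than drop
        if self.depth >= self.ecn_threshold:
            return "forward with ECN mark"      # ask senders to slow down
        return "forward"

q = LosslessQueue(ecn_threshold=200_000, pfc_threshold=800_000)
for _ in range(6):
    print(q.enqueue(150_000))
```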

The second use case can scale to hundreds of endpoints, connecting all GPU nodes directly to the 7800R3 switches in the AI Spine. The result is a fabric that provides a single hop between all endpoints, reducing latency and enabling a single, large, lossless network that requires no configuration or tuning, Arista writes.

Challenges of Networking AI

The demand for AI Spine architecture is mainly driven by technologies and applications such as server virtualization, application containerization, multi-cloud computing, Web 2.0, big data and HPC. “To optimize and improve the performance of these new technologies, distributed scale-out, deeply buffered IP fabrics have been proven to deliver consistent performance that can scale to support extreme ‘east-west’ traffic patterns,” Arista wrote.

While it may be premature for most enterprises to worry about handling large-scale AI cluster workloads, some larger environments, including hyperscale, financial, virtual-reality, gaming, and automotive-development networks, are already preparing for the disruption such traffic could cause to traditional networks.

Arista CEO Jayshree Ullal recently told Goldman Sachs that as AI workloads grow, they put increasing pressure on the scale and bandwidth of networks, as well as on getting the right storage and buffer depth, predictable latency, and handling of elephant flows of small packets. “It will take a lot of engineering to get legacy Ethernet running as a back-end network to support this technology in the future, and the growing use of 400G will add extra momentum to this development,” Ullal said.
