Ethernet Alliance Hot Take: Ethernet and InfiniBand for HPC

By Ethernet Alliance


The push for ever-more powerful computers, driven by elaborate simulations and research projects swimming in data, means both Ethernet and InfiniBand remain essential to high-performance computing (HPC), though each is evolving in its own way. Ethernet is widely compatible and flexible, whereas InfiniBand excels where outright performance on specific workloads is key.

These contrasts keep conversations going, especially when trying to figure out which technology works best where. So, we checked in with members of the Ethernet Alliance to hear what they think about how Ethernet and InfiniBand fit into, and are changing, today’s HPC networks.

In the context of HPC, how do you view the roles, strengths and trade-offs of Ethernet and InfiniBand?

That’s a really thought-provoking question, especially given how hyperscale and AI are accelerating the development of new, evolutionary solutions that may be applicable to HPC. The very word “Ethernet” speaks to its purpose – a medium designed to connect and carry everything built upon it. While it’s not the physical foundation for technologies like InfiniBand or UEC, Ethernet remains the common language of connectivity, linking systems, applications, and people across the digital landscape.

Ethernet is the ecosystem that keeps digital communication thriving. Interconnects like InfiniBand, UEC, and UALink are specialized species within that environment, each evolved to excel in specific conditions, but all contributing to the same connected world.

Rather than being competitors, Ethernet and InfiniBand are complementary technologies that often move in parallel, each strengthening the other’s ecosystem.

– Ethernet Alliance President & Events Chair, David J. Rodgers, EXFO

InfiniBand’s early success was due to its original focus on high-speed interconnection of computer clusters. It was the first technology to deliver 10 Gb/s links for HPC networking and offered a simple but serviceable feature set for networking late-1990s / early-2000s computers.

Today, InfiniBand’s strength comes from a considerable embedded base, and its limitations stem from architectural factors plus a lack of supplier diversity. Ethernet, which is the dominant solution in many networking domains other than AI and HPC, is now replacing InfiniBand in these networks for three fundamental reasons:

  1. Performance – the availability of Ethernet switch ASICs with faster switching speeds and higher radix enables the construction of larger networks with fewer tiers, lower transit times, and shorter job-completion times.
  2. Economics – Ethernet benefits from a large ecosystem with multiple suppliers of interoperable hardware, as well as a robust community of open-source software developers.
  3. Reliability – in the event of network failures, Ethernet networks have demonstrated fail-over times an order of magnitude shorter than those observed with InfiniBand.

– Ethernet Alliance Board Member, Lowell Lamb, Broadcom
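
To put the radix point in the first item above in concrete terms, here is a minimal sketch (an editorial illustration, not part of the quoted answer) of the standard non-blocking fat-tree (folded-Clos) sizing rule: a two-tier fabric built from radix-k switches supports roughly k²/2 host ports, and a three-tier fabric roughly k³/4, so higher-radix switch ASICs can reach a given cluster size with fewer tiers and fewer switch hops per packet.

```python
# Illustrative sketch: how switch radix translates into fabric size and tier
# count, assuming a non-blocking fat-tree (folded Clos) built from identical
# radix-k switches. Standard results: ~k^2/2 host ports with 2 tiers,
# ~k^3/4 host ports with 3 tiers.

def max_hosts(radix: int, tiers: int) -> int:
    """Upper bound on host ports for a non-blocking fat-tree of the given depth."""
    if tiers == 2:
        return radix ** 2 // 2
    if tiers == 3:
        return radix ** 3 // 4
    raise ValueError("sketch only covers 2- and 3-tier fabrics")

for radix in (64, 128, 256):
    print(f"radix {radix:3d}: 2 tiers ~ {max_hosts(radix, 2):>9,} hosts, "
          f"3 tiers ~ {max_hosts(radix, 3):>11,} hosts")
```

For example, a roughly 32k-endpoint cluster needs three tiers of radix-64 switches but only two tiers at radix 256, which is the “fewer tiers, lower transit times” effect described above.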

As an Ethernet professional, I have limited thoughts on InfiniBand; my focus is more on how Ethernet can better serve the HPC market. That, in essence, is the crux of the question – pay for performance versus pay for need.

Companies investing in InfiniBand are willing to pay for the additional performance, whereas Ethernet solutions address the market need for multi-vendor interoperability while leveraging a generation of investment. The differences between the two cannot be dismissed. In addition, one needs to consider the reach supported by the Ethernet ecosystem, which ranges from meters to many kilometers!

– Ethernet Alliance Chief Evangelist for High-Speed Ethernet, John D’Ambrosia, Futurewei 

The latency and protocol differences of Standard Ethernet compared to InfiniBand or PCIe do not appear to lend themselves well to high-performance computing, where low latency, guaranteed packet delivery, tight flow control, and smaller payloads tend to be critical. Standard Ethernet is much better suited to large node counts and long distances, while HPC needs much shorter reaches and fewer nodes. That said, UEC appears to be on the right path to challenge InfiniBand for AI/HPC/LLM networks.

– Reginald Conley, Parade Technologies 

In your view, how do Ethernet and InfiniBand stack up when it comes to enabling performance, scalability, and innovation in today’s HPC environments?

I don’t see Ethernet and InfiniBand as competitors so much as technologies that coexist and complement one another. While InfiniBand initially defined its own physical layer and applications, and could have adopted a different physical transport medium, the InfiniBand Trade Association (IBTA) and the IEEE 802.3 Ethernet Working Group have collaborated to incorporate complementary physical-layer signaling characteristics. The evolution of high-performance computing has given rise to several new communications protocols aimed at optimizing efficiencies and interactions, particularly to meet AI’s growing demands. Ethernet is at the core of this ongoing market evolution and is bolstered by the commitment of the IEEE 802.3 community to advancing technology in response to real-world needs.

– Ethernet Alliance President & Events Chair, David J. Rodgers, EXFO

This question cuts across many topics and cannot be answered briefly. However, considering the specific example of xPU networking may provide some insight into the relative scalability and trajectories of the two technologies. Today, three distinct high-speed networks are used to interconnect xPUs: the scale-up network (~100 xPUs or fewer), the scale-out network (~100–100k xPUs), and the scale-across network (~100k–1M+ xPUs). It’s telling that InfiniBand is found only in small- to medium-sized scale-out networks. Ethernet, on the other hand, is found in all three sectors, and its adoption – which began in hyperscale networks – is now expanding to smaller deployments.

– Ethernet Alliance Board Member, Lowell Lamb, Broadcom
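
As a quick illustration of the three domains described above, the following tiny sketch (an editorial aid that uses the approximate boundaries quoted in the answer, not any formal definition) buckets a deployment by xPU count:

```python
# Illustrative only: map a cluster size onto the three xPU network domains
# mentioned above, using the approximate boundaries from the quote
# (~100 xPUs for scale-up, ~100k for scale-out, beyond that scale-across).

def xpu_network_domain(num_xpus: int) -> str:
    if num_xpus <= 100:
        return "scale-up"       # within a single node / rack-scale domain
    if num_xpus <= 100_000:
        return "scale-out"      # a single cluster fabric
    return "scale-across"       # spanning clusters or data centers

for n in (8, 4_096, 250_000):
    print(f"{n:>9,} xPUs -> {xpu_network_domain(n)} network")
```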

In my opinion, the focus should be on Ethernet meeting the needs of the HPC environment – not on the choice of Ethernet vs. InfiniBand. The two solutions are vastly different: Ethernet leverages an ecosystem approach, while InfiniBand targets a performance-optimized approach. Note that the performance-focused approach limits the multi-vendor interoperability that encourages competition. The Ethernet community’s commitment to multi-vendor interoperability is well established!

– Ethernet Alliance Chief Evangelist for High-Speed Ethernet, John D’Ambrosia, Futurewei

Both will be needed, and Standard Ethernet will certainly always have a presence in front-end and typical top-of-rack (ToR) applications. On the back end, changes in Ethernet (UEC) could increase its role in the low-latency space as well.

– Reginald Conley, Parade Technologies
