Astera speaks softly and carries a big switch

The Register / 5/6/2026


Key Points

  • Astera Labs' Scorpio X fabric switch packs 320 lanes of PCIe 6.0 into a single ASIC, delivering 5.12 TB/s of bidirectional bandwidth.
  • Astera pitches PCIe as a vendor-agnostic scale-up fabric for rack-scale AI systems, an alternative to Nvidia's NVSwitch that works with nearly any accelerator.
  • Scorpio adds in-network compute, including a multicast operation called Hypercast tuned for mixture-of-experts inference.
  • The refreshed Scorpio switches are sampling now, with production expected to ramp in the second half of 2026.



High-speed connectivity without NVLink baggage

Tobias Mann

Astera Labs unveiled an alternative to Nvidia's NVSwitch for building rack-scale AI systems on Tuesday, claiming it will work with nearly any accelerator.

The AI fabric switch, codenamed Scorpio X, crams 320 lanes of PCIe 6.0 connectivity into a single ASIC with 5.12 TB/s of bidirectional bandwidth.
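That headline figure squares with PCIe 6.0's per-lane rate. A rough back-of-the-envelope check, assuming the commonly quoted figure of about 8 GB/s of usable throughput per lane per direction at 64 GT/s:

```python
# Back-of-the-envelope check on Scorpio X's claimed bandwidth.
# PCIe 6.0 signals at 64 GT/s per lane; with PAM4 signaling and FLIT
# encoding, that works out to roughly 8 GB/s of usable throughput per
# lane, per direction.
LANES = 320
GBPS_PER_LANE_PER_DIR = 8  # GB/s, approximate usable rate

bidirectional_gbps = LANES * GBPS_PER_LANE_PER_DIR * 2
print(bidirectional_gbps / 1000, "TB/s")  # -> 5.12 TB/s
```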

Historically, PCIe switches have been used in a variety of applications, including scale-out compute fabrics. CPUs alone didn't offer enough lanes, or fast enough ones, for all the GPUs, NICs, and storage required. So, rather than hanging everything off the CPU, a PCIe switch, often built into the NIC, connected everything together.


Astera contends that, with a big enough switch, PCIe is a viable alternative to interconnects like NVLink in the scale-up fabrics that make dozens or more GPUs behave like a single large one, and chipmakers don't need to redesign their accelerators to use it.


However, Astera hasn't just built a bigger PCIe switch. Scorpio is equipped with many of the same in-network compute capabilities as Nvidia's NVSwitch, which help to accelerate collective communications.

These communications are especially important for generative AI inference. Large language models have become rather chatty from a network standpoint as mixture-of-experts (MoE) architectures have caught on.

MoE models are composed of multiple sub-models called experts. For each token generated, a different selection of experts, potentially running on different GPUs, may be used. 

By moving collective communications to the switch, the GPUs spend less time waiting for the network to catch up and more time churning out tokens.
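To see why MoE makes the network so chatty, here's a minimal toy sketch. None of the names or numbers below come from Astera; the expert count, GPU count, and placement scheme are purely illustrative. The point is that most of each token's expert picks land on other GPUs, so every token triggers cross-fabric traffic at every MoE layer:

```python
# Toy sketch (illustrative, not Astera's design): with experts sharded
# across GPUs, each token's top-k expert picks mostly land off-GPU,
# turning every generated token into network traffic per MoE layer.
import random

NUM_EXPERTS = 8  # experts in one MoE layer
NUM_GPUS = 4     # experts sharded round-robin across GPUs
TOP_K = 2        # experts consulted per token

def expert_to_gpu(expert_id):
    """Round-robin expert placement."""
    return expert_id % NUM_GPUS

def route_tokens(num_tokens, source_gpu=0, seed=0):
    """Count token transfers that must leave the source GPU in one layer."""
    rng = random.Random(seed)
    remote_sends = 0
    for _ in range(num_tokens):
        experts = rng.sample(range(NUM_EXPERTS), TOP_K)
        remote_sends += sum(1 for e in experts if expert_to_gpu(e) != source_gpu)
    return remote_sends

# With 8 experts over 4 GPUs, only 2 experts are local to any one GPU,
# so roughly three-quarters of all expert picks go over the fabric.
print(route_tokens(1024))
```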

Astera has gone so far as to develop a multicast operation optimized for MoE inference that it calls Hypercast.

"One of the limitations of the standard multicast is the number of groups you can actually support, as well as the dynamic nature of needing to change those groups on the fly for mixture-of-experts models," Ahmad Danesh, AVP of product management at Astera, told El Reg.
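Danesh's point about group counts can be made concrete with a little combinatorics. The expert and top-k figures below are illustrative, not Astera's: mapping every possible active-expert set to its own static multicast group would require billions of groups at realistic expert counts, and the active set changes token by token.

```python
# Why static multicast groups struggle with MoE: the number of distinct
# top-k expert subsets grows combinatorially with the expert count.
# Figures are illustrative, not from Astera.
from math import comb

for num_experts, top_k in [(8, 2), (64, 8), (256, 8)]:
    print(f"{num_experts} experts, top-{top_k}: "
          f"{comb(num_experts, top_k):,} possible groups")
```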

Where Scorpio fits in the scale-up ecosystem

While there are clear benefits to using PCIe as a chip-to-chip interconnect, Scorpio isn't exactly a replacement for Nvidia's NVSwitch chips. NVSwitch 6, announced at CES in January, offers nearly 3x the bandwidth at 14.4 TB/s.


However, Astera doesn't need to compete with NVSwitch directly. In fact, last spring Astera announced plans to extend support for NVLink Fusion, Nvidia's attempt to open its high-speed interconnect to the broader ecosystem.

Instead, Scorpio is being positioned more as a vendor-agnostic alternative. Technologies like NVLink Fusion or the emerging UALink protocol are gaining traction, but chips need to be designed around them.

PCIe works with just about anything because it's already used to get data in and out of the accelerators. For example, if you wanted to stitch together 32 or more Nvidia RTX Pro 6000 Server cards, you'd need a PCIe switch, since those GPUs don't support NVLink at all. 

PCIe also makes it easier to mix and match chips for disaggregated inference architectures, like we've seen with Nvidia and Groq, AWS and Cerebras, or Intel and SambaNova.

These architectures involve using one accelerator for compute-heavy prefill operations and another for bandwidth-intensive decode operations. For this to work, the chips have to be connected to one another. Many AI chip builders are doing this over Ethernet, but PCIe would be more direct.
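What actually has to cross that link is mainly the KV cache the prefill stage builds for the decode stage. A rough sketch of the transfer size, using a Llama-3-8B-like configuration as an assumed example (32 layers, 8 KV heads, head dimension 128, FP16; none of these figures come from the vendors named above):

```python
def kv_cache_bytes(seq_len, layers=32, kv_heads=8, head_dim=128,
                   dtype_bytes=2):
    """Approximate size of the KV cache prefill hands off to decode.

    The leading factor of 2 covers the separate key and value tensors.
    Defaults assume a Llama-3-8B-like model in FP16 (illustrative).
    """
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes

# A 4,096-token prompt works out to roughly half a gigabyte per request,
# all of which must traverse the prefill-to-decode link.
print(kv_cache_bytes(4096) / 1e9, "GB")
```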

Alongside its Scorpio X family of chips, Astera is also expanding its Scorpio P-series switches with models ranging from 32 to 320 lanes of PCIe connectivity.

All of these switches work with its COSMOS management suite, a hardware monitoring platform designed to help track down and resolve issues across the network fabric. 

Astera's refreshed Scorpio switches are currently sampling with production expected to ramp in the second half of 2026. ®