AMD puts out new slottable GPU for AI-curious enterprises

The Register / 5/7/2026


Key Points

  • AMD has introduced the MI350P, a slottable PCIe-based Instinct GPU aimed at enterprise customers exploring AI workloads.
  • The MI350P is built around 144 GB of HBM3e memory and is rated at up to 4.6 petaFLOPS of FP4 compute.
  • AMD positions the dual-slot form factor as a practical way for enterprises to add AI acceleration without adopting fully data-center-style GPU platforms.
  • The announcement signals AMD’s continued focus on expanding AI hardware options tailored to enterprise deployment needs.

Systems


MI350P packs 144 GB of HBM3e and up to 4.6 petaFLOPS of FP4 grunt into a dual-slot card

Tobias Mann, Systems Editor

AMD hopes to win over enterprise AI customers with a more affordable datacenter GPU that can drop into conventional air-cooled servers.

Announced on Thursday, the MI350P is the House of Zen’s first PCIe-based Instinct accelerator since the MI210 debuted all the way back in 2022.

Until now, AMD’s best GPUs have only been available in packs of eight and used socketed OAM modules that weren’t compatible with most server platforms.


By comparison, the MI350P can slot into just about any 19-inch pizza box design that offers enough power and airflow, making it a much easier sell for enterprises dipping their toes into on-prem AI for the first time.


The 600-watt, dual-slot card is essentially an MI350X that's been cut in half. That means the CDNA-based GPU packs 4.6 petaFLOPS of FP4 compute and 144 GB of VRAM spread across four HBM3e stacks, delivering a respectable 4 TB/s of memory bandwidth.

AMD supports configurations ranging from one to eight MI350Ps, though a lack of high-speed interconnects means the cards are limited to PCIe 5.0 speeds (128 GB/s) for chip-to-chip communications, which could hold them back on larger models.
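
To put that 128 GB/s in perspective, here's a back-of-envelope sketch (not a benchmark) of how long it takes just to move a large set of weights between two cards. The 70 GB figure is a purely illustrative assumption, roughly a 70-billion-parameter model at FP8, and not something AMD has quoted:

```python
# Back-of-envelope: time to move a hypothetical 70 GB weight blob
# between two GPUs over each link. Figures are peak, not measured.
WEIGHTS_GB = 70  # illustrative assumption: ~70B parameters at FP8

links_gbps = {
    "PCIe 5.0 x16 (MI350P peer-to-peer)": 128,  # per the article
    "NVLink bridge (H200 NVL)": 900,            # per the spec table below
}

for name, bw in links_gbps.items():
    print(f"{name}: {WEIGHTS_GB / bw * 1000:.0f} ms")
# PCIe 5.0: ~547 ms vs NVLink: ~78 ms -- a 7x gap on chip-to-chip traffic
```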

AMD hasn’t shared pricing for the cards just yet, but at least on paper, the MI350P is well positioned to compete with either Nvidia’s H200 NVL or RTX Pro 6000 Blackwell PCIe cards.

Compared to the 141 GB H200, the MI350P promises about 38 percent higher peak performance at FP8, while eking out a narrow VRAM capacity advantage. 

But the H200 does pull ahead when it comes to memory bandwidth. With six HBM3e stacks to the MI350P’s four, the nearly two-year-old card’s memory is still about 20 percent faster.

Nvidia's H200 also supports high-speed chip-to-chip communications over NVLink, while the MI350P goes without AMD's equivalent, Infinity Fabric.

However, all this assumes you can still find H200 NVLs in the wild.

Since last summer, Nvidia has been pushing its RTX Pro 6000 Server cards on enterprise customers. As of writing, the card is Nvidia's most powerful Blackwell-based accelerator offered in a PCIe form factor.


Against the RTX Pro 6000, price becomes a bigger factor for the MI350P than performance. Workstation versions of the RTX Pro, which swap the passive cooler for an active one, routinely sell for between $8,000 and $10,000 apiece, making it one of Nvidia's more affordable datacenter-class GPUs.

Depending on how pricing shakes out, AMD may have to push hard to be competitive.

Having said that, the MI350P is still the better-specced part, delivering 2.3x higher peak FLOPS, 2.5x the memory bandwidth, and 50 percent more VRAM than the RTX Pro.

|                            | AMD MI350P                 | Nvidia H200 NVL                               | Nvidia RTX Pro 6000 Server |
|----------------------------|----------------------------|-----------------------------------------------|----------------------------|
| BF16                       | 1,150 TFLOPS               | 836 TFLOPS                                    | 500 TFLOPS                 |
| FP16                       | 1,150 TFLOPS               | 836 TFLOPS                                    | 500 TFLOPS                 |
| FP8                        | 2,300 TFLOPS               | 1,671 TFLOPS                                  | 1,000 TFLOPS               |
| MXFP8                      | 2,300 TFLOPS               | -                                             | 1,000 TFLOPS               |
| MXFP4                      | 4,600 TFLOPS               | -                                             | 2,000 TFLOPS               |
| Memory capacity            | 144 GB HBM3e               | 141 GB HBM3e                                  | 96 GB GDDR7                |
| Memory bandwidth           | 4.0 TB/s                   | 4.8 TB/s                                      | 1.6 TB/s                   |
| GPU instances              | Up to 4 @ 36 GB each       | Up to 7 @ 16.5 GB each                        | Up to 4 @ 24 GB each       |
| Scale-up interconnect      | Not supported              | 2- or 4-way NVLink bridge at 900 GB/s per GPU | Not supported              |
| Form factor                | FHFL dual-slot, air-cooled | FHFL dual-slot, air-cooled                    | FHFL dual-slot, air-cooled |
| Max total board power (TBP)| 600 W (450 W configurable) | 600 W (configurable)                          | 600 W (configurable)       |
| PCIe host interface        | x16 PCIe Gen 5 at 128 GB/s | x16 PCIe Gen 5 at 128 GB/s                    | x16 PCIe Gen 5 at 128 GB/s |
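
The headline ratios quoted above fall straight out of that table. A quick Python sanity check, using the peak figures as listed:

```python
# Sanity-checking the spec ratios cited in the article against the table.
mi350p = {"fp8": 2300, "fp4": 4600, "mem_gb": 144, "bw_tbs": 4.0}
h200   = {"fp8": 1671, "mem_gb": 141, "bw_tbs": 4.8}
rtx    = {"fp4": 2000, "mem_gb": 96,  "bw_tbs": 1.6}

print(f"FP8 vs H200:  {mi350p['fp8'] / h200['fp8'] - 1:+.0%}")    # ~+38%
print(f"BW vs H200:   {h200['bw_tbs'] / mi350p['bw_tbs'] - 1:+.0%} in the H200's favor")  # ~+20%
print(f"FP4 vs RTX:   {mi350p['fp4'] / rtx['fp4']:.1f}x")         # 2.3x
print(f"BW vs RTX:    {mi350p['bw_tbs'] / rtx['bw_tbs']:.1f}x")   # 2.5x
print(f"VRAM vs RTX:  {mi350p['mem_gb'] / rtx['mem_gb'] - 1:+.0%}")  # +50%
```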

Now, this all assumes peak FLOPS and memory bandwidth, which is rarely realistic. The tensors used by AI workloads are rarely the ideal shape for squeezing the maximum number of FLOPS out of a chip. This is why we run Maximum Achievable MatMul FLOPS (MAMF) and BabelStream memory bandwidth benchmarks as part of our AI test suite.
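
For a flavor of what an MAMF-style measurement involves, here's a minimal PyTorch sketch that times a large BF16 matmul and converts the elapsed time into achieved TFLOPS. The shapes, warmup, and iteration counts are arbitrary assumptions, and our actual test suite is rather more involved:

```python
# Minimal MAMF-style sketch: time a big BF16 matmul, report achieved TFLOPS.
import torch

M = N = K = 8192  # arbitrary; a real sweep tries many shapes
a = torch.randn(M, K, dtype=torch.bfloat16, device="cuda")
b = torch.randn(K, N, dtype=torch.bfloat16, device="cuda")

for _ in range(10):  # warmup so clocks and caches settle
    torch.matmul(a, b)
torch.cuda.synchronize()

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
iters = 100
start.record()
for _ in range(iters):
    torch.matmul(a, b)
end.record()
torch.cuda.synchronize()

secs = start.elapsed_time(end) / 1000 / iters  # elapsed_time is in ms
tflops = 2 * M * N * K / secs / 1e12           # a matmul costs 2*M*N*K FLOPs
print(f"Achieved: {tflops:.0f} TFLOPS")
```

PyTorch's ROCm builds reuse the "cuda" device name, so the same sketch runs on AMD hardware without changes.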

AMD seems to understand that peak FLOPS don't translate cleanly into real-world performance: in the marketing materials shared with El Reg prior to publication, it compared the MI350P's theoretical peaks against its real-world delivered performance.

| MI350P           | Delivered    | Peak         |
|------------------|--------------|--------------|
| BF16             | 713 TFLOPS   | 1,150 TFLOPS |
| FP16             | 672 TFLOPS   | 1,150 TFLOPS |
| FP8              | 1,529 TFLOPS | 2,300 TFLOPS |
| MXFP8            | 1,327 TFLOPS | 2,300 TFLOPS |
| MXFP6            | 1,804 TFLOPS | 4,600 TFLOPS |
| MXFP4            | 2,299 TFLOPS | 4,600 TFLOPS |
| Memory capacity  | 144 GB HBM3e | 144 GB HBM3e |
| Memory bandwidth | 3.6 TB/s     | 4.0 TB/s     |
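
Run AMD's figures through a quick loop and utilization lands anywhere between roughly 40 and 66 percent of peak, depending on the data type:

```python
# Delivered-versus-peak utilization, straight from AMD's table above.
figures = {  # dtype: (delivered TFLOPS, peak TFLOPS)
    "BF16": (713, 1150), "FP16": (672, 1150),
    "FP8": (1529, 2300), "MXFP8": (1327, 2300),
    "MXFP6": (1804, 4600), "MXFP4": (2299, 4600),
}
for dtype, (delivered, peak) in figures.items():
    print(f"{dtype:>6}: {delivered / peak:.0%} of peak")
# BF16 62%, FP16 58%, FP8 66%, MXFP8 58%, MXFP6 39%, MXFP4 50%
```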

It’d be nice to see Nvidia and others adopt similar practices regarding accelerator performance claims, though we suspect getting everyone to agree on the best way to measure this might not be easy.

The MI350P’s launch comes as AMD prepares to address a very different and likely more lucrative segment with its first rack-scale compute platform, codenamed Helios.

That system is due out in the second half of the year and is aimed primarily at large hyperscale and neocloud deployments. It packs 72 of AMD's all-new MI455X GPUs into a single double-wide OCP rack that behaves like one enormous accelerator.


The platform will be AMD’s first crack at Nvidia’s NVL72 racks, which launched alongside its Blackwell generation nearly two years ago. ®