South Korean AI chip startup Rebellions eyes new shores for rack-scale invasion
Funding round comes ahead of planned IPO
GPU-makers like Nvidia and AMD may dominate the AI infrastructure market, but there are still more than a few AI chip startups knocking around.
One of them is Rebellions, which after establishing a foothold on its home turf in South Korea, aims to bring its tech to the rest of the world, beginning with a new rack-scale compute platform that won't require enterprises to adopt liquid cooling or ultra-power dense racks.
Founded in late 2020, the startup produces AI accelerators that have been deployed in numerous applications in the South Korean domestic market.
Initially, "we focused a great deal on telcos, service providers, and enterprise-end users within the Korean market," Rebellions chief business officer Marshall Choy told El Reg. "We built up use cases around everything from call centers and customer service to CCTV surveillance for the national highway system."
"We're in a very strong position to take those learnings, capabilities, and improvements we've done over the years and bring that out to other regions, outside of Korea, as less of a fresh start, but more of a rinse and repeat type of motion," he added.
Following the introduction of its Rebel Quad accelerators, since rebranded as the Rebel100, the company has turned its attention to the rest of the world. Over the past few months, Rebellions has opened offices in Japan, Saudi Arabia, Taiwan, and the US, where it hopes to win over enterprises with its new RebelRack and RebelPods.
Before looking at the racks, let's talk about the chips themselves. Our sibling site The Next Platform dug into the Rebel100 last winter, but at a high level, the chip looks quite similar to Nvidia's H200 accelerators from late 2023.
According to Rebellions, the processor is capable of a petaFLOP of dense 16-bit floating point math or double that at FP8. However, unlike the H200, which used a monolithic compute die fabbed at TSMC, Rebellions' latest processor uses a chiplet architecture with four compute dies manufactured and packaged by Samsung.
That processor is fed by four HBM3e stacks totaling 144 GB of capacity and 4.8 TB/s of aggregate bandwidth.
While the smaller compute dies and reliance on Samsung should help with yields and sidestep competition for TSMC's limited fab and packaging capacity, Rebellions still needs to source HBM from somewhere. Memory is already in short supply, and HBM is among the scarcest.
This is where being a South Korean company with close ties to both the SK chaebol and Samsung comes in handy. SK Hynix and Samsung are the largest suppliers of HBM in the world. Last we heard, Rebellions was sourcing its HBM from Samsung, but in a pinch it shouldn't have to fight that hard to get SK Hynix to kick in some capacity.
The chip itself is currently being packaged as a PCIe card with a 600 watt TDP, rather than the OAM or SXM modules we've become accustomed to.
Rebellions' reference design calls for eight of these cards to be crammed into a single air-cooled node.
Standard form factors, such as the 19-inch chassis, and air cooling were key design points for Rebellions, as they mean the system can be deployed into existing enterprise datacenters, something that can't be said of Nvidia's latest generation of liquid-cooled Rubin GPUs.
The RebelRack will feature four of these nodes, each connected via quad-400 Gbps networking, for a total of 32 accelerators and 64 petaFLOPS of FP8 compute, 4.6 TB of HBM3e, and 153.6 TB/s of aggregate memory bandwidth.
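The rack-level figures follow directly from the per-card specs quoted earlier in this article; a quick sanity check (TB here meaning 10^12 bytes):

```python
# Per-card Rebel100 specs as quoted in this article
CARDS_PER_NODE = 8
NODES_PER_RACK = 4
FP8_PFLOPS_PER_CARD = 2      # 1 petaFLOP dense FP16, doubled at FP8
HBM_GB_PER_CARD = 144        # four HBM3e stacks
HBM_TBPS_PER_CARD = 4.8      # aggregate memory bandwidth per card

cards = CARDS_PER_NODE * NODES_PER_RACK            # 32 accelerators
fp8_pflops = cards * FP8_PFLOPS_PER_CARD           # 64 petaFLOPS FP8
hbm_tb = cards * HBM_GB_PER_CARD / 1000            # ~4.6 TB of HBM3e
bw_tbps = round(cards * HBM_TBPS_PER_CARD, 1)      # 153.6 TB/s aggregate

print(cards, fp8_pflops, round(hbm_tb, 2), bw_tbps)  # 32 64 4.61 153.6
```

The article's 4.6 TB figure is 32 x 144 GB = 4.608 TB, rounded down.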
For larger deployments, Rebellions is also developing what it calls the RebelPod, which can scale from eight to 128 nodes, each with eight Rebel100 accelerators interconnected using 800 Gbps Ethernet.
"Right now, people think of rack level. I think we're going to be thinking, in a few days from now, about row level and datacenter level," Choy said.
Compared to GPU systems, this isn't a lot of networking. Most HGX systems now feature at least one 800 Gbps NIC per GPU. Choy tells us that going forward, the network fabric is going to be a major focus for the company.
- Alibaba has made 470,000 AI chips, admits they're inferior and may always be
- Decoding Nvidia's Groq-powered LPX and the rest of its new rack systems
- Meta reveals four Broadcom-built custom AI chips, claims some outperform commercial silicon
- Nvidia-backed photonics startup Ayar Labs fills its wallet to mass-produce CPO chiplets
As we've seen with other rack-scale systems from AMD and Nvidia, compute and networking are only two pieces of the puzzle; you also need software that can stitch everything together cohesively.
Rebellions' software stack is nothing exotic. We're told the platform runs on open source frameworks like vLLM, PyTorch, and Triton. For disaggregated inference, it's using llm-d, another open source framework that enables compute-heavy prefill operations on one set of accelerators and memory bandwidth-heavy decode operations on another.
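llm-d's actual scheduling is considerably more involved, but the core idea of disaggregated inference, routing prefill and decode to different accelerator pools, can be sketched in a few lines. All names here are illustrative, not llm-d's real API:

```python
# Illustrative sketch of disaggregated inference: compute-heavy prefill
# runs on one accelerator pool, bandwidth-heavy decode on another.
# Class and function names are hypothetical, not llm-d's actual API.
from dataclasses import dataclass, field

@dataclass
class Pool:
    name: str
    handled: list = field(default_factory=list)

    def run(self, request_id: str, phase: str) -> str:
        # Record the work item and return a trace string for inspection
        self.handled.append((request_id, phase))
        return f"{phase}:{request_id}@{self.name}"

def serve(request_id: str, prefill_pool: Pool, decode_pool: Pool) -> list:
    # Phase 1: process the whole prompt once (compute-bound) on the prefill pool
    trace = [prefill_pool.run(request_id, "prefill")]
    # Phase 2: generate tokens (memory-bandwidth-bound) on the decode pool;
    # a real system would transfer the KV cache between pools at this point
    trace.append(decode_pool.run(request_id, "decode"))
    return trace

prefill = Pool("compute-pool")
decode = Pool("bandwidth-pool")
print(serve("req-1", prefill, decode))
# ['prefill:req-1@compute-pool', 'decode:req-1@bandwidth-pool']
```

Splitting the phases this way lets an operator size each pool to its bottleneck: more raw FLOPS on the prefill side, more memory bandwidth on the decode side.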
"Everything's open source, from vLLM compiler all the way up to the very highest level of stack, Red Hat, OpenShift, and everything in between," Choy said. "If you've used any of these technologies in any other context, you already know how to use Rebellions."
We've heard similar claims from chipmakers before that haven't ended up being quite so easy to use. However, Rebellions is a member of the PyTorch Foundation, something that can't be said of many AI chip startups.
Of course, none of this is cheap, but Rebellions isn't hurting for cash. On Monday the startup raised $400 million in a pre-IPO funding round led by Mirae Asset Financial Group and the Korea National Growth Fund, both to support its expansion westward and to further the development of more capable and efficient AI accelerators and systems.
According to recent reports, the company could file for an IPO as soon as this year or early next year. ®