共有:
AI Chips

OpenAI Launches Jalapeno Chip to Cut Inference Costs

Structural pressure on inference costs has arrived. A new axis of custom silicon enters the GPU-dominated era.

AI Navigate Editorial·2026.06.25·6 min read
GPU Stack (Legacy) NVIDIA H100 CUDA Runtime General-Purpose Architecture ASIC Stack (Jalapeno) Jalapeno ASIC (Inference-Only) OpenAI Custom Runtime LLM Inference-Optimized Architecture Manufactured by Broadcom · Designed by OpenAI
01
Background

Why GPUs Were the Only Option

Until now, cost reduction meant adding more NVIDIA GPUs, and the perception was that only Google and Meta were building custom chips. AI inference assumes vertical integration of chip design, manufacturing, and the software stack — requiring upfront investment in the hundreds of billions.

Google has TPUs, Meta has MTIA, but access for outside parties remains limited. Startups and mid-sized labs had no choice but to ride NVIDIA's ecosystem, leaving GPU procurement costs and CUDA lock-in as structural industry problems.

As of 2024, GPU dependency in the AI inference market was estimated at over 90%. The shift to custom silicon had become a race of who gets there first, not if.

02
By the Numbers

What Jalapeno Changes — and What It Doesn't

$0
Immediate API price change
6-12mo
Realistic timeline for price decline
×3
Major labs with custom chips (Google / Meta / OpenAI)

OpenAI and Broadcom launched 'Jalapeno,' a custom LLM inference ASIC, moving OpenAI's serving stack beyond pure GPU dependence (OpenAI official).

Jalapeno is an ASIC purpose-built for LLM inference workloads, designed to dramatically improve memory bandwidth utilization and power efficiency compared to general-purpose GPUs. Broadcom's advanced manufacturing processes are combined with an architecture defined by OpenAI to match the tensor operation patterns of its models.

03
Impact & Outlook

What This Means for Users

The immediate impact on API pricing is undetermined. Structural cost-reduction pressure is now in play, so prices should come down over a 6–12 month horizon, but for individual users the near-term difference will be negligible.

The bigger shift is competitive. With custom silicon, OpenAI's negotiating position with NVIDIA changes. Lower inference costs mean room to reach more users with AI at scale.

Jalapeno shatters the assumption that AI chip design is a privilege of Google and Meta. For other large labs, it is a strong signal to accelerate their own ASIC programs.


Sources: OpenAI official announcement · Broadcom press release · 2026.06.25

AI Navigate — Daily Update · 2026.06.25