Zyphra Releases ZAYA1-8B: A Reasoning MoE Trained on AMD Hardware That Punches Far Above Its Weight Class

MarkTechPost / 5/7/2026


Key Points

  • Zyphra has released ZAYA1-8B, a reasoning Mixture of Experts (MoE) model with only 760M active parameters that performs far better than many larger open-weight models on math and coding benchmarks.
  • The model reportedly approaches DeepSeek-V3.2 levels and surpasses Claude 4.5 Sonnet on the HMMT’25 benchmark using a novel Markovian RSA test-time compute method.
  • ZAYA1-8B was trained end-to-end on AMD Instinct MI300 hardware, demonstrating the level of results now achievable on AMD platforms.
  • The release is provided under the Apache 2.0 license, enabling broader adoption and experimentation with a “high intelligence density” small-model weight class.

Zyphra has released ZAYA1-8B, a reasoning Mixture of Experts model with only 760M active parameters that outperforms open-weight models many times its size on math and coding benchmarks. Using its novel Markovian RSA test-time compute method, it closes in on DeepSeek-V3.2 and surpasses Claude 4.5 Sonnet on HMMT'25. Trained end-to-end on AMD Instinct MI300 hardware and released under Apache 2.0, it sets a new standard for intelligence density in the small language model weight class.
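
The headline figure is the gap between total and active parameters. In a Mixture of Experts layer, a router sends each token to only a few expert networks, so the parameters actually exercised per token are a small fraction of the total. The sketch below is a generic top-k routing layer in PyTorch, not Zyphra's architecture; the dimensions, expert count, and top_k value are arbitrary and chosen only to illustrate why a model with billions of total parameters can run with far fewer active parameters per token.

```python
# Generic top-k MoE routing sketch (illustrative only, not ZAYA1's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                      # x: (tokens, d_model)
        scores = self.router(x)                # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # mix only the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e       # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

layer = TinyMoE()
total = sum(p.numel() for p in layer.parameters())
# Per token, only the router plus top_k experts are used.
active = sum(p.numel() for p in layer.router.parameters()) + \
         2 * sum(p.numel() for p in layer.experts[0].parameters())
print(f"total params: {total:,}, active per token: {active:,}")  # active << total
```

In the same spirit, ZAYA1-8B's roughly 8B total parameters translate into about 760M used per token, which is what makes small-active-parameter MoE models cheap to serve relative to their benchmark performance.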
