This release builds on the efficiency frontier we began exploring with the recently released 1-bit Bonsai models. The 1-bit family showed that extreme compression could still produce commercially useful language models. Ternary Bonsai targets a different point on that curve: a modest increase in size for a meaningful gain in performance. The models are available in three sizes: 8B, 4B, and 1.7B parameters. By using ternary weights {-1, 0, +1}, these models achieve a memory footprint approximately 9x smaller than standard 16-bit models while outperforming most peers in their respective parameter classes on standard benchmarks. Blog post: https://prismml.com/news/ternary-bonsai Models: https://huggingface.co/collections/prism-ml/ternary-bonsai
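The "1.58 bits" in the title comes from log2(3) ≈ 1.585, the information content of a three-valued weight. Prism ML has not published its quantization recipe, but a common way to produce ternary weights (used by BitNet b1.58, for example) is absmean quantization: scale each weight matrix by its mean absolute value, then round to the nearest of {-1, 0, +1}. A minimal sketch, assuming that approach:

```python
import numpy as np

def ternarize_absmean(w, eps=1e-8):
    # Scale by the mean absolute value, then round each weight to the
    # nearest value in {-1, 0, +1}. This mirrors BitNet b1.58-style
    # absmean quantization -- an assumption; the post does not state
    # which recipe Ternary Bonsai actually uses.
    scale = np.mean(np.abs(w)) + eps
    q = np.clip(np.round(w / scale), -1, 1).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)
q, scale = ternarize_absmean(w)
assert set(np.unique(q)).issubset({-1, 0, 1})

# Dequantized approximation used at inference time:
w_hat = q.astype(np.float32) * scale
```

Note that 16 bits / 1.58 bits would suggest roughly a 10x reduction in theory; the ~9x figure quoted here is plausible once practical packing (typically 2 bits per weight) and unquantized components such as embeddings are accounted for.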
Hope these ternary Bonsai models come with fewer hallucinations. Waiting for 20-40B models (like Qwen3.5-27B, Qwen3.5-35B-A3B, Gemma-4-31B, Gemma-4-26B-A4B, etc.) from them soon! That would be the start of a game change for big/large models.
Ternary Bonsai: Top intelligence at 1.58 bits
Reddit r/LocalLLaMA / 4/17/2026
Key Points
- Prism ML has announced “Ternary Bonsai,” a new family of 1.58-bit language models aimed at maintaining high accuracy under strict memory limits.
- The approach builds on earlier 1-bit Bonsai models, targeting a different efficiency point with a modest size increase to deliver meaningful performance gains.
- Models are released in three parameter sizes (8B, 4B, and 1.7B) using ternary weights {-1, 0, +1}, achieving about a 9x smaller memory footprint than standard 16-bit models.
- In benchmark comparisons, Ternary Bonsai is reported to outperform most peers within the corresponding parameter classes.
- The Hugging Face collection provides FP16 safetensors for Ternary Bonsai-8B for compatibility; a packed MLX 2-bit format is currently the only packed option, with more backend formats planned.
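Since each ternary weight only needs 2 bits once stored in a packed format, four weights fit in one byte. The sketch below shows one plausible packing scheme (the actual MLX 2-bit layout is not described in the post, so this byte layout is an assumption for illustration):

```python
import numpy as np

def pack_ternary_2bit(q):
    # Map {-1, 0, +1} -> codes {0, 1, 2}, then pack four 2-bit codes
    # per byte, lowest-order code first. Illustrative layout only;
    # the real MLX packed format may differ.
    codes = (q.astype(np.int8) + 1).astype(np.uint8).ravel()
    assert codes.size % 4 == 0, "pad to a multiple of 4 weights first"
    codes = codes.reshape(-1, 4)
    return (codes[:, 0]
            | (codes[:, 1] << 2)
            | (codes[:, 2] << 4)
            | (codes[:, 3] << 6)).astype(np.uint8)

def unpack_ternary_2bit(packed):
    # Reverse the packing: extract each 2-bit code and map back to
    # {-1, 0, +1}.
    out = np.empty((packed.size, 4), dtype=np.int8)
    for i in range(4):
        out[:, i] = ((packed >> (2 * i)) & 0b11).astype(np.int8) - 1
    return out.ravel()

q = np.array([-1, 0, 1, 1, 0, -1, -1, 0], dtype=np.int8)
packed = pack_ternary_2bit(q)
assert packed.nbytes == 2  # 8 weights -> 2 bytes, vs 16 bytes in FP16
assert np.array_equal(unpack_ternary_2bit(packed), q)
```

At 2 bits per weight this gives an 8x reduction over FP16 for the packed tensors themselves, consistent with the ~9x overall figure quoted above once the whole model is considered.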

