This release builds on the efficiency frontier we began exploring with the recently released 1-bit Bonsai models. The 1-bit family showed that extreme compression could still produce commercially useful language models. Ternary Bonsai targets a different point on that curve: a modest increase in size for a meaningful gain in performance. The models are available in three sizes: 8B, 4B, and 1.7B parameters. By using ternary weights {-1, 0, +1}, these models achieve a memory footprint approximately 9x smaller than standard 16-bit models while outperforming most peers in their respective parameter classes on standard benchmarks. Blog post: https://prismml.com/news/ternary-bonsai Models: https://huggingface.co/collections/prism-ml/ternary-bonsai
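The "1.58 bits" in the title comes from log2(3) ≈ 1.585, the information content of a three-valued weight. Prism ML has not published its quantization recipe, but a common way to produce ternary weights (used by BitNet b1.58, for example) is absmean quantization: scale each weight matrix by its mean absolute value, then round to the nearest of {-1, 0, +1}. A minimal sketch, assuming that approach:

```python
import numpy as np

def ternarize_absmean(w, eps=1e-8):
    # Scale by the mean absolute value, then round each weight to the
    # nearest value in {-1, 0, +1}. This mirrors BitNet b1.58-style
    # absmean quantization -- an assumption; the post does not state
    # which recipe Ternary Bonsai actually uses.
    scale = np.mean(np.abs(w)) + eps
    q = np.clip(np.round(w / scale), -1, 1).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)
q, scale = ternarize_absmean(w)
assert set(np.unique(q)).issubset({-1, 0, 1})

# Dequantized approximation used at inference time:
w_hat = q.astype(np.float32) * scale
```

Note that 16 bits / 1.58 bits would suggest roughly a 10x reduction in theory; the ~9x figure quoted here is plausible once practical packing (typically 2 bits per weight) and unquantized components such as embeddings are accounted for.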
Hope these ternary Bonsai models come with fewer hallucinations. Waiting for 20-40B models (like Qwen3.5-27B, Qwen3.5-35B-A3B, Gemma-4-31B, Gemma-4-26B-A4B, etc.) from them soon! That would be the start of a game change for big/large models.
Ternary Bonsai: Top intelligence at 1.58 bits
Reddit r/LocalLLaMA / 4/17/2026
Key Points
- Prism ML has announced “Ternary Bonsai,” a new family of 1.58-bit language models aimed at maintaining high accuracy under strict memory limits.
- The approach builds on earlier 1-bit Bonsai models, targeting a different efficiency point with a modest size increase to deliver meaningful performance gains.
- Models are released in three parameter sizes (8B, 4B, and 1.7B) using ternary weights {-1, 0, +1}, achieving about a 9x smaller memory footprint than standard 16-bit models.
- In benchmark comparisons, Ternary Bonsai is reported to outperform most peers within the corresponding parameter classes.
- The Hugging Face collection provides FP16 safetensors for Ternary Bonsai-8B for compatibility; a packed MLX 2-bit format is currently the only packed option, with more backend formats planned.
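Since each ternary weight only needs 2 bits once stored in a packed format, four weights fit in one byte. The sketch below shows one plausible packing scheme (the actual MLX 2-bit layout is not described in the post, so this byte layout is an assumption for illustration):

```python
import numpy as np

def pack_ternary_2bit(q):
    # Map {-1, 0, +1} -> codes {0, 1, 2}, then pack four 2-bit codes
    # per byte, lowest-order code first. Illustrative layout only;
    # the real MLX packed format may differ.
    codes = (q.astype(np.int8) + 1).astype(np.uint8).ravel()
    assert codes.size % 4 == 0, "pad to a multiple of 4 weights first"
    codes = codes.reshape(-1, 4)
    return (codes[:, 0]
            | (codes[:, 1] << 2)
            | (codes[:, 2] << 4)
            | (codes[:, 3] << 6)).astype(np.uint8)

def unpack_ternary_2bit(packed):
    # Reverse the packing: extract each 2-bit code and map back to
    # {-1, 0, +1}.
    out = np.empty((packed.size, 4), dtype=np.int8)
    for i in range(4):
        out[:, i] = ((packed >> (2 * i)) & 0b11).astype(np.int8) - 1
    return out.ravel()

q = np.array([-1, 0, 1, 1, 0, -1, -1, 0], dtype=np.int8)
packed = pack_ternary_2bit(q)
assert packed.nbytes == 2  # 8 weights -> 2 bytes, vs 16 bytes in FP16
assert np.array_equal(unpack_ternary_2bit(packed), q)
```

At 2 bits per weight this gives an 8x reduction over FP16 for the packed tensors themselves, consistent with the ~9x overall figure quoted above once the whole model is considered.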

