Show HN: Duplicate 3 layers in a 24B LLM, logical deduction .22→.76. No training

Hacker News / 3/19/2026


Key Points

The author replicated David Ng's RYS method on consumer GPUs and found that duplicating contiguous blocks of 3-4 transformer layers creates discrete "reasoning circuits" that make the model run its reasoning pipeline twice without changing weights or training.
On 24B models, duplicating specific layers significantly improved benchmarks (BBH Logical Deduction from 0.22 to 0.76, GSM8K 0.48 to 0.64, MBPP 0.72 to 0.78) with no degradation observed.
Different duplication patterns yield different cognitive modes (double-pass boosts math, triple-pass boosts emotional reasoning, interleaved doubling yields a math-specialist mode); shifting the cut by one layer can negate or invert the effect.
The post provides tools to identify circuits in GGUF models and apply arbitrary layer routing, with the entire sweep/validation completed in about one evening.

I replicated David Ng's RYS method (https://dnhkng.github.io/posts/rys/) on consumer AMD GPUs (RX 7900 XT + RX 6950 XT) and found something I didn't expect.

Transformers appear to have discrete "reasoning circuits" — contiguous blocks of 3-4 layers that act as indivisible cognitive units. Duplicate the right block and the model runs its reasoning pipeline twice. No weights change. No training. The model just thinks longer.
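To make the routing idea concrete, here is a minimal sketch of the layer execution order when one block is duplicated. It assumes 0-based layer indices, that "duplicated once" means the block runs twice back to back, and a placeholder depth of 40 layers (check the model's metadata for the real count). `duplicate_block` is a hypothetical helper written for illustration, not a function from the repo.

```python
def duplicate_block(n_layers, start, end, repeats=2):
    """Return the layer execution order with layers start..end (inclusive)
    run `repeats` times in a row. No weights are copied or modified;
    only the order in which existing layers are visited changes."""
    if not (0 <= start <= end < n_layers):
        raise ValueError("block must lie inside the model")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    block = list(range(start, end + 1))
    before = list(range(start))
    after = list(range(end + 1, n_layers))
    return before + block * repeats + after


# Assumption: 40 layers is a placeholder depth, not a figure from the post.
order = duplicate_block(n_layers=40, start=12, end=14)
print(order[10:19])  # [10, 11, 12, 13, 14, 12, 13, 14, 15]
print(len(order))    # 43 forward steps over 40 distinct layers
```

The forward pass gets three layers longer, but the set of layers it draws on is unchanged, which is why the weights stay the same.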

The results on standard benchmarks (lm-evaluation-harness, n=50):

Devstral-24B, layers 12-14 duplicated once:
- BBH Logical Deduction: 0.22 → 0.76
- GSM8K (strict): 0.48 → 0.64
- MBPP (code gen): 0.72 → 0.78
- Nothing degraded

Qwen2.5-Coder-32B, layers 7-9 duplicated once:
- Reasoning probe: 76% → 94%
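For a rough sense of how much sampling noise n=50 carries, the sketch below computes each Devstral gain next to an approximate standard error. It assumes every score is a plain accuracy over 50 items and treats the baseline and duplicated runs as independent binomial samples, which is a simplification since both runs likely saw the same items. `binomial_se` is a helper introduced here for illustration.

```python
import math


def binomial_se(p, n):
    """Standard error of an accuracy p measured on n items."""
    return math.sqrt(p * (1 - p) / n)


N = 50  # sample size reported in the post

# (baseline, duplicated) scores as reported for Devstral-24B
results = {
    "BBH Logical Deduction": (0.22, 0.76),
    "GSM8K (strict)": (0.48, 0.64),
    "MBPP (code gen)": (0.72, 0.78),
}

for name, (base, dup) in results.items():
    delta = dup - base
    # Assumption: independent runs, so the variances add.
    se_delta = math.sqrt(binomial_se(base, N) ** 2 + binomial_se(dup, N) ** 2)
    print(f"{name}: +{delta:.2f}, approx. SE of the difference {se_delta:.3f}")
```

Comparing each gain with its standard error gives a quick read on which improvements stand well clear of noise at this sample size.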

The weird part: different duplication patterns create different cognitive "modes" from the same weights. Double-pass boosts math. Triple-pass boosts emotional reasoning. Interleaved doubling (13,13,14,14,15,15,16) creates a pure math specialist. Same model, same VRAM, different routing.
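The patterns above differ only in the order layers are visited. A minimal sketch, assuming 0-based indices and a placeholder depth of 40 layers: `block_passes` and `interleaved` are hypothetical helpers, not the repo's API. The post does not say which block the double- and triple-pass variants used, so 12-14 is reused here purely for illustration. The interleaved range 13-15 follows the sequence quoted in the post.

```python
def block_passes(n_layers, start, end, passes):
    """Run the whole block start..end (inclusive) `passes` times in a row."""
    block = list(range(start, end + 1))
    return list(range(start)) + block * passes + list(range(end + 1, n_layers))


def interleaved(n_layers, start, end, copies=2):
    """Run each layer in start..end `copies` times before moving to the next."""
    repeated = [i for i in range(start, end + 1) for _ in range(copies)]
    return list(range(start)) + repeated + list(range(end + 1, n_layers))


N_LAYERS = 40  # assumption: placeholder depth, not a figure from the post

double_pass = block_passes(N_LAYERS, 12, 14, passes=2)  # 12,13,14,12,13,14
triple_pass = block_passes(N_LAYERS, 12, 14, passes=3)  # block visited 3 times
math_mode = interleaved(N_LAYERS, 13, 15)               # 13,13,14,14,15,15

print(math_mode[13:20])  # [13, 13, 14, 14, 15, 15, 16]

# Every route draws on the same set of layers; only the visiting order differs.
assert set(double_pass) == set(triple_pass) == set(math_mode) == set(range(N_LAYERS))
```

Each route is just a list of layer indices, so switching "modes" means swapping one list for another while the loaded weights stay put.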

The circuit boundaries are sharp — shift by one layer and the effect disappears or inverts. Smaller models (24B) have tighter circuits (3 layers) than larger ones (Ng found 7 layers in 72B).
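Because a one-layer shift can erase the effect, a sweep has to test every block position rather than a coarse grid. The sketch below enumerates candidates of width 3 and 4 and the one-layer-shifted neighbors of a given block. The function names, the 40-layer depth, and the `score_fn` callback are all assumptions for illustration. In a real sweep `score_fn` would run a benchmark on the rerouted model, and the repo's tooling may work differently.

```python
def candidate_blocks(n_layers, widths=(3, 4)):
    """Yield every contiguous (start, end) block, inclusive, for the given widths."""
    for width in widths:
        for start in range(n_layers - width + 1):
            yield (start, start + width - 1)


def shifted_neighbors(block, n_layers):
    """Return the same-width blocks moved one layer down and one layer up."""
    start, end = block
    shifts = [(start - 1, end - 1), (start + 1, end + 1)]
    return [(s, e) for s, e in shifts if s >= 0 and e < n_layers]


def sweep(blocks, score_fn):
    """Score each block with a caller-supplied function and rank best first."""
    return sorted(((score_fn(b), b) for b in blocks), reverse=True)


blocks = list(candidate_blocks(40))     # 40 is a placeholder depth
print(len(blocks))                      # 75 candidates: 38 of width 3, 37 of width 4
print(shifted_neighbors((12, 14), 40))  # [(11, 13), (13, 15)]
```

Scoring a promising block together with its shifted neighbors is a cheap way to check whether a boundary is as sharp as described.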

Tools to find circuits in any GGUF model and apply arbitrary layer routing are in the repo. The whole thing — sweep, discovery, validation — took one evening.

Happy to answer questions.

Comments URL: https://news.ycombinator.com/item?id=47431671

Points: 112

# Comments: 37
