RYS II - Repeated layers with Qwen3.5 27B and some hints at a 'Universal Language'
Reddit r/LocalLLaMA / 3/24/2026
💬 Opinion · Signals & Early Trends · Ideas & Deep Analysis · Models & Research

So, I've had my H100s grind for you all, and have some interesting new results AND fresh models! So, what did I find? Well, because my blog articles are too damn long (I know some of you are not reading the whole thing...), here is a TL;DR:

If you still didn't read the blog, well, I guess you can just try the models?
https://huggingface.co/dnhkng/RYS-Qwen3.5-27B-FP8-S
https://huggingface.co/dnhkng/RYS-Qwen3.5-27B-FP8-M
https://huggingface.co/dnhkng/RYS-Qwen3.5-27B-FP8-L
https://huggingface.co/dnhkng/RYS-Qwen3.5-27B-FP8-XL

Wen GGUF? When someone GGUFs them, I guess?

When you repeat layers, you benefit a lot from fine-tuning. I expect the first team to fine-tune RYS-Qwen3.5-27B-FP8-XL will have a new SOTA for that size range.

Lastly, I've been chatting with TurboDerp; hopefully we can get this into a new format where you can keep the duplicated layers as copies and not use more VRAM (except for the KV cache). Stay tuned!
Key Points
- The author reports findings from H100-based experiments suggesting that LLMs may form "universal language" latent representations in their mid-layer transformer blocks: Chinese and English representations of the same content become more similar to each other than representations of different content within a single language (a minimal probing sketch follows this list).
- They conclude that repeating blocks in the middle portion of the transformer stack performs best among the approaches they tried (see the layer-duplication sketch below).
- The post links several newly released model variants based on Qwen3.5 27B (FP8, sizes S through XL) on Hugging Face for others to test.
- The author expects that fine-tuning the largest repeated-layer variant (FP8-XL) could achieve new state-of-the-art results in its model-size range.
- They are also discussing a future format that keeps the duplicated layers as logical copies without materializing extra weights, so the only added VRAM cost is the KV cache (see the weight-sharing sketch below).
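To make the cross-lingual claim concrete, here is a minimal probing sketch, not the author's actual analysis: the model id (a small Qwen2.5 stand-in), the halfway-layer choice, and the mean-pooling are all illustrative assumptions. It compares mid-layer hidden states for an English sentence against its Chinese translation and against an unrelated English sentence:

```python
# Hypothetical probe for the "universal language" observation; assumptions:
# model choice, mid-layer index, and mean-pooling are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-0.5B"  # small stand-in; any causal LM works
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True)
model.eval()

def mid_layer_embedding(text: str, layer_frac: float = 0.5) -> torch.Tensor:
    """Mean-pool hidden states from a block halfway up the stack."""
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    # hidden_states[0] is the embedding output; pick a middle block.
    layer = int(len(out.hidden_states) * layer_frac)
    return out.hidden_states[layer].mean(dim=1).squeeze(0)

en = mid_layer_embedding("The cat sleeps on the warm windowsill.")
zh = mid_layer_embedding("猫睡在温暖的窗台上。")  # same meaning, in Chinese
en_other = mid_layer_embedding("Stock markets fell sharply on Monday.")

cos = torch.nn.functional.cosine_similarity
print("same content, EN vs ZH:   ", cos(en, zh, dim=0).item())
print("different content, EN vs EN:", cos(en, en_other, dim=0).item())
# If mid layers really are language-agnostic, the first similarity
# should exceed the second.
```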
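For the repeated-middle-layers result, a bare-bones version of the trick in plain `transformers` might look like the sketch below. This is only an illustration of the general technique: the small checkpoint is a stand-in, the "middle third, repeated once" slice is an arbitrary choice, and the actual RYS models were presumably built with proper merge tooling.

```python
# Hedged sketch of layer repetition (a "self-merge"); slice boundaries and
# checkpoint are illustrative assumptions, not the RYS recipe.
import copy
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B", torch_dtype=torch.bfloat16
)
blocks = model.model.layers          # nn.ModuleList of decoder blocks
n = len(blocks)
start, end = n // 3, 2 * n // 3      # repeat the middle third once

expanded = (
    list(blocks[:end])
    + [copy.deepcopy(b) for b in blocks[start:end]]  # materialized copies
    + list(blocks[end:])
)
model.model.layers = torch.nn.ModuleList(expanded)
model.config.num_hidden_layers = len(expanded)

# Each block writes its KV cache under its layer index, so re-number the
# attention modules after expansion to keep caching consistent.
for i, block in enumerate(model.model.layers):
    block.self_attn.layer_idx = i
```

Because the copies are real tensors here, the resulting model is immediately fine-tunable, which is the point of the post's SOTA prediction: the repeated blocks start as exact duplicates but can diverge during training.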
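And the VRAM point from the last bullet: if the repeated span references the same module objects instead of deep copies, the weights live in memory once however often the block is traversed, and only the KV cache grows with the extra depth. A sketch of the idea, with the caveat that motivates a new runtime format in the first place:

```python
# Sketch of weight sharing across repeated blocks; same illustrative
# checkpoint and slice boundaries as above.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B", torch_dtype=torch.bfloat16
)
blocks = model.model.layers
n = len(blocks)
start, end = n // 3, 2 * n // 3

# Re-use the SAME module objects for the repeated span: no deepcopy, so the
# parameters exist once in VRAM no matter how often the block appears.
model.model.layers = torch.nn.ModuleList(
    list(blocks[:end]) + list(blocks[start:end]) + list(blocks[end:])
)
model.config.num_hidden_layers = len(model.model.layers)

unique = sum(p.numel() for _, p in model.named_parameters())
total = sum(p.numel() for _, p in model.named_parameters(remove_duplicate=False))
print(f"unique params: {unique:,} vs. referenced: {total:,}")

# Caveat: HF's KV cache keys on each block's single layer_idx, so shared
# blocks clash when use_cache=True. A format that tracks repeats explicitly,
# giving each pass its own cache slot, is the kind of change the post says
# is being discussed with TurboDerp (the ExLlama author).
```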
Related Articles
- The Moonwell Oracle Exploit: How AI-Assisted 'Vibe Coding' Turned cbETH Into a $1.12 Token and Cost $1.78M (Dev.to)
- How CVE-2026-25253 exposed every OpenClaw user to RCE — and how to fix it in one command (Dev.to)
- Day 10: An AI Agent's Revenue Report — $29, 25 Products, 160 Tweets (Dev.to)
- Does Synthetic Data Generation of LLMs Help Clinical Text Mining? (Dev.to)
- What CVE-2026-25253 Taught Me About Building Safe AI Assistants (Dev.to)