MIPIC: Matryoshka Representation Learning via Self-Distilled Intra-Relational Alignment and Progressive Information Chaining

arXiv cs.CL / 4/28/2026

📰 News · Models & Research

Key Points

  • The paper introduces MIPIC, a unified training framework for learning Matryoshka Representation Learning (MRL) embeddings that remain coherent across embedding dimensions and model depths.
  • MIPIC uses Self-Distilled Intra-Relational Alignment (SIA) to enforce cross-dimension structural consistency by aligning token-level geometric and attention-driven relations between full and truncated representations via top-k CKA self-distillation (see the sketch after this list).
  • It also applies Progressive Information Chaining (PIC) to consolidate semantics across layers by gradually transferring mature task understanding from deeper layers into earlier ones (a sketch follows the abstract below).
  • Experiments on STS, NLI, and classification benchmarks, spanning models from TinyBERT to BGE-M3 and Qwen3, show that MIPIC produces strong Matryoshka representations, with the largest gains at extremely low embedding dimensions.
  • Overall, the work addresses the coordination challenge in MRL—how information is arranged across dimensionality and depth—by providing training strategies for both structural alignment and semantic transfer.
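
To make the SIA idea concrete, here is a minimal PyTorch sketch: build token-token relation matrices for the full and truncated views, keep each token's top-k strongest relations from the full (teacher) view, and penalize 1 - linear CKA between the two sets of relations. The names (`linear_cka`, `sia_loss`), the top-k interpretation, and the restriction to geometric (dot-product) relations rather than attention-driven ones are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def linear_cka(x: torch.Tensor, y: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Linear CKA between two feature matrices of shape (n, d1) and (n, d2)."""
    x = x - x.mean(dim=0, keepdim=True)  # center each feature column
    y = y - y.mean(dim=0, keepdim=True)
    cross = (y.T @ x).norm(p="fro") ** 2  # ||Y^T X||_F^2
    return cross / ((x.T @ x).norm(p="fro") * (y.T @ y).norm(p="fro") + eps)

def sia_loss(full_tokens: torch.Tensor, trunc_dim: int, k: int = 8) -> torch.Tensor:
    """Align token-level relations of a truncated view with the full view.

    full_tokens: (n_tokens, d) token embeddings; trunc_dim: prefix size of the
    nested sub-embedding; k: number of strongest relations kept per token
    (an assumed reading of the paper's "top-k"; requires k <= n_tokens).
    """
    teacher = full_tokens.detach()        # self-distillation: full view as teacher
    student = full_tokens[:, :trunc_dim]  # truncated (Matryoshka) view

    g_t = teacher @ teacher.T             # token-token relation matrices
    g_s = student @ student.T

    idx = g_t.topk(k, dim=-1).indices     # each token's top-k strongest relations
    rel_t = g_t.gather(-1, idx)           # teacher values at those positions
    rel_s = g_s.gather(-1, idx)           # student values at the same positions

    return 1.0 - linear_cka(rel_s, rel_t)  # higher CKA = more similar structure
```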

Abstract

Representation learning is fundamental to NLP, but building embeddings that work well at different computational budgets is challenging. Matryoshka Representation Learning (MRL) offers a flexible inference paradigm through nested embeddings; however, learning such structures requires explicit coordination of how information is arranged across embedding dimensionality and model depth. In this work, we propose MIPIC (Matryoshka Representation Learning via Self-Distilled Intra-Relational Alignment and Progressive Information Chaining), a unified training framework designed to produce structurally coherent and semantically compact Matryoshka representations. MIPIC promotes cross-dimensional structural consistency through Self-Distilled Intra-Relational Alignment (SIA), which aligns token-level geometric and attention-driven relations between full and truncated representations using top-k CKA self-distillation. Complementarily, it enables depth-wise semantic consolidation via Progressive Information Chaining (PIC), a scaffolded alignment strategy that incrementally transfers mature task semantics from deeper layers into earlier layers. Extensive experiments on STS, NLI, and classification benchmarks (spanning models from TinyBERT to BGE-M3 and Qwen3) demonstrate that MIPIC yields Matryoshka representations that are highly competitive across all capacities, with significant performance advantages observed in extremely low-dimensional settings.
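
The PIC component can be pictured as a chain of alignment losses that pull each layer toward its deeper neighbor. The PyTorch sketch below illustrates one way to scaffold this transfer; the function name `pic_loss`, the cosine alignment objective, and the linear schedule that activates more layer pairs as training progresses are assumptions for illustration, and the paper's actual scaffolding may differ.

```python
import torch
import torch.nn.functional as F

def pic_loss(layer_embs: list[torch.Tensor], step: int, total_steps: int) -> torch.Tensor:
    """Chain deeper-layer semantics into earlier layers, one pair at a time.

    layer_embs: per-layer sentence embeddings ordered shallow -> deep,
    each of shape (batch, d). step / total_steps tracks training progress.
    """
    n = len(layer_embs)
    # Number of adjacent layer pairs currently in the chain, growing over
    # training (assumed linear schedule).
    active = max(1, round((step / total_steps) * (n - 1)))

    loss = layer_embs[0].new_zeros(())
    # Start from the deepest pair and extend toward shallower layers.
    for i in range(n - 1, n - 1 - active, -1):
        teacher = layer_embs[i].detach()  # deeper layer: mature task semantics
        student = layer_embs[i - 1]       # shallower layer is pulled toward it
        loss = loss + (1.0 - F.cosine_similarity(student, teacher, dim=-1)).mean()
    return loss / active
```

Early in training only the deepest pair is aligned; the chain then extends toward the input, so shallow layers are never asked to match deep semantics before the intermediate layers have consolidated them.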