REZE: Representation Regularization for Domain-adaptive Text Embedding Pre-finetuning
arXiv cs.CL / 4/21/2026
📰 News · Ideas & Deep Analysis · Models & Research
Key Points
- The paper argues that contrastive pre-finetuning (PFT) on heterogeneous, scattered domain tasks can inject task-induced bias that causes uncontrolled representation shifts and degrades embedding performance.
- It introduces REZE, a representation regularization method that constrains representation shift during embedding pre-finetuning by analyzing anchor-positive pair relations in an eigenspace.
- REZE measures task-wise dispersion per eigencomponent to identify task-variant directions, then applies adaptive soft-shrinkage to suppress task-specific noise while preserving task-invariant semantic structure (see the sketch after this list).
- Experiments across multiple embedding backbones and specialized benchmarks show REZE generally outperforms standard PFT and isotropy-based post-hoc regularization, and maintains stability where existing PFT variants may collapse.
- Additional embedding-space analyses indicate that REZE produces controlled shifts that align with the original embedding manifold, supporting the idea that representation-shift control is crucial for robust domain-adaptive embedding pre-finetuning.
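
The key points only name the mechanism, so here is a minimal NumPy sketch of what an eigenspace dispersion-plus-shrinkage step could look like. Everything in it is an assumption inferred from the summary, not the paper's actual formulation: the function name `reze_soft_shrinkage`, the use of the anchor-positive shift covariance as the eigenspace, the variance-of-per-task-means dispersion statistic, and the `lam_scale` hyperparameter are all illustrative.

```python
import numpy as np

def reze_soft_shrinkage(anchor, positive, task_ids, lam_scale=1.0):
    """Hypothetical REZE-style step: damp task-variant eigendirections
    in anchor-positive shift vectors.

    anchor, positive: (n, d) embedding matrices for n pairs.
    task_ids:         (n,) integer task label per pair.
    Returns per-pair shift vectors, regularized and mapped back to
    embedding space.
    """
    shift = positive - anchor                # (n, d) pair relation vectors
    shift = shift - shift.mean(axis=0)       # center before eigenanalysis

    # Eigendecomposition of the shift covariance gives an eigenspace in
    # which each component can be inspected independently.
    cov = shift.T @ shift / len(shift)       # (d, d)
    eigvals, eigvecs = np.linalg.eigh(cov)
    proj = shift @ eigvecs                   # (n, d) eigenspace coordinates

    # Task-wise dispersion per eigencomponent: variance of the per-task
    # mean coordinates. Components where tasks disagree strongly are
    # treated as task-variant.
    task_means = np.stack([proj[task_ids == t].mean(axis=0)
                           for t in np.unique(task_ids)])  # (T, d)
    dispersion = task_means.var(axis=0)      # (d,)

    # Adaptive soft-shrinkage: the threshold grows with dispersion, so
    # task-variant components are damped while task-invariant ones pass
    # through largely unchanged.
    thresh = lam_scale * dispersion
    proj = np.sign(proj) * np.maximum(np.abs(proj) - thresh, 0.0)

    return proj @ eigvecs.T                  # map back to embedding space
```

In this sketch, the shrunk shifts would stand in for the raw anchor-positive differences during pre-finetuning, keeping updates aligned with task-invariant eigendirections; how REZE actually couples the shrinkage to the contrastive objective is not specified in the summary.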