Domain Mixture Design via Log-Likelihood Differences for Aligning Language Models with a Target Model
arXiv cs.CL / 3/18/2026
Key Points
- The paper proposes aligning a base language model with a target model by designing the domain weights of the pretraining or continued-pretraining data, producing a fixed data-mixture recipe.
- Models are treated as points in a per-domain log-likelihood space; the domain mixture is chosen so that the training update direction aligns with the vector from the base model toward the target model, reducing their divergence (see the sketch after this list).
- Experiments with NanoGPT show the domain-weighting method reduces KL divergence to the target model compared with uniform weighting over the Pile.
- While knowledge distillation remains more effective when available, the method yields meaningful alignment and often brings downstream task performance closer to the target.
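
The geometric picture in the second point can be illustrated with a toy model. In the sketch below, each model is summarized by a vector of per-domain log-likelihoods, and training on a domain is assumed to move the base model along a fixed "domain direction" in that space; the least-squares mixture solve and the clip-and-renormalize step are illustrative assumptions, not the paper's exact algorithm.

```python
# Toy sketch: pick domain weights so the induced training update points
# toward the target model in per-domain log-likelihood space.
# All quantities below (domain directions, the least-squares formulation,
# the simplex projection) are hypothetical stand-ins for the paper's method.
import numpy as np

rng = np.random.default_rng(0)
n_domains = 5  # placeholder number of training domains

# Per-domain log-likelihood vectors for the base and target models (toy values).
base = rng.normal(size=n_domains)
target = base + rng.normal(scale=0.5, size=n_domains)

# Assumed domain directions: training on domain i mostly improves
# log-likelihood on domain i, with small spillover onto the others.
D = np.eye(n_domains) + 0.1 * rng.normal(size=(n_domains, n_domains))

# Desired update direction: the vector from the base toward the target.
delta = target - base

# Solve for mixture weights w such that D @ w approximates delta, then
# clip and renormalize as a crude stand-in for a proper projection onto
# the probability simplex (w >= 0, sum w = 1).
w, *_ = np.linalg.lstsq(D, delta, rcond=None)
w = np.clip(w, 0.0, None)
w = w / w.sum()

update = D @ w
cos = float(update @ delta / (np.linalg.norm(update) * np.linalg.norm(delta)))
print("domain weights:", np.round(w, 3))
print("cosine(update, target direction):", round(cos, 3))
```

With exact domain directions, the least-squares solve would recover an update pointing straight at the target; in practice the simplex constraint and noisy direction estimates make the alignment approximate, which is why the cosine similarity rather than exact recovery is the quantity of interest here.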