Premier: Personalized Preference Modulation with Learnable User Embedding in Text-to-Image Generation

arXiv cs.CV / 3/24/2026


Key Points

  • Premier is a new preference-modulation framework for personalized text-to-image generation that learns a dedicated embedding for each user’s preferences rather than relying on inferred prompts or latent codes from multimodal LLMs.
  • A preference adapter fuses the user embedding with the text prompt, and the fused preference embedding then modulates the generative process for finer-grained control.
  • To improve personalization quality and avoid users collapsing to similar representations, Premier introduces a dispersion loss that enforces separation among different users’ embeddings.
  • When user data are scarce, a new user is represented as a linear combination of preference embeddings learned during training, enabling personalization to generalize to unseen users.
  • Experiments (including text consistency, ViPer proxy metrics, and expert evaluations) report better preference alignment and overall performance than prior approaches under the same preference-history length.

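The dispersion loss in the key points above can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes a simple formulation in which the loss is the mean pairwise cosine similarity among user embeddings (lower means more dispersed, i.e., more distinct users). The function name and the choice of cosine similarity are illustrative assumptions.

```python
import numpy as np

def dispersion_loss(user_embeddings: np.ndarray) -> float:
    """Hypothetical dispersion loss: mean pairwise cosine similarity
    over all distinct user pairs. Minimizing it pushes users' embeddings
    apart, preventing them from collapsing to similar representations.

    user_embeddings: array of shape (num_users, dim), one row per user.
    """
    # L2-normalize each user embedding (row-wise).
    normed = user_embeddings / np.linalg.norm(user_embeddings, axis=1, keepdims=True)
    # Pairwise cosine similarities between all users.
    sim = normed @ normed.T
    # Exclude self-similarity on the diagonal, average the rest.
    n = sim.shape[0]
    off_diag = sim[~np.eye(n, dtype=bool)]
    return float(off_diag.mean())
```

Under this toy definition, orthogonal user embeddings score 0 (fully dispersed) and identical ones score 1 (fully collapsed), so gradient descent on the loss drives embeddings toward distinctness.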
Abstract

Text-to-image generation has advanced rapidly, yet it still struggles to capture nuanced user preferences. Existing approaches typically rely on multimodal large language models to infer user preferences, but the derived prompts or latent codes rarely reflect them faithfully, leading to suboptimal personalization. We present Premier, a novel preference modulation framework for personalized image generation. Premier represents each user's preference as a learnable embedding and introduces a preference adapter that fuses the user embedding with the text prompt. To enable accurate and fine-grained preference control, the fused preference embedding is further used to modulate the generative process. To enhance the distinctness of individual preferences and improve alignment between outputs and user-specific styles, we incorporate a dispersion loss that enforces separation among user embeddings. When user data are scarce, new users are represented as linear combinations of existing preference embeddings learned during training, enabling effective generalization. Experiments show that Premier outperforms prior methods under the same history length, achieving stronger preference alignment and superior performance on text consistency, ViPer proxy metrics, and expert evaluations.
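The abstract's handling of scarce user data, representing a new user as a linear combination of trained preference embeddings, can be sketched as follows. The paper does not specify how the combination weights are obtained; this sketch assumes, purely for illustration, that some affinity score per existing user is available (e.g., derived from the new user's short preference history) and that a softmax turns those scores into mixture weights. The names `embed_new_user`, `affinity_scores`, and `embedding_bank` are hypothetical.

```python
import numpy as np

def embed_new_user(affinity_scores: np.ndarray, embedding_bank: np.ndarray) -> np.ndarray:
    """Represent a new user as a convex combination of learned embeddings.

    affinity_scores: shape (num_users,), hypothetical affinity of the new
        user toward each existing trained user.
    embedding_bank: shape (num_users, dim), preference embeddings learned
        during training, one row per existing user.
    Returns an embedding of shape (dim,) for the new user.
    """
    # Numerically stable softmax over the affinity scores.
    weights = np.exp(affinity_scores - affinity_scores.max())
    weights /= weights.sum()
    # Weighted sum of the stored embeddings.
    return weights @ embedding_bank
```

Because the new user's embedding lives in the span of already-learned embeddings, no retraining of the generator is needed; only the mixture weights are user-specific, which is what makes the scheme attractive when preference history is short.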