Separable Expert Architecture: Toward Privacy-Preserving LLM Personalization via Composable Adapters and Deletable User Proxies

arXiv cs.AI / 4/25/2026


Key Points

  • The paper argues that traditional LLM personalization mixes user data into shared weights, making individual data removal effectively infeasible without retraining.
  • It proposes a three-layer “Separable Expert Architecture” that uses a static base model, composable domain-expert LoRA adapters, and per-user proxy artifacts so that deleting a user’s proxy deterministically unlearns that user.
  • Experiments on Phi-3.5-mini and Llama-3.1-8B show per-user differentiated outputs driven by personal data while keeping strong isolation between users, with baseline recovery after proxy removal.
  • The authors claim the design mitigates privacy attacks such as model inversion, membership inference, and training-data extraction because personal information never enters the shared weights.
  • The approach reframes “machine unlearning” as a deterministic deletion operation and is positioned as compatible with DP-SGD to improve privacy-preserving shared model training.
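The separation described in these points can be illustrated with a minimal sketch. Everything here is hypothetical scaffolding (the class and method names are my own, and real adapters would be LoRA weight deltas on an actual LLM, not Python callables), but it shows the core property: shared components are user-agnostic, and deleting a user's proxy deterministically removes that user's influence.

```python
class SeparableModel:
    """Toy model of the three-layer Separable Expert Architecture.

    Layer 1: a static base model (shared, never modified).
    Layer 2: composable domain-expert adapters (shared, user-agnostic).
    Layer 3: per-user proxy artifacts (private, individually deletable).
    """

    def __init__(self, base_behavior):
        self.base = base_behavior       # static shared weights
        self.domain_adapters = []       # LoRA-style behavioral deltas, no user data
        self.user_proxies = {}          # per-user artifacts, outside shared weights

    def add_adapter(self, adapter):
        self.domain_adapters.append(adapter)

    def set_proxy(self, user_id, proxy):
        self.user_proxies[user_id] = proxy

    def delete_user(self, user_id):
        # "Unlearning" is a deterministic deletion: once the proxy is gone,
        # no trace of the user remains anywhere in the system.
        self.user_proxies.pop(user_id, None)

    def respond(self, user_id, prompt):
        out = self.base(prompt)
        for adapter in self.domain_adapters:
            out = adapter(out)
        proxy = self.user_proxies.get(user_id)
        return proxy(out) if proxy else out
```

Because the base model and domain adapters never see personal data, any user's output after `delete_user` is identical to that of a user who never had a proxy, which is exactly the baseline-recovery property the paper evaluates.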

Abstract

Current model training approaches incorporate user information directly into shared weights, making individual data removal computationally infeasible without retraining. This paper presents a three-layer architecture that decouples personal data from shared weights by combining a static base model, composable domain-expert LoRA adapters that shape behavior without imparting user data, and per-user proxy artifacts whose deletion constitutes deterministic unlearning. Evaluation on Phi-3.5-mini and Llama-3.1-8B confirms per-user differentiation in which personal data influences outputs while remaining isolated, verified by a return to baseline after proxy removal (KL divergence of approximately 0.21 nats, 82-89% verification pass rate) and near-zero cross-user contamination. Because user-specific information never enters shared weights, the architecture mitigates model inversion, membership inference, and training-data extraction against shared model components by construction. The approach converts machine unlearning from an intractable weight-editing problem into a deterministic deletion operation that preserves personalization alongside privacy-enhancing guarantees and is compatible with differentially private stochastic gradient descent (DP-SGD) for privacy-preserving shared model improvement.
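The baseline-recovery check reported above rests on the KL divergence between the model's output distributions before personalization and after proxy deletion. A minimal sketch of that computation for discrete distributions (the function name and example distributions are mine, not the paper's):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions given as probability lists.

    A value near zero means the two distributions are nearly identical,
    i.e. the post-deletion model has returned to the pre-personalization baseline.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Illustrative check: a distribution compared with itself gives 0 nats,
# while a personalized (shifted) distribution gives a positive divergence.
baseline = [0.5, 0.3, 0.2]
personalized = [0.7, 0.2, 0.1]
```

In the paper's setting, `p` and `q` would be next-token distributions averaged over an evaluation set, and recovery is declared when the divergence from baseline is small and the verification suite passes.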