Persistent Story World Simulation with Continuous Character Customization

arXiv cs.CV / 3/18/2026

Key Points

  • Introduces EverTale, a story world simulator designed for continuous character customization in visual storytelling.
  • Proposes an All-in-One-World Character Integrator that uses a unified LoRA module to enable continuous character adaptation without per-character optimization modules.
  • Implements a Character Quality Gate via MLLM-as-Judge that uses chain-of-thought reasoning to check the fidelity of each character adaptation and to decide whether the current character needs additional training before the model moves on to the next one.
  • Presents a Character-Aware Region-Focus Sampling strategy to address identity degradation and layout conflicts by harmonizing local character details with global scene context.
  • Reports experimental results showing superior performance against a range of compared methods on both single- and multi-character story visualization. The authors state that code will be made available.

Abstract

Story visualization has gained increasing attention in computer vision. However, current methods often fail to achieve a synergy between accurate character customization, semantic alignment, and continuous integration of new identities. To tackle this challenge, we present EverTale, a story world simulator for continuous story character customization. We first propose an All-in-One-World Character Integrator to achieve continuous character adaptation within a unified LoRA module, eliminating the need for the per-character optimization modules of previous methods. Then, we incorporate a Character Quality Gate via MLLM-as-Judge to ensure the fidelity of each character adaptation process through chain-of-thought reasoning, determining whether the model can proceed to the next character or requires additional training on the current one. We also introduce a Character-Aware Region-Focus Sampling strategy to address the identity degradation and layout conflicts in existing multi-character visual storytelling; it yields natural multi-character generation more efficiently by harmonizing local character-specific details with global scene context. Experimental results show that EverTale achieves superior performance against a wide range of compared methods on both single- and multi-character story visualization. Code will be made available.
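
The control flow implied by the Character Quality Gate (train on the current character, ask a judge, then either proceed or train further) can be sketched as a simple loop. This is an illustrative sketch only, not the authors' implementation: `integrate_characters`, `train_round`, `judge_passes`, the `Adapter` stand-in, and the `max_rounds` cap are all hypothetical names and assumptions that do not come from the abstract, and the toy functions replace the real LoRA training and MLLM judging so that the example runs without any model.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for one shared adapter: character name -> training rounds seen.
Adapter = Dict[str, int]


def integrate_characters(
    characters: List[str],
    adapter: Adapter,
    train_round: Callable[[Adapter, str], None],
    judge_passes: Callable[[Adapter, str], bool],
    max_rounds: int = 5,  # assumed safety cap; the abstract does not mention one
) -> Dict[str, bool]:
    """Adapt a single shared adapter to each character in turn, gated by a judge."""
    accepted: Dict[str, bool] = {}
    for name in characters:
        passed = False
        for _ in range(max_rounds):
            train_round(adapter, name)       # additional training on the current character
            if judge_passes(adapter, name):  # gate: is the adaptation faithful enough?
                passed = True
                break                        # proceed to the next character
        accepted[name] = passed
    return accepted


# Toy stand-ins so the sketch runs on its own (assumptions, not real training or judging).
REQUIRED_ROUNDS = {"alice": 1, "bob": 3, "carol": 9}


def toy_train_round(adapter: Adapter, name: str) -> None:
    """Pretend to train by counting one more round for this character."""
    adapter[name] = adapter.get(name, 0) + 1


def toy_judge(adapter: Adapter, name: str) -> bool:
    """Pretend to judge: pass once the character has had enough rounds."""
    return adapter.get(name, 0) >= REQUIRED_ROUNDS[name]


if __name__ == "__main__":
    shared: Adapter = {}
    result = integrate_characters(list(REQUIRED_ROUNDS), shared, toy_train_round, toy_judge)
    print(result)
    print(shared)
```

In this toy run, characters that the judge accepts quickly consume few training rounds, while a character that never passes stops at the assumed cap and is reported as not accepted. How the actual system handles a character that keeps failing the gate is not stated in the abstract.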