CalliMaster: Mastering Page-level Chinese Calligraphy via Layout-guided Spatial Planning

arXiv cs.CV / 3/16/2026

📰 NewsIdeas & Deep AnalysisModels & Research

共有:

Key Points

CalliMaster presents a unified framework that decouples spatial planning from content synthesis to balance layout composition and brushwork in page-level Chinese calligraphy generation.
It uses a coarse-to-fine Text → Layout → Image pipeline inside a single Multimodal Diffusion Transformer, with an initial planning stage that predicts character bounding boxes to establish global spatial arrangement.
The predicted layout then serves as a geometric prompt for content synthesis, where flow-matching renders high-fidelity brushwork.
The disentangled design enables controllable semantic re-planning, allowing users to resize or reposition characters while the model harmonizes surrounding void space and brush momentum.
Beyond generation, the approach extends to artifact restoration and forensic analysis, supporting digital cultural heritage tasks.

Abstract

Page-level calligraphy synthesis requires balancing glyph precision with layout composition. Existing character models lack spatial context, while page-level methods often compromise brushwork detail. In this paper, we present \textbf{CalliMaster}, a unified framework for controllable generation and editing that resolves this conflict by decoupling spatial planning from content synthesis. Inspired by the human cognitive process of ``planning before writing'', we introduce a coarse-to-fine pipeline \textbf{(Text

\rightarrow

Layout

\rightarrow

Image)} to tackle the combinatorial complexity of page-scale synthesis. Operating within a single Multimodal Diffusion Transformer, a spatial planning stage first predicts character bounding boxes to establish the global spatial arrangement. This intermediate layout then serves as a geometric prompt for the content synthesis stage, where the same network utilizes flow-matching to render high-fidelity brushwork. Beyond achieving state-of-the-art generation quality, this disentanglement supports versatile downstream capabilities. By treating the layout as a modifiable constraint, CalliMaster enables controllable semantic re-planning: users can resize or reposition characters while the model automatically harmonizes the surrounding void space and brush momentum. Furthermore, we demonstrate the framework's extensibility to artifact restoration and forensic analysis, providing a comprehensive tool for digital cultural heritage.

Hey dev.to community – sharing my journey with Prompt Builder, Insta Posts, and practical SEO

Dev.to

How to Build Passive Income with AI in 2026: A Developer's Practical Guide

Dev.to

The Research That Doesn't Exist

Dev.to

Jeff Bezos reportedly wants $100 billion to buy and transform old manufacturing firms with AI

TechCrunch

Krish Naik: AI Learning Path For 2026- Data Science, Generative and Agentic AI Roadmap

Dev.to

CalliMaster: Mastering Page-level Chinese Calligraphy via Layout-guided Spatial Planning

Key Points

Abstract

Related Articles

Hey dev.to community – sharing my journey with Prompt Builder, Insta Posts, and practical SEO

How to Build Passive Income with AI in 2026: A Developer's Practical Guide

The Research That Doesn't Exist

Jeff Bezos reportedly wants $100 billion to buy and transform old manufacturing firms with AI

Krish Naik: AI Learning Path For 2026- Data Science, Generative and Agentic AI Roadmap

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer