GeRM: A Generative Rendering Model From Physically Realistic to Photorealistic
arXiv cs.CV / 4/13/2026
Key Points
- The paper identifies a “P2P gap” between physically-based rendering (PBR) and photorealistic rendering (PRR): even physically correct PBR output falls short of true photorealism when the underlying geometry and material assets are imperfect.
- It proposes GeRM, the first multi-modal generative rendering model to unify PBR and PRR, modeling the PBR-to-PRR transition as a distribution transfer learned via a distribution transfer vector field (DTV Field).
- GeRM uses physical representations (G-buffers) together with text prompts, along with a progressive incremental injection strategy, to generate controllable photorealistic images while navigating the continuum between fidelity and perceptual realism.
- The approach builds an expert-guided paired dataset, P2P-50K, using a multi-agent VLM framework to create transfer pairs that supervise learning of the vector field.
- A multi-condition ControlNet is introduced to learn and apply the DTV Field, progressively transforming PBR images into PRR outputs guided by G-buffers, prompts, and region-focused cues.
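The distribution-transfer idea above can be illustrated with a toy, flow-matching-style sketch: if intermediate states linearly interpolate between a PBR sample and its paired PRR target, the ideal transfer velocity is constant, and integrating it with Euler steps carries the source onto the target. All names here (`dtv_field`, `transfer`) are illustrative assumptions for intuition only, not the paper's actual model or API.

```python
import numpy as np

def dtv_field(x_t, t, x_src, x_tgt):
    """Toy distribution-transfer velocity.

    Under the linear interpolation x_t = (1 - t) * x_src + t * x_tgt,
    the velocity dx/dt is constant: x_tgt - x_src. In GeRM this field
    would instead be predicted by a network conditioned on G-buffers
    and text prompts; here it is given in closed form for illustration.
    """
    return x_tgt - x_src

def transfer(x_src, x_tgt, steps=10):
    """Euler-integrate the toy field from the PBR sample toward PRR."""
    x = x_src.astype(float).copy()
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        x = x + dt * dtv_field(x, t, x_src, x_tgt)
    return x

# Example: a 3-channel "pixel" transferred from source to target.
pbr_pixel = np.zeros(3)   # stand-in for a PBR render value
prr_pixel = np.ones(3)    # stand-in for the paired PRR value
result = transfer(pbr_pixel, prr_pixel)
```

Because the toy velocity is constant, Euler integration recovers the target exactly; the point of the sketch is only the mechanism, namely that a learned vector field moves samples from the PBR distribution to the PRR distribution over pseudo-time.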