filename : Zha25a.pdf
entry : inproceedings
conference : SIGGRAPH 2025, Vancouver, Canada, 10-14 August, 2025
pages : 138:1-138:11
year : 2025
month : August
title : High-Fidelity Novel View Synthesis via Splatting-Guided Diffusion
subtitle :
author : Xiang Zhang, Yang Zhang, Lukas Mehl, Markus Gross, Christopher Schroers
booktitle : Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers
ISSN/ISBN : 9798400715402
editor :
publisher : Association for Computing Machinery
publ.place : New York, NY, USA
volume :
issue :
language : English
keywords : Novel View Synthesis, Diffusion Model, Stereo Conversion
abstract : Despite recent advances in Novel View Synthesis (NVS), generating high-fidelity views from single or sparse observations remains challenging. Existing splatting-based approaches often produce distorted geometry due to splatting errors. While diffusion-based methods leverage rich 3D priors to achieve improved geometry, they often suffer from texture hallucination. In this paper, we introduce SplatDiff, a pixel-splatting-guided video diffusion model designed to synthesize high-fidelity novel views from a single image. Specifically, we propose an aligned synthesis strategy for precise control of target viewpoints and geometry-consistent view synthesis. To mitigate texture hallucination, we design a texture bridge module that enables high-fidelity texture generation through adaptive feature fusion. In this manner, SplatDiff leverages the strengths of splatting and diffusion for geometrically consistent, high-fidelity view synthesis. Extensive experiments verify the state-of-the-art performance of SplatDiff in single-view NVS. Additionally, without extra training, SplatDiff shows remarkable zero-shot performance across diverse tasks, including sparse-view NVS and stereo video conversion.
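note : The abstract mentions adaptive feature fusion between splatted and diffusion-generated features. The snippet below is a minimal, hypothetical sketch of such a gated fusion in PyTorch; it is not the paper's actual texture bridge module, and all names (AdaptiveFeatureFusion, splat_feat, diff_feat) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AdaptiveFeatureFusion(nn.Module):
    """Hypothetical gated fusion of splatted-view features and diffusion
    decoder features. Illustrates adaptive feature fusion in general, not
    the SplatDiff texture bridge as published."""

    def __init__(self, channels: int):
        super().__init__()
        # Predict a per-pixel blending weight from both feature maps.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, splat_feat: torch.Tensor, diff_feat: torch.Tensor) -> torch.Tensor:
        # alpha -> 1 keeps the splatted texture; alpha -> 0 trusts the diffusion features.
        alpha = self.gate(torch.cat([splat_feat, diff_feat], dim=1))
        return alpha * splat_feat + (1.0 - alpha) * diff_feat


if __name__ == "__main__":
    fuse = AdaptiveFeatureFusion(channels=64)
    splat = torch.randn(1, 64, 32, 32)   # features splatted from the source view
    diff = torch.randn(1, 64, 32, 32)    # features from the diffusion decoder
    print(fuse(splat, diff).shape)       # torch.Size([1, 64, 32, 32])
```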