CLIPGaussian: Universal and Multimodal Style Transfer Based on Gaussian Splatting

Jagiellonian University
*equal contribution

Abstract

Gaussian Splatting (GS) has recently emerged as an efficient representation for rendering 3D scenes from 2D images and has been extended to images, videos, and dynamic 4D content. However, applying style transfer to GS-based representations, especially beyond simple color changes, remains challenging. In this work, we introduce CLIPGaussian, the first unified style transfer framework that supports text- and image-guided stylization across multiple modalities: 2D images, videos, 3D objects, and 4D scenes. Our method operates directly on Gaussian primitives and integrates into existing GS pipelines as a plug-in module, without requiring large generative models or retraining from scratch. The CLIPGaussian approach enables joint optimization of color and geometry in 3D and 4D settings and achieves temporal coherence in videos, while preserving the model size. We demonstrate superior style fidelity and consistency across all tasks, validating CLIPGaussian as a universal and efficient solution for multimodal style transfer.

Universal Style Transfer

CLIPGaussian is the first plug-in style transfer model for Gaussian Splatting, enabling image- and text-guided stylization across 2D, video, 3D, and 4D data without retraining the base model.

3D

4D

Video

Images

Interpolation

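CLIPGaussian does not perform densification or otherwise alter the number of Gaussian components. As a result, a stylized model contains the same number of Gaussians as the original, so the model size is unchanged. Moreover, because the Gaussian components of two stylized versions of the same base model correspond one-to-one, this property enables style interpolation by linearly interpolating the parameters of each Gaussian component.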
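The interpolation described above can be illustrated with a minimal sketch. It assumes two stylized models derived from the same base model, stored in the same order. The `Gaussian` class, its fields, and the function names are hypothetical and chosen only for illustration; they are not taken from the CLIPGaussian code. Renormalizing the interpolated rotation quaternion is likewise an assumption of this sketch, not something the text above states.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gaussian:
    # Hypothetical, simplified parameter set for one Gaussian component.
    position: List[float]  # 3D mean
    scale: List[float]     # per-axis scale
    rotation: List[float]  # quaternion (w, x, y, z)
    color: List[float]     # RGB values
    opacity: float


def lerp(a: List[float], b: List[float], t: float) -> List[float]:
    """Linearly interpolate between two equal-length vectors."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def normalize(q: List[float]) -> List[float]:
    """Rescale a quaternion to unit length (left unchanged if its norm is zero)."""
    norm = sum(v * v for v in q) ** 0.5
    return [v / norm for v in q] if norm > 0.0 else list(q)


def interpolate_gaussians(start: List[Gaussian], end: List[Gaussian], t: float) -> List[Gaussian]:
    """Blend two stylized models component by component, with t in [0, 1]."""
    # Interpolation relies on a one-to-one correspondence between components,
    # which holds only if neither model added or removed Gaussians.
    if len(start) != len(end):
        raise ValueError("both models must have the same number of Gaussians")
    blended = []
    for g0, g1 in zip(start, end):
        blended.append(Gaussian(
            position=lerp(g0.position, g1.position, t),
            scale=lerp(g0.scale, g1.scale, t),
            # Assumption: renormalize so the result is still a valid rotation.
            rotation=normalize(lerp(g0.rotation, g1.rotation, t)),
            color=lerp(g0.color, g1.color, t),
            opacity=(1.0 - t) * g0.opacity + t * g1.opacity,
        ))
    return blended


# Toy one-component "models" standing in for two stylizations of one object.
style_a = [Gaussian([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0], 1.0)]
style_b = [Gaussian([0.0, 2.0, 0.0], [1.0, 1.0, 1.0], [1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0], 0.5)]
halfway = interpolate_gaussians(style_a, style_b, 0.5)
```

Sweeping `t` from 0 to 1 and rendering each blended model would give a gradual transition between the two styles. The length check makes the dependence on an unchanged component count explicit.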

Interpolation example: start style "Black marble", end style "Green crystal".


BibTeX

@Article{howil2025clipgaussian,
      author={Kornel Howil and Joanna Waczyńska and Piotr Borycki and Tadeusz Dziarmaga and Marcin Mazur and Przemysław Spurek},
      title={CLIPGaussian: Universal and Multimodal Style Transfer Based on Gaussian Splatting},
      year={2025},
      eprint={2505.22854},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2505.22854}, 
}

Acknowledgements

The project “Effective rendering of 3D objects using Gaussian Splatting in an Augmented Reality environment” (FENG.02.02-IP.05-0114/23) is carried out within the First Team programme of the Foundation for Polish Science co-financed by the European Union under the European Funds for Smart Economy 2021-2027 (FENG).