R-Cyclic Diffuser: Reductive and Cyclic Latent Diffusion for 3D Clothed Human Digitalization

Page view(s)

Checked on Aug 04, 2025

Please use this identifier to cite or link to this item: https://oar.a-star.edu.sg/communities-collections/articles/20433

Title:

R-Cyclic Diffuser: Reductive and Cyclic Latent Diffusion for 3D Clothed Human Digitalization

Journal Title:

IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024

DOI:

10.1109/CVPR52733.2024.00981

Publication URL:

https://ieeexplore.ieee.org/document/10656939

Authors:

Kennard Yanting Chan, Fayao Liu, Guosheng Lin, Chuan Sheng Foo, Weisi Lin

Keywords:

3D human reconstruction

Publication Date:

16 September 2024

Citation:

Kennard Yanting Chan, Fayao Liu, Guosheng Lin, Chuan Sheng Foo, Weisi Lin; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 10304-10313

Abstract:

Recently, the authors of Zero-1-to-3 demonstrated that a latent diffusion model, pretrained with Internet-scale data, can not only address the single-view 3D object reconstruction task but can even attain SOTA results in it. However, when applied to the task of single-view 3D clothed human reconstruction, Zero-1-to-3 (and related models) are unable to compete with the corresponding SOTA methods in this field despite being trained on clothed human data. In this work, we aim to tailor Zero-1-to-3’s approach to the single-view 3D clothed human reconstruction task in a much more principled and structured manner. To this end, we propose R-Cyclic Diffuser, a framework that adapts Zero-1-to-3’s novel approach to clothed human data by fusing it with a pixel-aligned implicit model. R-Cyclic Diffuser offers a total of three new contributions. The first and primary contribution is R-Cyclic Diffuser’s cyclical conditioning mechanism for novel view synthesis. This mechanism directly addresses the view inconsistency problem faced by Zero-1-to-3 and related models. Secondly, we further enhance this mechanism with two key features - Lateral Inversion Constraint and Cyclic Noise Selection. Both features are designed to regularize and restrict the randomness of outputs generated by a latent diffusion model. Thirdly, we show how SMPL-X body priors can be incorporated in a latent diffusion model such that novel views of clothed human bodies can be generated much more accurately. Our experiments show that R-Cyclic Diffuser is able to outperform current SOTA methods in single-view 3D clothed human reconstruction both qualitatively and quantitatively. Our code is made publicly available at https://github.com/kcyt/r-cyclic-diffuser.

License type:

Publisher Copyright

Funding Info:

This research / project is supported by the Agency for Science, Technology and Research (A*STAR) - MTC Programmatic Funds
Grant Reference no. : M23L7b0021

Description:

© 2024 IEEE.  Personal use of this material is permitted.  Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

URI:

https://oar.a-star.edu.sg/communities-collections/articles/20433

ISSN:

2575-7075

Collections:

Institute for Infocomm Research

Files uploaded:

https://openaccess.thecvf.com/content/CVPR2024/papers/Chan_R-Cyclic_Diffuser_Reductive_and_Cyclic_Latent_Diffusion_for_3D_Clothed_CVPR_2024_paper.pdf