**Abstract.** Deep generative models such as Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and diffusion models have demonstrated their efficacy in generating 2D images and 3D meshes. However, interpreting and controlling the learned latent space remains difficult, which severely limits the practical utility of these methods. Worse, it has been shown that fully disentangling the latent space with purely unsupervised methods is theoretically infeasible.
In this work, we introduce a novel method for latent space disentanglement on 3D meshes that achieves interpretability, control, and strong disentanglement. Our method comprises two components: a learned feature function that predicts 3D mesh features, and a generative model that predicts not only the desired meshes but also their features and feature gradients. We incorporate the feature gradients into the loss function to promote disentanglement. Experimental results demonstrate that our method is highly effective, achieving strong disentanglement without compromising reconstruction accuracy.
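The abstract does not specify the exact form of the feature-gradient loss term, but the idea can be sketched. Below is a minimal, hypothetical PyTorch illustration, assuming the penalty discourages off-diagonal entries of the Jacobian of the predicted features with respect to the latent code, so that each latent dimension controls exactly one feature; `generator`, `feature_fn`, and the one-feature-per-dimension pairing are assumptions for illustration, not the paper's actual formulation.

```python
import torch

def disentanglement_loss(generator, feature_fn, z):
    """Hypothetical feature-gradient penalty (not the paper's exact loss).

    generator:  maps latent codes z -> mesh vertices (placeholder name)
    feature_fn: maps mesh vertices -> K interpretable features (placeholder)
    z:          (batch, latent_dim) latent codes; assumed here that the
                first K latent dims are paired one-to-one with the K features
    """
    z = z.detach().requires_grad_(True)
    feats = feature_fn(generator(z))          # (batch, K)

    # Build the Jacobian d(feature)/d(latent), one feature row at a time.
    jac_rows = []
    for k in range(feats.shape[1]):
        (grad_k,) = torch.autograd.grad(
            feats[:, k].sum(), z, create_graph=True, retain_graph=True
        )
        jac_rows.append(grad_k)               # (batch, latent_dim)
    jac = torch.stack(jac_rows, dim=1)        # (batch, K, latent_dim)

    # Keep the intended diagonal (feature k driven by latent dim k);
    # penalize all cross-terms so other dims leave feature k unchanged.
    eye = torch.eye(feats.shape[1], z.shape[1], device=z.device)
    off_diag = jac * (1.0 - eye)
    return off_diag.pow(2).mean()

# Example usage (hypothetical shapes and weighting):
#   z = torch.randn(8, 16)   # 16-dim latent; first K dims paired to K features
#   loss = recon_loss + lam * disentanglement_loss(generator, feature_fn, z)
```

In practice such a Jacobian term would be weighted against the reconstruction loss, and `create_graph=True` keeps the penalty differentiable so the generator can be trained through it.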