filename     : Pra22a.pdf
entry        : article
conference   : Eurographics 2022
pages        : 13
year         : 2022
month        : April
title        : Shape Transformers: Topology-Independent 3D Shape Models Using Transformers
subtitle     :
author       : Prashanth Chandran, Gaspard Zoss, Markus Gross, Paulo Gotardo, Derek Bradley
booktitle    : Computer Graphics Forum
ISSN/ISBN    : 
editor       : Computer Graphics Forum, The Eurographics Association and John Wiley & Sons Ltd. Published by John Wiley & Sons Ltd
publisher    : Computer Graphics Forum
publ.place   :
volume       : 41
issue        : 2
language     : English
keywords     : Shape models, Digital humans, Transformers, Topology, Deep learning
abstract     : Parametric 3D shape models are heavily utilized in computer graphics and vision applications to provide priors on the observed
variability of an object's geometry (e.g., for faces). Original models were linear and operated on the entire shape at once. They
were later enhanced to provide localized control on different shape parts separately. In deep shape models, nonlinearity was
introduced via a sequence of fully-connected layers and activation functions, and locality was introduced in recent models that
use mesh convolution networks. As common limitations, these models often dictate, in one way or another, the allowed extent of
spatial correlations and also require that a fixed mesh topology be specified ahead of time. To overcome these limitations, we
present Shape Transformers, a new nonlinear parametric 3D shape model based on transformer architectures. A key benefit of
this new model comes from using the transformer's self-attention mechanism to automatically learn nonlinear spatial
correlations for a class of 3D shapes. This is in contrast to global models that correlate everything and local models that dictate the
correlation extent. Our transformer 3D shape autoencoder is a better alternative to mesh convolution models, which require
specially crafted convolution and down/up-sampling operators that can be difficult to design. Our model is also topologically
independent: it can be trained once and then evaluated on any mesh topology, unlike most previous methods. We demonstrate
the application of our model to different datasets, including 3D faces, 3D hand shapes and full human bodies. Our experiments
demonstrate the strong potential of our Shape Transformer model in several applications in computer graphics and vision.