Implicit and Parametric Avatar Pose and Shape Estimation From a Single Frontal Image of a Clothed Human

Fares Mallek, Carlos Vázquez, and Eric Paquette, Proceedings of the ACM SIGGRAPH Conference on Motion, Interaction and Games (MIG), pages 1-11, 2024.

Abstract

In this paper, we tackle the challenge of three-dimensional estimation of expressive, animatable, and textured human avatars from a single frontal image. Leveraging a Skinned Multi-Person Linear (SMPL) parametric body, we adjust the model parameters to faithfully reflect the shape and pose of the individual, relying on the mesh generated by a Pixel-aligned Implicit Function (PIFu) model. To robustly infer the SMPL parameters, we deploy a multi-step optimization process. First, we recover the position of 2D joints using an existing pose estimation tool. We then use the 3D PIFu mesh together with the 2D pose to estimate the 3D positions of the joints. Next, we fit the body's parametric model to the 3D joints through rigid alignment, optimizing for global translation and rotation. This step provides a robust initialization for further refinement of shape and pose parameters. The following step optimizes the pose and the first component of the SMPL shape parameters while imposing constraints to enhance model robustness. We then refine the SMPL model pose and shape parameters by adding two new registration loss terms to the optimization cost function: a point-to-surface distance and a Chamfer distance. Finally, we introduce a refinement process using a deformation vector field applied to the SMPL mesh, enabling more faithful modeling of tight to loose clothing geometry. A notable advantage of our approach is the ability to generate detailed avatars with fewer vertices than previous work, enhancing computational efficiency while maintaining high fidelity. To complete our model, we design a texture extraction and completion approach. Our entirely automated approach was evaluated on recognized benchmarks, X-Avatar and PeopleSnapshot, showcasing competitive performance against state-of-the-art methods.
This approach contributes to advancing 3D modeling techniques, particularly in the realms of interactive applications, animation, and video games. We made our code available to the community: https://github.com/ETS-BodyModeling/ImplicitParametricAvatar
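To illustrate one of the registration terms named in the abstract, the following is a minimal sketch of a symmetric Chamfer distance between two point sets. It is not taken from the repository: the function names `chamfer_distance` and `_nearest_sq_dist` are hypothetical, and the definition used here (sum of the two mean squared nearest-neighbour distances) is one common convention that may differ from the exact formulation in the paper.

```python
from typing import Sequence

Point = Sequence[float]


def _nearest_sq_dist(p: Point, cloud: Sequence[Point]) -> float:
    """Squared Euclidean distance from p to its nearest neighbour in cloud."""
    return min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in cloud)


def chamfer_distance(source: Sequence[Point], target: Sequence[Point]) -> float:
    """Symmetric Chamfer distance between two non-empty point sets.

    Assumed definition (illustrative only): the mean squared
    nearest-neighbour distance from source to target, plus the same
    quantity from target to source.
    """
    if not source or not target:
        raise ValueError("both point sets must be non-empty")
    src_to_tgt = sum(_nearest_sq_dist(p, target) for p in source) / len(source)
    tgt_to_src = sum(_nearest_sq_dist(q, source) for q in target) / len(target)
    return src_to_tgt + tgt_to_src


if __name__ == "__main__":
    # Toy example: two vertices of a body mesh versus two scan points.
    body_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    scan_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
    print(chamfer_distance(body_vertices, scan_points))
```

This brute-force version compares every pair of points, so it only suits small examples. An optimization over full meshes would more plausibly rely on a differentiable, vectorized implementation or a spatial index for the nearest-neighbour search.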

Keywords

Human avatar, Reconstruction, SMPL-X, Optimization, 3D modeling, Parametric model, Animation, Textures, Computer vision

BibTeX entry

@inproceedings{Mallek2024,
    author = {Fares Mallek and Carlos V\'{a}zquez and Eric Paquette},
    title = {Implicit and Parametric Avatar Pose and Shape Estimation From a Single Frontal Image of a Clothed Human},
    booktitle = {Proceedings of the 17th ACM SIGGRAPH Conference on Motion, Interaction and Games (MIG)},
    pages = {1--11},
    year = {2024}
}

Online version

Official Open Access published paper on ACM Digital Library: https://dl.acm.org/doi/10.1145/3677388.3696328.

Preliminary version of the paper.

Source code

We made our code available to the community: https://github.com/ETS-BodyModeling/ImplicitParametricAvatar

Additional material

Slides of the presentation.

Pre-print version of the video.