Guénolé Fiche
gfiche.bsky.social
Research Scientist at Naver Labs Europe.

Human-centric 3D computer vision
MEGA is the first method to achieve state-of-the-art results in both single-output and multi-output human mesh recovery (HMR). Want to try it yourself? Code and a demo are available at: g-fiche.github.io/research-pag...

Work done in collaboration with @sleglaive.bsky.social, @xavirema.bsky.social, and Francesc Moreno-Noguer. (6/6)
March 19, 2025 at 7:51 AM
We propose two generation modes:
- In deterministic mode, MEGA predicts all tokens in a single forward pass, ensuring speed and accuracy.
- In stochastic mode, we iteratively sample human mesh tokens, enabling MEGA to produce multiple predictions from a single image. (5/6)
March 19, 2025 at 7:51 AM
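To make the two generation modes concrete, here is a minimal sketch in plain Python. It is not MEGA's code: `toy_logits` is a hypothetical stand-in for the network (it returns random scores and ignores the image), `MASK` is an assumed mask id, and the confidence-based linear unmasking schedule in `decode_stochastic` is an assumption borrowed from common masked generative decoders, since the thread does not specify the schedule.

```python
import math
import random

MASK = -1  # hypothetical id for a masked token position


def toy_logits(tokens, codebook_size, rng):
    """Stand-in for the network: random scores per position.
    A real model would condition on image features and unmasked tokens."""
    return [[rng.gauss(0.0, 1.0) for _ in range(codebook_size)] for _ in tokens]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_deterministic(seq_len, codebook_size, rng):
    """Single forward pass from a fully masked sequence, argmax everywhere."""
    tokens = [MASK] * seq_len
    logits = toy_logits(tokens, codebook_size, rng)
    return [max(range(codebook_size), key=row.__getitem__) for row in logits]


def decode_stochastic(seq_len, codebook_size, rng, steps=4):
    """Iterative sampling: each step samples the masked positions and keeps
    only the most confident samples (assumed linear schedule)."""
    tokens = [MASK] * seq_len
    for step in range(1, steps + 1):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        logits = toy_logits(tokens, codebook_size, rng)
        proposals = {}
        for i in masked:
            probs = softmax(logits[i])
            tok = rng.choices(range(codebook_size), weights=probs, k=1)[0]
            proposals[i] = (tok, probs[tok])
        # Fill a growing share of the sequence; the last step fills everything.
        target_filled = math.ceil(seq_len * step / steps)
        n_keep = target_filled - (seq_len - len(masked))
        keep = sorted(masked, key=lambda i: proposals[i][1], reverse=True)[:n_keep]
        for i in keep:
            tokens[i] = proposals[i][0]
    return tokens
```

Calling `decode_stochastic` several times with different random seeds yields different token sequences for the same input, which is the sense in which the stochastic mode produces multiple predictions.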
Subsequently, we add image conditioning and train MEGA to recover human meshes from image features and partial token sequences. During inference, we begin with a fully masked sequence of tokens and generate a human mesh conditioned on an input image. (4/6)
March 19, 2025 at 7:51 AM
MEGA is first pre-trained on motion capture data to recover human meshes from partial human mesh token sequences with a variable masking rate. Starting from an empty sequence, we are then able to generate random meshes showing high pose and shape diversity. (3/6)
March 19, 2025 at 7:51 AM
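The variable masking rate used in pre-training can be sketched as follows. This is only an illustration under stated assumptions: the uniform distribution over the masking rate, the `MASK` id, and the function name `mask_tokens` are hypothetical and not taken from the MEGA code.

```python
import math
import random

MASK = -1  # hypothetical id for a masked token position


def mask_tokens(tokens, rng):
    """Hide a random fraction of a mesh token sequence.
    Returns the masked sequence and the positions the model must recover."""
    rate = rng.uniform(0.0, 1.0)  # assumed: rate drawn uniformly per sample
    n_mask = max(1, math.ceil(rate * len(tokens)))  # always mask at least one
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK if i in positions else t for i, t in enumerate(tokens)]
    return masked, sorted(positions)
```

Because the rate varies from sample to sample, the model sees everything from nearly complete sequences to fully masked ones, and the fully masked case is the starting point for generating a mesh from an empty sequence.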
MEGA is a MaskEd Generative Autoencoder, which relies on a tokenized representation of human meshes. We frame HMR as generating a sequence of tokens corresponding to a human mesh, conditioned on an input image. (2/6)
March 19, 2025 at 7:51 AM