Dimitris Tzionas
dimtzionas.bsky.social
Assistant Professor of 3D Computer Vision at the University of Amsterdam.
3D Human-centric Perception & Synthesis: bodies, hands, objects.
Past: MPI for Intelligent Systems, Univ. of Bonn, Aristotle Univ. of Thessaloniki
Website: https://dtzionas.com
CWGrasp will be presented @3dvconf.bsky.social #3DV2025

Authors: G. Paschalidis, R. Wilschut, D. Antić, O. Taheri, D. Tzionas
Collab: University of Amsterdam, MPI for Intelligent Systems
Project: gpaschalidis.github.io/cwgrasp
Paper: arxiv.org/abs/2408.16770
Code: github.com/gpaschalidis...

🧵 10/10
March 14, 2025 at 6:44 PM
🧩 Our code is modular - each model has its own repo.
You can easily integrate these into your code & build new research!

🧩 CGrasp: github.com/gpaschalidi...
🧩 CReach: github.com/gpaschalidi...
🧩 ReachingField: github.com/gpaschalidi...
🧩 CWGrasp: github.com/gpaschalidi...

🧵 9/10
⚙️ CWGrasp:
👉 requires 500x fewer samples & runs 10x faster than SotA,
👉 produces grasps that are perceived as more realistic than SotA ~70% of the time,
👉 works well for objects placed at various "heights" from the floor,
👉 generates both right- & left-hand grasps.

🧵 8/10
👉 We condition both CGrasp & CReach on the same direction.
👉 This produces a hand-only guiding grasp & a reaching body that are already mutually compatible!
🎯 Thus, we need to conduct a *small* refinement *only* for the body so that its fingers match the guiding hand!

🧵 7/10
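The body-only refinement above can be sketched as a toy gradient-descent step. This is a minimal illustration under assumptions of mine (a single translational offset, mean-squared joint error); it is not the paper's actual optimization:

```python
import numpy as np

def refine_body_to_hand(body_joints, guide_joints, lr=0.1, steps=200):
    """Toy body-only refinement (hypothetical stand-in for CWGrasp's step):
    the guiding hand stays fixed, and we nudge the body's hand joints
    toward it by gradient descent on the mean squared joint distance."""
    offset = np.zeros(3)  # only the body moves; the guiding hand is the target
    for _ in range(steps):
        residual = (body_joints + offset) - guide_joints   # (J, 3) per-joint error
        grad = 2.0 * residual.mean(axis=0)                 # d(mean sq. err.)/d(offset)
        offset -= lr * grad
    return body_joints + offset
```

Because both generations were conditioned on the same direction, the residual is small to begin with, which is why a small refinement suffices.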
⚙️ CGrasp & CReach - generate a hand-only grasp & reaching body, respectively, with varied pose by sampling their latent space.
👉 Importantly, the palm & arm direction satisfy a desired (condition) 3D direction vector!
👉 This direction is sampled from ⚙️ ReachingField!

🧵 6/10
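A minimal sketch of what "conditioning on a 3D direction" can mean geometrically: sample a latent pose, then rotate a canonical palm normal onto the desired direction. The function names, the +z palm convention, and the 16-D latent are my assumptions for illustration, not CGrasp's real interface:

```python
import numpy as np

def rotation_aligning(a, b):
    """Rotation matrix taking unit vector a onto unit vector b (Rodrigues)."""
    v = np.cross(a, b)
    c = float(np.dot(a, b))
    if np.isclose(c, -1.0):  # opposite vectors: 180-degree flip about an orthogonal axis
        axis = np.array([1.0, 0.0, 0.0]) if abs(a[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
        axis -= axis.dot(a) * a
        axis /= np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)
    K = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
    return np.eye(3) + K + K @ K / (1.0 + c)

def sample_conditioned_grasp(direction, rng):
    """Hypothetical CGrasp-style sampler: draw a latent code for pose
    variation, then orient the canonical palm normal (+z here, an
    assumption) so it satisfies the conditioning direction exactly."""
    direction = direction / np.linalg.norm(direction)
    z = rng.standard_normal(16)                       # latent sample -> pose variation
    palm_canonical = np.array([0.0, 0.0, 1.0])
    R = rotation_aligning(palm_canonical, direction)  # global orientation from condition
    return {"latent": z, "global_orient": R, "palm_dir": R @ palm_canonical}
```

CReach can be pictured the same way, with the arm direction in place of the palm normal.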
⚙️ ReachingField - a probabilistic 3D ray field encoding directions from which a body’s arm & hand can likely reach an object without penetration.
👉 Objects near the ground are likely grasped from high above
👉 Objects high above the ground are likely grasped from below

🧵 5/10
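A toy version of such a ray field: draw candidate directions on the sphere, softly prefer reaching from above for low objects and from below for high ones, and reject rays blocked by the scene. Everything here (the 1 m threshold, the exponential weighting, the `is_blocked` predicate) is an assumption for illustration, not the paper's model:

```python
import numpy as np

def sample_reaching_direction(obj_height, is_blocked, rng, n=256):
    """Toy ReachingField-style sampler (hypothetical). Rays point from the
    object toward where the reaching arm would come from: low objects favor
    rays from above, high objects favor rays from below, and rays that would
    penetrate the scene are rejected."""
    dirs = rng.standard_normal((n, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    elev = dirs[:, 2]                                  # +1 = reaching from directly above
    preferred = 1.0 if obj_height < 1.0 else -1.0      # 1 m threshold: an assumption
    w = np.exp(3.0 * preferred * elev)                 # soft elevation preference
    w[np.array([is_blocked(d) for d in dirs])] = 0.0   # drop penetrating rays
    return dirs[rng.choice(n, p=w / w.sum())]
```

Sampling this field yields the direction that both CGrasp & CReach are then conditioned on.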
💡 Our key idea is to perform local-scene reasoning *early on*: we generate an *already-compatible* guiding hand & body, so *only* the body needs a *small* refinement to match the hand.

CWGrasp - consists of three novel models:
👉 ReachingField,
👉 CGrasp,
👉 CReach.

🧵 4/10
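The overall pipeline of the three models can be sketched as below. The stubs follow the thread's description, but every interface (argument lists, pose dimensions, return dicts) is an assumption of mine, not the released API:

```python
import numpy as np

# Hypothetical stubs for the three models; names follow the thread,
# the interfaces are assumptions.
def reaching_field_sample(scene, rng):
    d = rng.standard_normal(3)
    return d / np.linalg.norm(d)          # scene-aware reaching direction

def cgrasp(direction, rng):               # hand-only grasp obeying `direction`
    return {"palm_dir": direction, "finger_pose": rng.standard_normal(15)}

def creach(direction, rng):               # reaching body obeying `direction`
    return {"arm_dir": direction, "body_pose": rng.standard_normal(63)}

def refine_body(body, hand):              # small, body-only refinement (toy)
    body = dict(body)
    body["fingers"] = hand["finger_pose"].copy()
    return body

def cwgrasp(scene, rng):
    d = reaching_field_sample(scene, rng)  # 1. ReachingField: pick a direction
    hand = cgrasp(d, rng)                  # 2. CGrasp: guiding hand, conditioned on d
    body = creach(d, rng)                  # 3. CReach: body, same condition d
    return refine_body(body, hand)         # 4. only the body is refined
```

Because steps 2 and 3 share the same condition, step 4 stays small by construction.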
🎯 We tackle this with CWGrasp in a divide-and-conquer way.

This is inspired by FLEX [Tendulkar et al.] that:
👉 generates a guiding hand-only grasp,
👉 generates many random bodies,
👉 post-processes the guiding hand to match the body, & the body to match the guiding hand.

🧵 3/10
This is challenging 🫣 because:
👉 the body needs to plausibly reach the object,
👉 fingers need to dexterously grasp the object,
👉 hand pose and object pose need to look compatible with each other, and
👉 training datasets for 3D whole-body grasps are really scarce.

🧵 2/10