Published:2025/8/25 16:02:08

3D reconstruction from RGB images! Let's understand hands and objects in 3D 💖 (HOSt3R)

Ultra-short summary: A technique that pulls 3D info for hands and objects out of RGB images alone, with amazing accuracy and no keypoints needed!

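✨ Gal-style sparkle points ✨
● No hunting for keypoints (hand joints and the like)! It copes with all kinds of shapes, which is divine ✨
● Totally fine even when the hand and object hide each other or the lighting is bad! Because it's just that strong 😎
● Looks usable in VR/AR, robotics 🤖, and lots of other fields! Can't wait for the future 😍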

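Detailed breakdown
● Background
In the worlds of AR/VR and robotics, demand for recreating hands and objects in 3D is skyrocketing! But older techniques have to pin down things like hand joints, and they can fall apart depending on the lighting... 💦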

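● Method
HOSt3R (say it "H-O-S-T-three-R") is a fresh method that grabs 3D info straight from RGB images 💡 Specifically, it builds a pointmap (a point-cloud map), works out the relative camera poses, and reconstructs the 3D shape!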
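To make the "pointmaps, then relative pose" step a bit more concrete, here is a minimal sketch. It is not HOSt3R's network or the authors' code: it assumes two toy pointmaps (H x W x 3 arrays of 3D points) whose pixels already correspond one-to-one, and recovers the rigid motion between them with the classic Kabsch/Procrustes alignment. The names `rigid_align` and `demo` are hypothetical and exist only for this illustration.

```python
import numpy as np


def rigid_align(src, dst):
    """Least-squares rigid transform (R, t) with R @ src_i + t ~= dst_i (Kabsch)."""
    src = np.asarray(src, dtype=float).reshape(-1, 3)
    dst = np.asarray(dst, dtype=float).reshape(-1, 3)

    # Center both point sets on their centroids.
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)

    # 3x3 cross-covariance between the centered sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def demo():
    """Move a toy pointmap by a known rigid motion and recover that motion."""
    rng = np.random.default_rng(0)
    pointmap_a = rng.normal(size=(4, 5, 3))  # assumed toy H x W x 3 pointmap

    angle = np.deg2rad(30.0)
    R_true = np.array([
        [np.cos(angle), -np.sin(angle), 0.0],
        [np.sin(angle), np.cos(angle), 0.0],
        [0.0, 0.0, 1.0],
    ])
    t_true = np.array([0.1, -0.2, 0.3])

    # Same pixels, seen after the rigid motion (R_true, t_true).
    pointmap_b = pointmap_a @ R_true.T + t_true

    R_est, t_est = rigid_align(pointmap_a, pointmap_b)
    return R_true, t_true, R_est, t_est


if __name__ == "__main__":
    R_true, t_true, R_est, t_est = demo()
    print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))
```

In a real pipeline the pointmaps would come from a learned model and would be noisy, so a robust variant (for example, confidence weighting or outlier rejection) would likely be needed; this sketch only shows the geometric idea.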


HOSt3R: Keypoint-free Hand-Object 3D Reconstruction from RGB images

Anilkumar Swamy / Vincent Leroy / Philippe Weinzaepfel / Jean-Sébastien Franco / Grégory Rogez

Hand-object 3D reconstruction has become increasingly important for applications in human-robot interaction and immersive AR/VR experiences. A common approach for object-agnostic hand-object reconstruction from RGB sequences involves a two-stage pipeline: hand-object 3D tracking followed by multi-view 3D reconstruction. However, existing methods rely on keypoint detection techniques, such as Structure from Motion (SfM) and hand-keypoint optimization, which struggle with diverse object geometries, weak textures, and mutual hand-object occlusions, limiting scalability and generalization. As a key enabler to generic and seamless, non-intrusive applicability, we propose in this work a robust, keypoint detector-free approach to estimating hand-object 3D transformations from monocular motion video/images. We further integrate this with a multi-view reconstruction pipeline to accurately recover hand-object 3D shape. Our method, named HOSt3R, is unconstrained, does not rely on pre-scanned object templates or camera intrinsics, and reaches state-of-the-art performance for the tasks of object-agnostic hand-object 3D transformation and shape estimation on the SHOWMe benchmark. We also experiment on sequences from the HO3D dataset, demonstrating generalization to unseen object categories.

cs / cs.CV / cs.AI / cs.HC / cs.LG / cs.RO