3D reconstruction from RGB images! Let's understand hands and objects in 3D 💖 (HOSt3R)

Published: 2025/8/25 16:02:08

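Super summary: A technique that recovers the 3D pose and shape of a hand and the object it is holding from RGB images, with no keypoint detection, at state-of-the-art accuracy!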

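✨ Sparkly gyaru-style highlights ✨
● No hunting for keypoints (hand joints and so on)! It copes with all sorts of object shapes, which is just divine ✨
● It holds up even when the hand and the object occlude each other or the object has weak texture! It's seriously strong 😎
● Looks useful for VR/AR, robots 🤖, and plenty of other fields! Can't wait for the future 😍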

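Detailed explanation
● Background: In AR/VR and robotics, demand for recreating hands and objects in 3D is skyrocketing! But conventional methods depend on keypoint detection (Structure from Motion, hand keypoints), so they stumble on diverse object shapes, weak textures, and hand-object occlusions…💦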

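● Method: HOSt3R is a fresh approach that gets 3D information directly from RGB images 💡 Specifically, it builds pointmaps (maps of 3D points), computes the relative camera poses, and reconstructs the 3D shape!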
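
To make the pointmap idea a bit more concrete, here is a minimal sketch of one way a relative pose could be recovered from two pointmaps, assuming both store the same surface points in matching pixel order but in different coordinate frames. It uses the classic Kabsch (orthogonal Procrustes) alignment with NumPy. This is a swapped-in textbook technique for illustration only, not HOSt3R's actual network or pose solver, and the function name `estimate_rigid_transform` and the toy data are hypothetical.

```python
import numpy as np


def estimate_rigid_transform(src, dst):
    """Kabsch alignment: find R (3x3) and t (3,) so that R @ src_i + t ~= dst_i.

    Hypothetical helper for illustration; assumes src and dst hold
    corresponding 3D points in the same order.
    """
    src = np.asarray(src, dtype=float).reshape(-1, 3)
    dst = np.asarray(dst, dtype=float).reshape(-1, 3)

    # Center both point sets on their centroids.
    src_mean = src.mean(axis=0)
    dst_mean = dst.mean(axis=0)

    # 3x3 cross-covariance between the centered sets.
    H = (src - src_mean).T @ (dst - dst_mean)
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t


# Toy "pointmap": an H x W grid where each pixel stores one 3D point.
rng = np.random.default_rng(0)
pointmap_a = rng.uniform(-1.0, 1.0, size=(4, 5, 3))

# Assumed ground-truth motion: 30 degree rotation about z plus a translation.
angle = np.deg2rad(30.0)
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
t_true = np.array([0.1, -0.2, 0.3])

# The same points expressed in the second frame.
pointmap_b = pointmap_a @ R_true.T + t_true

R_est, t_est = estimate_rigid_transform(pointmap_a, pointmap_b)
print("rotation recovered:", np.allclose(R_est, R_true))
print("translation recovered:", np.allclose(t_est, t_true))
```

In a real pipeline the correspondences would come from predicted pointmaps rather than synthetic data, so some form of confidence weighting or outlier rejection would probably be needed on top of this.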

HOSt3R: Keypoint-free Hand-Object 3D Reconstruction from RGB images

Anilkumar Swamy / Vincent Leroy / Philippe Weinzaepfel / Jean-Sébastien Franco / Grégory Rogez

Hand-object 3D reconstruction has become increasingly important for applications in human-robot interaction and immersive AR/VR experiences. A common approach for object-agnostic hand-object reconstruction from RGB sequences involves a two-stage pipeline: hand-object 3D tracking followed by multi-view 3D reconstruction. However, existing methods rely on keypoint detection techniques, such as Structure from Motion (SfM) and hand-keypoint optimization, which struggle with diverse object geometries, weak textures, and mutual hand-object occlusions, limiting scalability and generalization. As a key enabler to generic and seamless, non-intrusive applicability, we propose in this work a robust, keypoint detector-free approach to estimating hand-object 3D transformations from monocular motion video/images. We further integrate this with a multi-view reconstruction pipeline to accurately recover hand-object 3D shape. Our method, named HOSt3R, is unconstrained, does not rely on pre-scanned object templates or camera intrinsics, and reaches state-of-the-art performance for the tasks of object-agnostic hand-object 3D transformation and shape estimation on the SHOWMe benchmark. We also experiment on sequences from the HO3D dataset, demonstrating generalization to unseen object categories.

cs / cs.CV / cs.AI / cs.HC / cs.LG / cs.RO
