Ultra-short summary: A killer technique that generates human and object motion in 4D (time included!) from text! The secret sauce is that it works from scratch, zero-shot ☆
Gal-style sparkle points ✨
- No dataset needed!: It works zero-shot — how amazing is that? ✨ Not having to collect mountains of data is a total blessing!
- The motion looks real!: Both human movement and interactions with objects come out super naturally!
- Business chances galore!: It can be applied to the metaverse, advertising, and more, so it's seriously hot 🔥
Detailed explanation
Despite significant progress in text-driven 4D human-object interaction (HOI) generation with supervised methods, the scalability remains limited by the scarcity of large-scale 4D HOI datasets. To overcome this, recent approaches attempt zero-shot 4D HOI generation with pre-trained image diffusion models. However, interaction cues are minimally distilled during the generation process, restricting their applicability across diverse scenarios. In this paper, we propose AnchorHOI, a novel framework that thoroughly exploits hybrid priors by incorporating video diffusion models beyond image diffusion models, advancing 4D HOI generation. Nevertheless, directly optimizing high-dimensional 4D HOI with such priors remains challenging, particularly for human pose and compositional motion. To address this challenge, AnchorHOI introduces an anchor-based prior distillation strategy, which constructs interaction-aware anchors and then leverages them to guide generation in a tractable two-step process. Specifically, two tailored anchors are designed for 4D HOI generation: anchor Neural Radiance Fields (NeRFs) for expressive interaction composition, and anchor keypoints for realistic motion synthesis. Extensive experiments demonstrate that AnchorHOI outperforms previous methods with superior diversity and generalization.
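The two-step, anchor-guided distillation described in the abstract (step 1: construct interaction-aware anchors; step 2: optimize them against pre-trained diffusion priors) can be sketched as a toy score-distillation loop. Everything below is illustrative only: the function names, the stub "priors", and the gradient rules are assumptions for the sketch, not the paper's actual implementation.

```python
import numpy as np

def distill_step(params, prior_grad_fn, lr=0.1):
    """One score-distillation-style update: nudge the anchor parameters
    along a gradient supplied by a (stubbed) pre-trained diffusion prior."""
    return params - lr * prior_grad_fn(params)

def image_prior_grad(anchor_nerf):
    # Stub for an image-diffusion prior acting on the anchor NeRF:
    # pull its parameters toward a fixed "plausible interaction" target.
    target = np.ones_like(anchor_nerf)
    return anchor_nerf - target

def video_prior_grad(keypoints):
    # Stub for a video-diffusion prior acting on the anchor keypoints:
    # penalize deviation from a temporally smoothed trajectory,
    # mimicking the "realistic motion" objective.
    smoothed = np.convolve(keypoints, np.ones(3) / 3, mode="same")
    return keypoints - smoothed

# Step 1: construct the two tailored anchors (stand-in arrays here).
anchor_nerf = np.zeros(8)                 # "anchor NeRF" parameters
keypoints = np.random.randn(16)           # "anchor keypoint" trajectory

# Step 2: guide generation by distilling the hybrid priors.
for _ in range(50):
    anchor_nerf = distill_step(anchor_nerf, image_prior_grad)
    keypoints = distill_step(keypoints, video_prior_grad)

print(anchor_nerf.round(2))  # → [0.99 0.99 0.99 0.99 0.99 0.99 0.99 0.99]
```

The point of the toy loop is the division of labor: one prior shapes the static interaction composition (the NeRF anchor converges toward its target) while the other shapes the dynamics (the keypoint trajectory is smoothed over time), which is the "tractable two-step process" the abstract claims makes high-dimensional 4D HOI optimization feasible.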