Super summary: using a 2D-VLM, you can do 3D segmentation without any annotations. Pure magic 🪄
🌟 Sparkly highlight points ✨
● No annotations needed! ✨ Labeling data is such a pain, right? 😭 This makes collecting data super easy ♪
● Open vocabulary! 💖 You can tell it things like "that car!" in text, so it can recognize all kinds of objects!
● Useful in all sorts of fields, like autonomous driving and AR! 😍 It really feels like future tech~!
Here comes the detailed explanation~!
This paper presents a novel 3D semantic segmentation method for large-scale point cloud data that requires neither annotated 3D training data nor paired RGB images. The proposed approach projects 3D point clouds onto 2D images using virtual cameras and performs semantic segmentation with a 2D foundation model guided by natural language prompts. The 3D segmentation is then obtained by aggregating predictions from multiple viewpoints through weighted voting. The method outperforms existing training-free approaches and achieves segmentation accuracy comparable to supervised methods. Moreover, it supports open-vocabulary recognition, enabling users to detect objects with arbitrary text queries and thereby overcoming the limitations of traditional supervised approaches.
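The multi-view aggregation step described above (fusing per-view 2D predictions back onto 3D points by weighted voting) can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: the function name `aggregate_votes`, the array layout, and the use of per-point confidence weights as vote weights are all assumptions; the per-view predictions would in practice come from a prompted 2D segmentation model after projecting the point cloud through each virtual camera.

```python
import numpy as np

def aggregate_votes(view_preds, view_weights, num_classes):
    """Fuse per-view point labels into 3D labels by weighted voting (sketch).

    view_preds  : (V, N) int array, class id per view and point; -1 marks
                  points not visible in that virtual camera view.
    view_weights: (V, N) float array, confidence weight per view and point.
    Returns     : (N,) int array of fused class ids.
    """
    V, N = view_preds.shape
    votes = np.zeros((N, num_classes))
    for v in range(V):
        idx = np.nonzero(view_preds[v] >= 0)[0]  # visible points only
        # accumulate this view's confidence onto the predicted class
        votes[idx, view_preds[v, idx]] += view_weights[v, idx]
    return votes.argmax(axis=1)

# Toy example: 3 virtual views, 4 points, 2 classes.
preds = np.array([[0, 1, -1, 1],
                  [0, 0,  1, 1],
                  [1, 1,  1, -1]])
weights = np.array([[0.9, 0.2, 0.0, 0.5],
                    [0.8, 0.3, 0.6, 0.4],
                    [0.1, 0.7, 0.5, 0.0]])
labels = aggregate_votes(preds, weights, num_classes=2)
print(labels)  # point 0 is dominated by class-0 votes, the rest by class 1
```

The weighting lets views with a clearer, closer, or more frontal rendering of a point count for more, so a single noisy viewpoint is outvoted by the rest.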