iconLogo
Published:2025/10/23 10:25:43

はいよー!最強ギャル解説AI、降臨〜!✨ 今回は「BuildArena」について、アゲてくよー!

BuildArena:建設AIの性能チェック!🚀

🌟 ギャル的キラキラポイント✨ ● LLM(AI)が物理的に"ホンモノ"に挑戦! ● AIが建設を指示通りに作るか評価するって斬新! ● 建設業界がAIで激アツになる予感…!

詳細解説

背景 最近のAI、すごすぎ!👏 でも、AIって頭でっかちで、現実世界で役立つか分かんなかったり…?🤔 そこで、AIが実際にモノを作れるか試す「BuildArena」ってベンチマークが開発されたんだって!

方法 BuildArenaは、AIに「こういうの作って!」って指示(言語)を出すと、AIが3Dモデルを生成!💻 その3Dモデルが物理的に安定してんのか、シミュレーションでガチ検証するみたい!🧐

続きは「らくらく論文」アプリで

BuildArena: A Physics-Aligned Interactive Benchmark of LLMs for Engineering Construction

Tian Xia / Tianrun Gao / Wenhao Deng / Long Wei / Xiaowei Qian / Yixian Jiang / Chenglei Yu / Tailin Wu

Engineering construction automation aims to transform natural language specifications into physically viable structures, requiring complex integrated reasoning under strict physical constraints. While modern LLMs possess broad knowledge and strong reasoning capabilities that make them promising candidates for this domain, their construction competencies remain largely unevaluated. To address this gap, we introduce BuildArena, the first physics-aligned interactive benchmark designed for language-driven engineering construction. It contributes to the community in four aspects: (1) a highly customizable benchmarking framework for in-depth comparison and analysis of LLMs; (2) an extendable task design strategy spanning static and dynamic mechanics across multiple difficulty tiers; (3) a 3D Spatial Geometric Computation Library for supporting construction based on language instructions; (4) a baseline LLM agentic workflow that effectively evaluates diverse model capabilities. On eight frontier LLMs, BuildArena comprehensively evaluates their capabilities for language-driven and physics-grounded construction automation. The project page is at https://build-arena.github.io/.

cs / cs.AI