Published: 2025/8/22 22:40:58

"PuzzleJAX", the ultimate puzzle-AI benchmark, has arrived 🎉

Super-short summary: a powerhouse tool for evaluating puzzle-game AI at blazing speed ⚡! New business ideas aren't just a dream 💖

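✨ Gal-style sparkle points ✨
● An awesome "yardstick" for measuring what puzzle-game AI can really do is here! 📏✨
● Thanks to GPUs (graphics cards), AI training gets super fast 💨 and efficient!
● You can build all sorts of puzzle games, so you can watch an AI grow from lots of different angles 👀💖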

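🌟 Detailed explanation 🌟
● Background: Games that test an AI's "brain", puzzle games especially, are hot right now 🔥 People wanted a way to check an AI's level across lots of different puzzles!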

● Method: Puzzle games written in the PuzzleScript language can now run on JAX! JAX can use GPUs, so AI training gets way faster 🚀
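To give a feel for why running games on JAX speeds things up, here is a minimal sketch of a toy grid level stepped in a batch. It is not PuzzleJAX's actual API: the `WALLS` level, the `MOVES` table, and the `step` and `rollout` functions are all hypothetical names made up for illustration. The only real library calls are `jax.vmap` (batch a single-state function), `jax.jit` (compile it), and `jax.lax.scan` (loop over time steps inside the compiled program).

```python
import jax
import jax.numpy as jnp

# Hypothetical toy level (not a PuzzleJAX data structure): 1 = wall, 0 = floor.
WALLS = jnp.array([
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
], dtype=jnp.int32)

# Action index -> (row, col) offset: 0 = up, 1 = down, 2 = left, 3 = right.
MOVES = jnp.array([[-1, 0], [1, 0], [0, -1], [0, 1]], dtype=jnp.int32)


def step(pos, action):
    """Move one player by one cell unless the target cell is a wall."""
    target = pos + MOVES[action]
    blocked = WALLS[target[0], target[1]] == 1
    return jnp.where(blocked, pos, target)


# vmap turns the single-state step into a batched one; jit compiles it.
batched_step = jax.jit(jax.vmap(step))


def rollout(start_positions, actions):
    """Apply a (T, B) array of actions to B independent copies of the level."""
    def body(positions, actions_t):
        new_positions = batched_step(positions, actions_t)
        return new_positions, new_positions

    final_positions, _ = jax.lax.scan(body, start_positions, actions)
    return final_positions


# Three copies of the level, all starting at row 1, column 1.
starts = jnp.array([[1, 1], [1, 1], [1, 1]], dtype=jnp.int32)
# Two time steps of actions, one column per copy.
actions = jnp.array([[3, 1, 0],
                     [3, 3, 2]], dtype=jnp.int32)
final = rollout(starts, actions)
```

Because every copy of the level is just another row in an array, the same compiled step can advance thousands of games at once on a GPU, which is the general idea behind the speedup described above.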


PuzzleJAX: A Benchmark for Reasoning and Learning

Sam Earle / Graham Todd / Yuchen Li / Ahmed Khalifa / Muhammad Umair Nasir / Zehua Jiang / Andrzej Banburski-Fahey / Julian Togelius

We introduce PuzzleJAX, a GPU-accelerated puzzle game engine and description language designed to support rapid benchmarking of tree search, reinforcement learning, and LLM reasoning abilities. Unlike existing GPU-accelerated learning environments that provide hard-coded implementations of fixed sets of games, PuzzleJAX allows dynamic compilation of any game expressible in its domain-specific language (DSL). This DSL follows PuzzleScript, which is a popular and accessible online game engine for designing puzzle games. In this paper, we validate in PuzzleJAX several hundred of the thousands of games designed in PuzzleScript by both professional designers and casual creators since its release in 2013, thereby demonstrating PuzzleJAX's coverage of an expansive, expressive, and human-relevant space of tasks. By analyzing the performance of search, learning, and language models on these games, we show that PuzzleJAX can naturally express tasks that are both simple and intuitive to understand, yet often deeply challenging to master, requiring a combination of control, planning, and high-level insight.

cs / cs.AI / cs.LG