Published: 2025/12/17 13:09:20

Supercharging Code Generation 🚀 What on Earth Is RPM-MCTS!?

Super summary: Makes code generation smarter! The ultimate tag team of knowledge retrieval and MCTS ✨

✨ Gal-Style Sparkle Points ✨
● Code generation accuracy goes UP ⤴️ Writing smarter code is simply divine!
● Lower compute costs 💰 Because it finds and fixes errors early!
● The IT industry is on fire 🔥 Useful for automation, AI development, and more!

Detailed Explanation

Background: LLMs (large language models) are doing amazing work in code generation 💻 But there was a tricky problem: evaluating the intermediate steps is hard. And if the model writes wrong code along the way, compute costs go up too, right? So the authors turned to MCTS (Monte Carlo Tree Search), a technique for exploring candidate solutions efficiently!
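To make the MCTS idea concrete, here is a toy sketch of the classic four-phase loop (selection, expansion, simulation, backpropagation) applied to building a solution step by step. Everything here is illustrative: the `toy_reward` function is a stand-in placeholder, not the paper's knowledge-retrieval process reward model, and the "code" is just a list of step labels.

```python
import math
import random

class Node:
    """A search-tree node holding a partial solution (a list of steps)."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

    def uct(self, c=1.4):
        # UCT score: exploit high-value nodes, but explore rarely visited ones.
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits
        )

def toy_reward(state):
    # Placeholder for a process reward model: here, longer partial
    # solutions simply score higher.
    return len(state) / 5.0

def mcts(root, actions, iterations=50, rng=random.Random(0)):
    for _ in range(iterations):
        # 1. Selection: walk down the tree by UCT until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.uct)
        # 2. Expansion: add one child per candidate next step.
        if len(node.state) < 5:
            for a in actions:
                node.children.append(Node(node.state + [a], parent=node))
            node = rng.choice(node.children)
        # 3. Simulation/evaluation: score the partial solution.
        reward = toy_reward(node.state)
        # 4. Backpropagation: update visit counts and values up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Return the most-visited first step.
    return max(root.children, key=lambda n: n.visits)

root = Node([])
best = mcts(root, actions=["step_a", "step_b"])
print(best.state)
```

RPM-MCTS replaces the placeholder reward with scores retrieved from a knowledge base, which is how it avoids training a separate process reward model.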

Read the rest in the 「らくらく論文」 app

RPM-MCTS: Knowledge-Retrieval as Process Reward Model with Monte Carlo Tree Search for Code Generation

Yuanyuan Lin / Xiangyu Ouyang / Teng Zhang / Kaixin Sui

Tree search-based methods have made significant progress in enhancing the code generation capabilities of large language models. However, due to the difficulty in effectively evaluating intermediate algorithmic steps and the inability to locate and timely correct erroneous steps, these methods often generate incorrect code and incur increased computational costs. To tackle these problems, we propose RPM-MCTS, an effective method that utilizes Knowledge-Retrieval as Process Reward Model based on Monte Carlo Tree Search to evaluate intermediate algorithmic steps. By utilizing knowledge base retrieval, RPM-MCTS avoids the complex training of process reward models. During the expansion phase, similarity filtering is employed to remove redundant nodes, ensuring diversity in reasoning paths. Furthermore, our method utilizes sandbox execution feedback to locate erroneous algorithmic steps during generation, enabling timely and targeted corrections. Extensive experiments on four public code generation benchmarks demonstrate that RPM-MCTS outperforms current state-of-the-art methods while achieving an approximately 15% reduction in token consumption. Furthermore, full fine-tuning of the base model using the data constructed by RPM-MCTS significantly enhances its code capabilities.
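The abstract's "similarity filtering" during expansion can be sketched with standard string similarity: candidate next steps that are near-duplicates of already-kept ones get dropped, preserving diverse reasoning paths. This is a hypothetical illustration using `difflib`; the 0.8 threshold and the string-based comparison are illustrative choices, not details from the paper.

```python
from difflib import SequenceMatcher

def filter_redundant(candidates, threshold=0.8):
    """Keep a candidate step only if it is not too similar
    to any step already kept."""
    kept = []
    for cand in candidates:
        if all(SequenceMatcher(None, cand, k).ratio() < threshold
               for k in kept):
            kept.append(cand)
    return kept

steps = [
    "sort the array with quicksort",
    "sort the array with quick sort",   # near-duplicate, filtered out
    "use a hash map to count frequencies",
]
print(filter_redundant(steps))
```

In the real method the comparison would be over candidate algorithmic steps generated during MCTS expansion, but the pruning logic is the same: redundant siblings waste search budget without adding path diversity.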

cs / cs.AI