The Ultimate Agent Is Born! Agent-Dice Smashes Through the Continual Learning Wall☆

Published: 2026/1/7 6:43:50

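Title & Super-Short Summary: Agents level up big time! Agent-Dice keeps them learning new tasks without forgetting, at minimal extra compute🚀
Gyaru-style sparkle points✨
● Takes on "catastrophic forgetting," where a model loses what it learned before😎
● Separates "task-specific" knowledge from "shared" knowledge for better efficiency⤴️
● Uses geometry and curvature to balance stability and plasticity in learning✨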
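Detailed breakdown
- Background: LLM (large language model) agents can do all kinds of things now, which is amazing, right? But teach them something new and they forget what they already learned! This research sets out to fix that💕
- Method: Agent-Dice separates knowledge shared across tasks from conflicting, task-specific knowledge! It works in two stages: geometric consensus filtering prunes conflicting gradients, then curvature-based importance weighting amplifies what's shared🌟
- Results: In experiments on GUI agents and tool-use agents, Agent-Dice showed outstanding continual learning performance, with minimal computational overhead and parameter updates🎉
- Why it matters (the seriously exciting♡ part): Agents could keep picking up one new task after another! Think business automation and brand-new services; the IT industry might take a big leap😍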
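Real-world use ideas💡
- AI chatbots might get smarter and handle a much wider range of questions!
- Robots might be able to handle all kinds of jobs in all kinds of places, making everyday life more convenient💕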

Agent-Dice: Disentangling Knowledge Updates via Geometric Consensus for Agent Continual Learning

Zheng Wu / Xingyu Lou / Xinbei Ma / Yansi Li / Weiwen Liu / Weinan Zhang / Jun Wang / Zhuosheng Zhang

Large Language Model (LLM)-based agents significantly extend the utility of LLMs by interacting with dynamic environments. However, enabling agents to continually learn new tasks without catastrophic forgetting remains a critical challenge, known as the stability-plasticity dilemma. In this work, we argue that this dilemma fundamentally arises from the failure to explicitly distinguish between common knowledge shared across tasks and conflicting knowledge introduced by task-specific interference. To address this, we propose Agent-Dice, a parameter fusion framework based on directional consensus evaluation. Concretely, Agent-Dice disentangles knowledge updates through a two-stage process: geometric consensus filtering to prune conflicting gradients, and curvature-based importance weighting to amplify shared semantics. We provide a rigorous theoretical analysis that establishes the validity of the proposed fusion scheme and offers insight into the origins of the stability-plasticity dilemma. Extensive experiments on GUI agents and tool-use agent domains demonstrate that Agent-Dice exhibits outstanding continual learning performance with minimal computational overhead and parameter updates.
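
To make the two-stage idea in the abstract a bit more concrete, here is a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so everything below is an assumption. The names `fuse_updates`, `deltas`, and `curvatures` are hypothetical. "Geometric consensus" is approximated by a simple per-parameter sign vote, and "curvature-based importance" by non-negative per-parameter scores (for example, a diagonal Fisher-style estimate computed elsewhere).

```python
from typing import List, Sequence


def fuse_updates(
    deltas: Sequence[Sequence[float]],
    curvatures: Sequence[Sequence[float]],
    eps: float = 1e-12,
) -> List[float]:
    """Fuse per-task parameter updates into one update vector.

    deltas[t][j]     : update proposed by task t for parameter j (hypothetical input)
    curvatures[t][j] : non-negative importance score for that update (hypothetical input)
    """
    if not deltas or len(deltas) != len(curvatures):
        raise ValueError("deltas and curvatures must be non-empty and the same length")

    n_params = len(deltas[0])
    fused: List[float] = []
    for j in range(n_params):
        column = [task_delta[j] for task_delta in deltas]

        # Stage 1 (assumed form): the consensus direction is the sign of the
        # summed update; it is 0 when the tasks cancel each other out exactly.
        total = sum(column)
        consensus = (total > 0) - (total < 0)

        weighted_sum = 0.0
        weight_total = 0.0
        for t, value in enumerate(column):
            # Prune updates that do not point in the consensus direction.
            if consensus == 0 or value * consensus <= 0:
                continue
            # Stage 2 (assumed form): weight surviving updates by curvature.
            weight = curvatures[t][j]
            weighted_sum += weight * value
            weight_total += weight

        # With no surviving update (or zero total weight), leave the parameter unchanged.
        fused.append(weighted_sum / weight_total if weight_total > eps else 0.0)
    return fused


# Two tasks, three parameters: agreement, partial conflict, exact cancellation.
deltas = [[0.2, -0.1, 0.3], [0.4, 0.3, -0.3]]
curvatures = [[1.0, 1.0, 1.0], [3.0, 1.0, 1.0]]
fused = fuse_updates(deltas, curvatures)
print(fused)
```

In this toy run the first parameter is a curvature-weighted average of two agreeing updates, the second keeps only the update that matches the consensus sign, and the third is left at zero because the two tasks cancel out exactly. How the paper actually defines consensus and curvature may differ from these stand-ins.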

cs / cs.CL
