DIFFCOT Is Born! LLM Thinking Power UP 🚀

Published: 2026/1/7 3:58:42

Ultra-short summary: Use a diffusion model to improve LLM (large language model) reasoning, and accuracy goes way up!

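✨ Gal-Style Sparkle Points ✨
● The power of diffusion models (picture thinking reeeally deeply, trying lots of things and inching closer to the right answer) gives LLM reasoning a power-up!
● It fixes mistakes as it goes, so the answers get way more accurate! So smart 🥺
● It could be used in all kinds of services, so our lives might get even more convenient!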

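Detailed Explanation

● Background: LLMs are smart, but sometimes one tiny mistake early in the reasoning ends up affecting every step after it, right? 😱 DIFFCOT uses a diffusion model to overcome that weakness!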

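● Method: The steps of reasoning (a.k.a. thinking) get corrected little by little with a diffusion model! Picture clearing away the noise 💅✨ That's how it gets closer to a high-accuracy answer!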
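To picture the "fix it step by step" idea, here is a tiny toy sketch. It is not the paper's implementation: the function names (`causal_noise_levels`, `sliding_window_refine`), the linear noise schedule, and the `propose` / `denoise` callbacks are all assumptions made up for illustration. It only shows the general shape: each new reasoning step enters a sliding window, and every step still inside the window gets revisited, with older steps treated as cleaner than newer ones.

```python
from typing import Callable, List


def causal_noise_levels(window_size: int) -> List[float]:
    """Hypothetical schedule (an assumption, not from the paper):
    noise grows with position in the window, so the oldest step is
    nearly clean and the newest step is the noisiest."""
    return [(i + 1) / window_size for i in range(window_size)]


def sliding_window_refine(
    num_steps: int,
    window_size: int,
    propose: Callable[[int], str],
    denoise: Callable[[str, float], str],
) -> List[str]:
    """Toy loop: draft one step at a time, then re-refine every step
    that is still inside the sliding window."""
    steps: List[str] = []
    for t in range(num_steps):
        steps.append(propose(t))  # draft a new (still noisy) step
        start = max(0, len(steps) - window_size)
        window = steps[start:]
        levels = causal_noise_levels(len(window))
        # Revisit each step in the window with its own noise level.
        steps[start:] = [denoise(s, lvl) for s, lvl in zip(window, levels)]
    return steps


if __name__ == "__main__":
    # Stand-in callbacks: a real system would call a language model here.
    drafts = sliding_window_refine(
        num_steps=4,
        window_size=2,
        propose=lambda t: f"step{t}",
        denoise=lambda s, lvl: s + "'",  # one tick mark per refinement pass
    )
    print(drafts)
```

In this toy, each tick mark counts one refinement pass, so you can see that earlier steps keep getting revisited while they sit in the window, instead of being locked in the moment they are written.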

DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs

Shidong Cao / Hongzhan Lin / Yuxuan Gu / Ziyang Luo / Jing Ma

Chain-of-Thought (CoT) reasoning improves multi-step mathematical problem solving in large language models but remains vulnerable to exposure bias and error accumulation, as early mistakes propagate irreversibly through autoregressive decoding. In this work, we propose DiffCoT, a diffusion-styled CoT framework that reformulates CoT reasoning as an iterative denoising process. DiffCoT integrates diffusion principles at the reasoning-step level via a sliding-window mechanism, enabling unified generation and retrospective correction of intermediate steps while preserving token-level autoregression. To maintain causal consistency, we further introduce a causal diffusion noise schedule that respects the temporal structure of reasoning chains. Extensive experiments on three multi-step CoT reasoning benchmarks across diverse model backbones demonstrate that DiffCoT consistently outperforms existing CoT preference optimization methods, yielding improved robustness and error-correction capability in CoT reasoning.

cs / cs.CL
