Published: 2025/10/23 8:53:37

Ultimate LLMs! Boosting Confidence Through the Roof ✨

This is research on improving the confidence reliability of LLMs (large language models)! You can easily follow it without any specialist knowledge, so don't worry ♪

Ultra-short summary: Boost LLM reliability! Blazing-fast calibration with unsupervised learning 🚀

Gyaru-Style Sparkle Points ✨

● Found a way to improve LLM confidence reliability with no extra labeling effort, using unsupervised learning (learning from unlabeled data)!
● The key idea is the disagreement between the predictions of the original pre-trained LLM and the newly post-trained LLM 💡
● This gives a real push to using LLMs in fields like medicine and finance, where reliability seriously matters!

Detailed Explanation

● Background: LLMs are amazing, but post-training tends to make them "over-confident" 😱 Like, they act totally sure about every answer, right or wrong. That's why a technique to properly calibrate their confidence was needed.

Read the rest in the「らくらく論文」app

Your Pre-trained LLM is Secretly an Unsupervised Confidence Calibrator

Beier Luo / Shuoyuan Wang / Sharon Li / Hongxin Wei

Post-training of large language models is essential for adapting pre-trained language models (PLMs) to align with human preferences and downstream tasks. While PLMs typically exhibit well-calibrated confidence, post-trained language models (PoLMs) often suffer from over-confidence, assigning high confidence to both correct and incorrect outputs, which can undermine reliability in critical applications. A major obstacle in calibrating PoLMs is the scarcity of labeled data for individual downstream tasks. To address this, we propose Disagreement-Aware Confidence Alignment (DACA), a novel unsupervised method to optimize the parameters (e.g., temperature $\tau$) in post-hoc confidence calibration. Our method is motivated by the under-confidence issue caused by prediction disagreement between the PLM and PoLM while aligning their confidence via temperature scaling. Theoretically, the PLM's confidence underestimates the PoLM's prediction accuracy on disagreement examples, causing a larger $\tau$ and producing under-confident predictions. DACA mitigates this by selectively using only agreement examples for calibration, effectively decoupling the influence of disagreement. In this manner, our method avoids an overly large $\tau$ in temperature scaling caused by disagreement examples, improving calibration performance. Extensive experiments demonstrate the effectiveness of our method, improving the average ECE of open-source and API-based LLMs (e.g., GPT-4o) by up to 15.08$\%$ on common benchmarks.
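To make the idea concrete, here is a minimal illustrative sketch (not the authors' implementation) of agreement-only temperature fitting in Python. The function name `fit_daca_temperature`, the squared-error objective that matches the PoLM's temperature-scaled confidence to the PLM's confidence, and the search bounds are all assumptions made for illustration; the paper's exact calibration objective may differ.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def softmax(logits, tau=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / tau
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def fit_daca_temperature(plm_logits, polm_logits):
    """Fit a temperature tau for the post-trained model (PoLM) using only
    unlabeled examples on which the pre-trained model (PLM) and the PoLM
    predict the same class (the "agreement" examples).

    plm_logits, polm_logits: arrays of shape (num_examples, num_classes)
    produced by the two models on the same unlabeled inputs.
    """
    # Keep only agreement examples, decoupling the influence of disagreement.
    agree = plm_logits.argmax(-1) == polm_logits.argmax(-1)

    # Target: the PLM's (assumed well-calibrated) top-class confidence.
    plm_conf = softmax(plm_logits[agree]).max(-1)
    agree_logits = polm_logits[agree]

    def confidence_gap(tau):
        # Mean squared gap between the PoLM's temperature-scaled confidence
        # and the PLM's confidence on the agreement examples (assumed objective).
        polm_conf = softmax(agree_logits, tau).max(-1)
        return float(np.mean((polm_conf - plm_conf) ** 2))

    res = minimize_scalar(confidence_gap, bounds=(0.05, 10.0), method="bounded")
    return res.x


# Toy usage: an over-confident PoLM (scaled-up logits) should get tau > 1.
rng = np.random.default_rng(0)
plm_logits = rng.normal(size=(500, 4))
polm_logits = 3.0 * plm_logits + rng.normal(scale=0.3, size=(500, 4))
print(f"fitted temperature: {fit_daca_temperature(plm_logits, polm_logits):.2f}")
```

In this toy run the PoLM's logits are a scaled-up copy of the PLM's, so the fitted temperature should land near the scale factor, undoing the over-confidence; on disagreement examples the PLM's confidence would underestimate the PoLM's accuracy, which is exactly why they are excluded from the fit.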

cs / cs.LG / cs.AI