BaseCal: Operation Supercharge LLM Reliability! ✨ (For IT Companies)

Published：2026/1/8 14:57:18

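Super-short summary: "BaseCal" calls out overconfident answers from post-trained LLMs! It makes their confidence scores more trustworthy with zero labels 💖
Gyaru-style sparkle points ✨
- BaseCal double-checks a post-trained LLM's answers using its base LLM!
- Even if you never read the paper, BaseCal could make LLM-powered IT services safer to rely on!
- No labeled data (ground-truth answers) needed, so it's super cost-friendly 🌟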
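Detailed explanation
- Background: LLMs are amazing, but they get things wrong sometimes, right? That's a real headache for the IT industry 😢 Post-trained LLMs (PoLLMs) are especially tricky because they state wrong answers with total confidence!
- Method: BaseCal takes the PoLLM's answer and "re-evaluates" it with the base LLM! Base LLMs often stay well-calibrated, so the average probability the base LLM assigns to the answer works as a confidence score. A lighter variant, BaseCal-Proj, maps the PoLLM's final-layer hidden states back to the base LLM's to avoid the extra inference pass!
- Results: With BaseCal, PoLLM confidence scores line up much better with actual correctness! Across five datasets and three LLM families, Expected Calibration Error (ECE) drops by 42.90% on average compared to the best unsupervised baselines. IT services can use LLMs with more peace of mind, and users will be happy too 🎵
- Why it matters (the seriously cool ♡ part): A big step toward taming overconfident LLMs! When IT services earn more trust, companies and users both win, and new business opportunities may open up too 💖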
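The "re-evaluate with the base LLM" step above can be sketched in a few lines. This is only a minimal illustration, not the authors' implementation: it assumes the base LLM has already produced one logit vector per token of the PoLLM's response, and it reads "average probabilities" as a plain mean over response tokens. The names `softmax`, `reeval_confidence`, `toy_logits`, and `toy_response` are hypothetical, and the logits are made-up toy values.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reeval_confidence(base_logits_per_step, response_token_ids):
    """Mean probability the base LLM assigns to each token of the response.

    base_logits_per_step: one logit vector per response position, assumed to
        come from running the base LLM over the prompt plus the response.
    response_token_ids: token ids of the PoLLM's response.
    """
    if not response_token_ids:
        raise ValueError("response must contain at least one token")
    if len(base_logits_per_step) != len(response_token_ids):
        raise ValueError("need one logit vector per response token")
    token_probs = [
        softmax(logits)[token_id]
        for logits, token_id in zip(base_logits_per_step, response_token_ids)
    ]
    return sum(token_probs) / len(token_probs)


# Toy example with a hypothetical 3-token vocabulary (values are made up).
toy_logits = [[0.0, 0.0, 0.0], [math.log(3.0), 0.0, 0.0]]
toy_response = [1, 0]
confidence = reeval_confidence(toy_logits, toy_response)
print(round(confidence, 4))
```

In the toy run the base model gives the first token probability 1/3 and the second 0.6, so the printed confidence is 0.4667. In a real setup the logits would come from an actual forward pass of the base LLM, which is exactly the extra inference cost the abstract mentions.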
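Real-world use ideas 💡
- 1. More trustworthy search engines: Use calibrated confidence to flag shaky answers in search results and put reliable information up front, so everyone can search with more peace of mind 🔍
- 2. Smarter chatbots: Fewer confidently wrong chatbots! A bot that knows when it might be wrong becomes a partner you can rely on much more ✨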

BaseCal: Unsupervised Confidence Calibration via Base Model Signals

Hexiang Tan / Wanli Yang / Junwei Zhang / Xin Chen / Rui Tang / Du Su / Jingang Wang / Yuanzhuo Wang / Fei Sun / Xueqi Cheng

Reliable confidence is essential for trusting the outputs of LLMs, yet widely deployed post-trained LLMs (PoLLMs) typically compromise this trust with severe overconfidence. In contrast, we observe that their corresponding base LLMs often remain well-calibrated. This naturally motivates us to calibrate PoLLM confidence using the base LLM as a reference. This work proposes two ways to achieve this. A straightforward solution, BaseCal-ReEval, evaluates PoLLM's responses by feeding them into the base LLM to get average probabilities as confidence. While effective, this approach introduces additional inference overhead. To address this, we propose BaseCal-Proj, which trains a lightweight projection to map the final-layer hidden states of PoLLMs back to those of their base LLMs. These projected states are then processed by the base LLM's output layer to derive base-calibrated confidence for PoLLM's responses. Notably, BaseCal is an unsupervised, plug-and-play solution that operates without human labels or LLM modifications. Experiments across five datasets and three LLM families demonstrate the effectiveness of BaseCal, reducing Expected Calibration Error (ECE) by an average of 42.90% compared to the best unsupervised baselines.
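For readers unfamiliar with the metric, Expected Calibration Error can be sketched as follows. This is the standard equal-width-bin formulation, offered as an illustration only: the abstract does not state the number of bins or the binning scheme used in the experiments, so `n_bins=10` and the function name `expected_calibration_error` are assumptions, and the sample inputs are made up.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: predicted confidence in [0, 1] for each answer.
    correct: 1 if the answer was right, 0 otherwise.
    n_bins: number of equal-width bins (an assumed default).
    """
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")
    bins = [[] for _ in range(n_bins)]
    for conf, is_correct in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, is_correct))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(is_correct for _, is_correct in members) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Made-up example: a model that says 0.9 every time but is right half the time.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 0, 0])
print(round(overconfident, 4))
```

The made-up overconfident model above has a confidence of 0.9 but an accuracy of 0.5, giving an ECE of 0.4. A well-calibrated model would score close to 0.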

cs / cs.CL
