Title & Ultra-Summary
Quantifying LLM uncertainty with RDS! A new technique to hype up the IT industry☆
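Gyaru-Style Sparkle Points
● "RDS," which sees through the lies of LLMs (large language models), is born! It even raises an alert on overconfident answers 📢
● No difficult calculations! Simple yet high-performing, isn't that seriously godlike? ✨
● It means IT-industry services can be used with more peace of mind 🫶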
Detailed Explanation
Detecting when large language models (LLMs) are uncertain is critical for building reliable systems, yet existing methods are overly complicated, relying on brittle semantic clustering or internal states. We introduce Radial Dispersion Score (RDS), a simple, parameter-free, fully model-agnostic uncertainty metric that measures the radial dispersion of sampled generations in embedding space. A lightweight probability-weighted variant further incorporates the model's own token probabilities when available, outperforming nine strong baselines. Moreover, RDS naturally extends to per-sample scoring, enabling applications such as best-of-$N$ selection and confidence-based filtering. Across four challenging free-form QA datasets and multiple LLMs, our metrics achieve state-of-the-art hallucination detection and answer selection performance, while remaining robust and scalable with respect to sample size and embedding choice.
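To make the idea of "radial dispersion in embedding space" concrete, here is a minimal sketch. The abstract does not give the exact formula, so everything below is an assumption for illustration: the dispersion is taken to be the (optionally weighted) mean Euclidean distance of unit-normalized embeddings from their centroid, the weights are taken to be per-sample sequence probabilities, and the per-sample score is each sample's distance from that centroid. The function names (`radial_distances`, `dispersion_score`, `select_most_central`) are hypothetical and are not the authors' API.

```python
import math
from typing import List, Optional, Sequence

Vector = Sequence[float]


def _unit(v: Vector) -> List[float]:
    """Scale a vector to unit length so only its direction matters."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def _normalized_weights(n: int, weights: Optional[Sequence[float]]) -> List[float]:
    """Return weights that sum to 1; uniform when none are given."""
    if weights is None:
        return [1.0 / n] * n
    if len(weights) != n:
        raise ValueError("need exactly one weight per embedding")
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def radial_distances(embeddings: Sequence[Vector],
                     weights: Optional[Sequence[float]] = None) -> List[float]:
    """Per-sample score (assumed form): distance of each embedding from the centroid."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    unit = [_unit(e) for e in embeddings]
    w = _normalized_weights(len(unit), weights)
    dim = len(unit[0])
    centroid = [sum(w[i] * unit[i][d] for i in range(len(unit))) for d in range(dim)]
    return [math.dist(u, centroid) for u in unit]


def dispersion_score(embeddings: Sequence[Vector],
                     weights: Optional[Sequence[float]] = None) -> float:
    """Set-level score (assumed form): weighted mean distance from the centroid."""
    dists = radial_distances(embeddings, weights)
    w = _normalized_weights(len(dists), weights)
    return sum(wi * di for wi, di in zip(w, dists))


def select_most_central(embeddings: Sequence[Vector],
                        weights: Optional[Sequence[float]] = None) -> int:
    """Best-of-N selection (assumed rule): index of the sample closest to the centroid."""
    dists = radial_distances(embeddings, weights)
    return min(range(len(dists)), key=dists.__getitem__)


if __name__ == "__main__":
    # Toy 2-D "embeddings" standing in for sentence embeddings of sampled answers.
    agreeing = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
    scattered = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
    print("agreeing :", dispersion_score(agreeing))
    print("scattered:", dispersion_score(scattered))
    print("most central sample:", select_most_central(scattered))
```

Under these assumptions, a set of samples that all say the same thing scores zero, while semantically scattered samples score higher. Filtering on confidence would then amount to thresholding either the set-level score or the per-sample distances. In practice the toy vectors would be replaced by outputs of a real sentence-embedding model.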