Ultra-short summary: Get an LLM (the super-smart AI) and an SLM (the slightly more modest AI) to team up, and you can build a question-answering system that's smart, cheap, and fast! ✨
🌟 Sparkly highlight points ✨ ● The way it splits work between the LLM and the SLM is genius! The smart LLM only gets called when it's actually needed, so the cost-performance is unbeatable 💰 ● It catches the SLM's fibs (hallucinations) on the spot! The secret weapon for spotting them, "AttenHScore", is seriously impressive 😎 ● Chatbots and all kinds of services could get way smarter. Feels like a business opportunity is coming 💖
Detailed explanation ● Background: LLMs are brilliant but cost money 💸 SLMs are cheap but sometimes lie 💦 So wouldn't it be unbeatable to use each one only where it shines? That's exactly what this research is about!
● Method: While the SLM writes its answer, "AttenHScore" checks it for anything suspicious 🔍 If something looks fishy, the LLM teacher is called in to help! On top of that, RAG (retrieval-augmented generation) is used to boost the SLM's accuracy ✨ (a rough code sketch follows below)
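To make the routing idea concrete, here is a minimal, heavily hedged sketch in Python. The paper's actual AttenHScore formula is not given in this summary, so the scoring below only assumes the spirit of the abstract: per-token uncertainty (negative log-probability) is amplified by how much later tokens attend to it, so errors that propagate downstream count more, and a tunable threshold decides when to escalate to the large LM. The function names, the weighting scheme, and the threshold value are all illustrative assumptions, not the authors' method.

```python
# Hedged sketch of an AttenHScore-style routing decision. The exact metric in
# the paper is not reproduced here; this version assumes the score accumulates
# per-token uncertainty weighted by the attention later tokens pay to it.
import torch


def atten_h_score(token_logprobs: torch.Tensor,
                  attention: torch.Tensor) -> float:
    """token_logprobs: (T,) log-probability of each generated token.
    attention: (T, T) lower-triangular attention averaged over heads/layers,
    where attention[i, j] is the weight token i places on earlier token j.
    Returns a scalar hallucination score; higher means less trustworthy."""
    uncertainty = -token_logprobs                    # (T,) per-token surprise
    # Propagation assumption: uncertainty is amplified by how much strictly
    # later tokens attend to a token, since errors "read" downstream matter more.
    downstream_attn = attention.tril(-1).sum(dim=0)  # (T,) attention received
    propagated = uncertainty * (1.0 + downstream_attn)
    return propagated.max().item()                   # peak amplified error


def should_invoke_llm(score: float, threshold: float) -> bool:
    """Threshold-based routing: escalate to the large LM when the small LM's
    score exceeds the (dynamically tunable) detection threshold."""
    return score > threshold


if __name__ == "__main__":
    torch.manual_seed(0)
    T = 12  # toy sequence length
    logprobs = torch.log(torch.rand(T).clamp(0.05, 0.99))
    attn = torch.rand(T, T).tril()
    attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-9)
    score = atten_h_score(logprobs, attn)
    verdict = ("-> call the large LM" if should_invoke_llm(score, threshold=2.5)
               else "-> keep the small LM's answer")
    print(f"AttenHScore-style score: {score:.2f} {verdict}")
```

In a real pipeline, `token_logprobs` and `attention` would come from the small LM's forward pass (for example, averaged over heads and layers); the random tensors above only demonstrate the control flow.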
The collaborative paradigm of large and small language models (LMs) effectively balances performance and cost, yet its pivotal challenge lies in precisely pinpointing the moment of invocation when hallucinations arise in small LMs. Previous optimization efforts primarily focused on post-processing techniques, which were separate from the reasoning process of LMs, resulting in high computational costs and limited effectiveness. In this paper, we propose a practical invocation evaluation metric called AttenHScore, which calculates the accumulation and propagation of hallucinations during the generation process of small LMs, continuously amplifying potential reasoning errors. By dynamically adjusting the detection threshold, we achieve more accurate real-time invocation of large LMs. Additionally, considering the limited reasoning capacity of small LMs, we leverage uncertainty-aware knowledge reorganization to help them better capture critical information from different text chunks. Extensive experiments reveal that AttenHScore outperforms most baselines in enhancing real-time hallucination detection across multiple QA datasets, especially when addressing complex queries. Moreover, our strategies require no additional model training and adapt flexibly to various transformer-based LMs.
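The abstract mentions uncertainty-aware knowledge reorganization but gives no algorithmic detail, so the following is only a speculative sketch of one way such a step could look: retrieved chunks are reordered by the uncertainty the small LM exhibits when each chunk is in context, so the most confidence-inducing chunks appear first in the prompt. The callback `score_uncertainty`, the function name `reorganize_chunks`, and the toy stand-in metric are all hypothetical.

```python
# Hedged sketch of "uncertainty-aware knowledge reorganization". The paper's
# actual procedure is not described in this summary; this assumes a simple
# scheme of reordering retrieved chunks by the uncertainty the small LM shows
# when answering with each chunk in context (lower uncertainty = more useful).
from typing import Callable, List


def reorganize_chunks(query: str,
                      chunks: List[str],
                      score_uncertainty: Callable[[str, str], float]) -> List[str]:
    """Order retrieved chunks so that the ones making the small LM most
    confident about `query` come first in the prompt."""
    scored = [(score_uncertainty(query, chunk), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0])  # ascending uncertainty
    return [chunk for _, chunk in scored]


if __name__ == "__main__":
    def toy_uncertainty(query: str, chunk: str) -> float:
        # Placeholder only: a real implementation would query the small LM,
        # e.g. mean token entropy of a draft answer conditioned on the chunk.
        return abs(len(chunk) - len(query)) / 100.0

    ordered = reorganize_chunks(
        "Who proposed AttenHScore?",
        ["chunk A ...", "a much longer retrieved chunk B ..."],
        score_uncertainty=toy_uncertainty,
    )
    print(ordered)
```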