Published: 2026/1/8 15:35:01

OptiSet Supercharges RAG! ✨ A Super-Efficient, High-Accuracy Service Is Born!

TL;DR: OptiSet smartly organizes RAG retrieval results! Better information curation means a big jump in service quality 💖

🌟 Gyaru-Style Sparkle Points ✨
● It's a technique for making smarter use of RAG retrieval results — think of it like a smarter search engine!
● OptiSet optimizes the information set (a collection of passages)! It cuts the waste and boosts efficiency 🎵
● It takes information gathered from all sorts of sources and bundles it up even better — super handy, right?

Here comes the detailed explanation!

Background: LLMs (large language models) are amazing, but they struggle when their information sources are limited 💦 That's why RAG (Retrieval-Augmented Generation), which pulls in external information, has been getting so much attention! The catch: the retrieved results tend to be a jumbled, redundant mess that's hard to use well 😢
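To see the problem concretely, here's a minimal sketch of the standard RAG baseline the paper critiques: each passage is scored for relevance to the query independently, and the top-k are handed to the generator. All passage names and scores below are hypothetical.

```python
def static_top_k(passages, relevance, k):
    """Select the k passages with the highest *individual* relevance scores."""
    ranked = sorted(passages, key=lambda p: relevance[p], reverse=True)
    return ranked[:k]

# Hypothetical candidate pool: p1 and p2 say essentially the same thing.
relevance = {"p1": 0.95, "p2": 0.94, "p3": 0.60, "p4": 0.55}
selected = static_top_k(list(relevance), relevance, k=2)
print(selected)  # ['p1', 'p2'] — both slots spent on near-duplicate evidence
```

Because each passage is scored in isolation, two near-duplicates can both make the cut, wasting the generator's context budget — exactly the redundancy problem a set-level view is meant to fix.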


OptiSet: Unified Optimizing Set Selection and Ranking for Retrieval-Augmented Generation

Yi Jiang / Sendong Zhao / Jianbo Li / Bairui Hu / Yanrui Du / Haochun Wang / Bing Qin

Retrieval-Augmented Generation (RAG) improves generation quality by incorporating evidence retrieved from large external corpora. However, most existing methods rely on statically selecting top-k passages based on individual relevance, which fails to exploit combinatorial gains among passages and often introduces substantial redundancy. To address this limitation, we propose OptiSet, a set-centric framework that unifies set selection and set-level ranking for RAG. OptiSet adopts an "Expand-then-Refine" paradigm: it first expands a query into multiple perspectives to enable a diverse candidate pool and then refines the candidate pool via re-selection to form a compact evidence set. We then devise a self-synthesis strategy without strong LLM supervision to derive preference labels from the set conditional utility changes of the generator, thereby identifying complementary and redundant evidence. Finally, we introduce a set-list wise training strategy that jointly optimizes set selection and set-level ranking, enabling the model to favor compact, high-gain evidence sets. Extensive experiments demonstrate that OptiSet improves performance on complex combinatorial problems and makes generation more efficient. The source code is publicly available.
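OptiSet learns set selection and set-level ranking jointly via its self-synthesized preference labels. As a rough intuition only (this is NOT OptiSet's learned method, just an MMR-style heuristic with hypothetical scores), "favoring compact, high-gain evidence sets" can be sketched as greedily adding the passage with the best marginal gain — relevance minus overlap with what's already chosen:

```python
def greedy_set_select(relevance, similarity, k, lam=0.5):
    """Greedy redundancy-aware selection: pick the passage whose relevance,
    discounted by its similarity to already-chosen passages, is largest."""
    chosen = []
    remaining = set(relevance)
    while remaining and len(chosen) < k:
        def marginal(p):
            overlap = max((similarity.get(frozenset((p, c)), 0.0)
                           for c in chosen), default=0.0)
            return relevance[p] - lam * overlap
        best = max(remaining, key=marginal)
        chosen.append(best)
        remaining.remove(best)
    return chosen

relevance = {"p1": 0.95, "p2": 0.94, "p3": 0.60}
similarity = {frozenset(("p1", "p2")): 0.9,  # p1 and p2 are near-duplicates
              frozenset(("p1", "p3")): 0.1,
              frozenset(("p2", "p3")): 0.1}
print(greedy_set_select(relevance, similarity, k=2))  # ['p1', 'p3']
```

Unlike static top-k, this picks p3 over the higher-scoring but redundant p2, because p3 contributes complementary evidence — the combinatorial gain the abstract refers to. OptiSet replaces this hand-tuned heuristic with a model trained on generator-utility preference labels.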

cs / cs.AI