Latest RAG! Blazing-fast search with dynamic memory 💨✨

Published: 2026/1/4 21:51:41

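Super-short summary: This is about a dynamic memory system that makes RAG retrieval way smarter!
Gal-style sparkle points ✨
- ● It remembers the important info and forgets the rest, just like human memory. Isn't that amazing? 😳
- ● No more getting tripped up by stale info, so you always get to search with the latest info. How great is that? 💖
- ● It might solve problems for IT companies and open up new business chances too! 🤩
Detailed explanation
- Background: RAG (retrieval-augmented generation) is a technique where an LLM (large language model) gets smarter by using external data 💡 But it's a problem when that data goes stale, right? 💦
- Method: Build "Adaptive RAG Memory (ARM)", which swaps the static vector index for a dynamic memory. Just like human memory, frequently retrieved info gets "remembered" (consolidated and protected), and rarely used info gradually gets "forgotten" (decays)! 🤩
- Results: On a lightweight retrieval benchmark, ARM gets near state-of-the-art retrieval (NDCG@5 ≈ 0.940, Recall@5 = 1.000) with only about 22M parameters in the embedding layer, and memory growth keeps itself in check 💖 Looks like there are tons of places to use it in the IT industry, right?
- Significance: With this, the info stays up to date, so users can grab the latest info 😎✨ Search quality could shoot way up, so work efficiency might go up too!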
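The remember/forget idea from the Method bullet can be sketched roughly like this. It is only a toy, not the paper's actual implementation: the class names, the multiplicative decay, the hit-count threshold for "consolidation", and every default value are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MemoryItem:
    text: str
    strength: float = 1.0  # hypothetical retention score
    hits: int = 0          # how many times this item was retrieved


@dataclass
class DecayingMemory:
    decay: float = 0.9          # assumed multiplicative decay per step
    boost: float = 1.0          # assumed strength gain per retrieval
    forget_below: float = 0.2   # assumed pruning threshold
    consolidate_after: int = 3  # assumed hits needed to become protected
    items: Dict[str, MemoryItem] = field(default_factory=dict)

    def add(self, key: str, text: str) -> None:
        """Store a new item at full strength."""
        self.items[key] = MemoryItem(text)

    def touch(self, key: str) -> None:
        """Record a retrieval: the item gets reinforced."""
        item = self.items[key]
        item.hits += 1
        item.strength += self.boost

    def step(self) -> None:
        """Advance time once: decay unprotected items and forget weak ones."""
        for key in list(self.items):
            item = self.items[key]
            if item.hits >= self.consolidate_after:
                continue  # consolidated items are protected from forgetting
            item.strength *= self.decay
            if item.strength < self.forget_below:
                del self.items[key]
```

In this sketch, an item that keeps getting retrieved crosses the `consolidate_after` threshold and stays, while an item nobody asks for shrinks a little on every `step()` until it drops out, which is what keeps the memory from growing forever.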
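Real-world use ideas 💡
- Put it in an FAQ chatbot so it can give accurate answers based on the latest info 🎉
- Use it in an in-house information search system so people can get to the info they need in a snap 💻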

A Dynamic Retrieval-Augmented Generation System with Selective Memory and Remembrance

Okan Bursa

We introduce Adaptive RAG Memory (ARM), a retrieval-augmented generation (RAG) framework that replaces a static vector index with a dynamic memory substrate governed by selective remembrance and decay. Frequently retrieved items are consolidated and protected from forgetting, while rarely used items gradually decay, inspired by cognitive consolidation and forgetting principles. On a lightweight retrieval benchmark, ARM reaches near state-of-the-art performance (e.g., NDCG@5 ≈ 0.940, Recall@5 = 1.000) with only ~22M parameters in the embedding layer, achieving the best efficiency among ultra-efficient models (<25M parameters). In addition, we compare static vs. dynamic RAG combinations across Llama 3.1 and GPT-4o. Llama 3.1 with static RAG achieves the highest key-term coverage (67.2%) at moderate latency, while GPT-4o with a dynamic selective retrieval policy attains the fastest responses (8.2s on average) with competitive coverage (58.7%). We further present an engineering optimization of the DynamicRAG implementation, making embedding weights configurable, adjustable at runtime, and robust to invalid settings. ARM yields competitive accuracy, self-regularizing memory growth, and interpretable retention dynamics without retraining the generator, and provides a practical trade-off between quality, latency, and memory efficiency for production and research RAG systems.
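For readers unfamiliar with the two retrieval metrics quoted in the abstract, the sketch below shows how NDCG@5 and Recall@5 are commonly computed for a single query. It assumes binary relevance (a document is either relevant or not) and unique document IDs in the ranking; the abstract does not state the paper's exact evaluation protocol, and the function names and example IDs here are hypothetical.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the relevant documents that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    found = len(set(ranked_ids[:k]) & relevant)
    return found / len(relevant)


def ndcg_at_k(ranked_ids, relevant_ids, k=5):
    """Binary-relevance NDCG: hits near the top of the ranking count more."""
    relevant = set(relevant_ids)
    # Discounted gain of the actual ranking (rank 0 gets discount log2(2) = 1).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant
    )
    # Best possible gain: all relevant documents packed at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical query: both relevant documents are retrieved, but not at the top.
ranked = ["d2", "d1", "d7", "d3", "d9"]
relevant = ["d1", "d3"]
recall = recall_at_k(ranked, relevant)
ndcg = ndcg_at_k(ranked, relevant)
```

Here Recall@5 is perfect because both relevant documents land in the top five, while NDCG@5 is lower because they sit at positions 2 and 4 rather than 1 and 2. A benchmark score is typically the average of these per-query values.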

cs / cs.IR / cs.AI
