Published: 2025/10/23 11:02:39

LLM repair hinges on data selection 💖 We found a super effective method!

Super summary: To fix an LLM's (AI's) weird outputs, we found a way to pick the repair data smartly! ✨

✨ Gal-style sparkle points ✨
● A method called SAPS is the strongest! It can fix up what the AI says really nicely!
● When repairing the AI's brain (the model), carefully selecting the data beats using all of it!
● It helps the IT industry (so cool!) build AI that's safe and actually usable!

Here comes the detailed rundown~!

Background: AI is smart, but sometimes it says weird things, right? 💦 There's a technique called "model repair" to fix that, but just repairing with any and all data isn't the way to go! What data you use, and how you use it, is what really matters~🤔


An Empirical Study of Sample Selection Strategies for Large Language Model Repair

Xuran Li / Jingyi Wang

Large language models (LLMs) are increasingly deployed in real-world systems, yet they can produce toxic or biased outputs that undermine safety and trust. Post-hoc model repair provides a practical remedy, but the high cost of parameter updates motivates selective use of repair data. Despite extensive prior work on data selection for model training, it remains unclear which sampling criteria are most effective and efficient when applied specifically to behavioral repair of large generative models. Our study presents a systematic analysis of sample prioritization strategies for LLM repair. We evaluate five representative selection methods: random sampling, K-Center, gradient-norm-based selection (GraNd), stratified coverage sampling (CCS), and a Semantic-Aware Prioritized Sampling (SAPS) approach that we propose. Repair effectiveness and trade-offs are assessed through toxicity reduction, perplexity on WikiText-2 and LAMBADA, and three composite metrics: the Repair Proximity Score (RPS), the Overall Performance Score (OPS), and the Repair Efficiency Score (RES). Experimental results show that SAPS achieves the best balance between detoxification, utility preservation, and efficiency, delivering comparable or superior repair outcomes with substantially less data. Random sampling remains effective for large or robust models, while high-overhead methods such as CCS and GraNd provide limited benefit. The optimal data proportion depends on model scale and repair method, indicating that sample selection should be regarded as a tunable component of repair pipelines. Overall, these findings establish selection-based repair as an efficient and scalable paradigm for maintaining LLM reliability.
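To make one of the baselines above concrete, here is a minimal sketch of greedy K-Center selection over sample embeddings, the standard 2-approximation used for diversity-based subset selection. This is an illustrative reimplementation, not the paper's code; the function name and the toy embeddings are assumptions.

```python
import numpy as np

def k_center_greedy(embeddings: np.ndarray, k: int, seed: int = 0) -> list[int]:
    """Greedy K-Center: repeatedly pick the sample farthest from the
    already-selected set, so the chosen subset covers the embedding
    space as evenly as possible."""
    rng = np.random.default_rng(seed)
    n = embeddings.shape[0]
    first = int(rng.integers(n))
    selected = [first]
    # distance from every point to its nearest selected center
    dists = np.linalg.norm(embeddings - embeddings[first], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))           # farthest point so far
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(embeddings - embeddings[nxt], axis=1))
    return selected

# toy demo: pick 5 diverse points from 100 random 2-D "embeddings"
X = np.random.default_rng(1).normal(size=(100, 2))
picked = k_center_greedy(X, k=5)
print(picked)
```

In a repair pipeline, `embeddings` would come from encoding candidate repair samples (e.g. with a sentence encoder), and the returned indices give the subset fed to the repair step; methods like GraNd instead rank samples by per-example gradient norm, trading this geometric diversity for loss sensitivity.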

cs / cs.LG