Ultra-short summary: A new technique for building smart AI while keeping compute costs down!
✨ Gyaru-style sparkle points ✨ ● Cutting compute costs could make AI way more accessible! 💸 ● AI that's strong on specialized knowledge could level up all kinds of fields! ✨ ● It prevents catastrophic forgetting, so the AI stays smart forever 💖
Background: Today's AI is amazing, but it costs a ton 💸! API fees are steep, and its specialized knowledge can be lacking… 😭 So researchers set out to make smart AI cheaper and usable across way more fields! 💖
In recent years, researchers working on Pretrained Large Models (PLMs) have proposed large-small model collaboration frameworks that leverage easily trainable small models to assist large models, aiming to (1) significantly reduce computational resource consumption while maintaining comparable accuracy, and (2) enhance large model performance on specialized domain tasks. However, this collaborative paradigm suffers from issues such as significant accuracy degradation, exacerbated catastrophic forgetting, and amplified hallucination induced by the small model's knowledge. To address these challenges, we propose a KAN-based Collaborative Model (KCM) as an improved approach to large-small model collaboration. The Kolmogorov-Arnold Network (KAN) used in KCM is an alternative neural network architecture distinct from conventional MLPs; compared to MLPs, KAN offers superior visualizability and interpretability while mitigating catastrophic forgetting. We deployed KCM in large-small model collaborative systems across three scenarios: language, vision, and vision-language cross-modal tasks. The experimental results demonstrate that, compared with pure large-model approaches, the large-small collaboration framework using KCM as the collaborative model significantly reduces the number of large model inference calls while maintaining near-identical task accuracy, thereby substantially lowering computational resource consumption. Concurrently, the KAN-based small collaborative model markedly mitigates catastrophic forgetting, yielding significant accuracy improvements on long-tail data. The results show that KCM outperforms MLP-based small collaborative models (MCM) across all metrics.
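The abstract does not spell out how the collaboration is mediated, so the following is only a minimal sketch under assumptions of my own: I assume the small collaborative model emits class probabilities and the framework defers to the large model whenever the small model's confidence falls below a threshold, and I parameterize the KAN layer's learnable univariate edge functions with Gaussian radial basis functions rather than the B-spline bases of the original KAN formulation. All names here (`RBFKANLayer`, `SmallKANClassifier`, `collaborative_predict`, `CONFIDENCE_THRESHOLD`) are hypothetical, not from the paper.

```python
# Sketch only: confidence-threshold routing and RBF bases are assumptions,
# not the paper's documented mechanism.
import torch
import torch.nn as nn


class RBFKANLayer(nn.Module):
    """One KAN-style layer: each edge applies a learnable univariate function,
    parameterized here as a weighted sum of fixed Gaussian radial basis
    functions (a simplification of the B-splines in the original KAN)."""

    def __init__(self, in_dim, out_dim, num_basis=8, x_min=-2.0, x_max=2.0):
        super().__init__()
        self.register_buffer("centers", torch.linspace(x_min, x_max, num_basis))
        self.gamma = (num_basis / (x_max - x_min)) ** 2  # RBF width
        # One learnable coefficient per (output, input, basis) triple.
        self.coef = nn.Parameter(torch.randn(out_dim, in_dim, num_basis) * 0.1)

    def forward(self, x):  # x: (batch, in_dim)
        # RBF responses for every input value: (batch, in_dim, num_basis).
        phi = torch.exp(-self.gamma * (x.unsqueeze(-1) - self.centers) ** 2)
        # Sum the per-edge univariate functions into each output node.
        return torch.einsum("bik,oik->bo", phi, self.coef)


class SmallKANClassifier(nn.Module):
    """The small collaborative model: a two-layer KAN ending in class logits."""

    def __init__(self, in_dim, hidden, num_classes):
        super().__init__()
        self.net = nn.Sequential(RBFKANLayer(in_dim, hidden),
                                 RBFKANLayer(hidden, num_classes))

    def forward(self, x):
        return self.net(x)


CONFIDENCE_THRESHOLD = 0.9  # hypothetical routing threshold


def collaborative_predict(x, small_model, large_model_predict):
    """Keep the small model's answer when it is confident; otherwise fall
    back to the (expensive) large model for just those samples."""
    with torch.no_grad():
        probs = torch.softmax(small_model(x), dim=-1)
    conf, preds = probs.max(dim=-1)
    unsure = conf < CONFIDENCE_THRESHOLD
    if unsure.any():  # only these samples incur a large-model call
        preds[unsure] = large_model_predict(x[unsure])
    return preds, unsure.float().mean()  # predictions + fraction deferred


if __name__ == "__main__":
    small = SmallKANClassifier(in_dim=4, hidden=16, num_classes=3)
    # Stand-in for an expensive PLM call; here just a fixed answer.
    oracle = lambda x: torch.zeros(len(x), dtype=torch.long)
    preds, deferred = collaborative_predict(torch.randn(32, 4), small, oracle)
    print(f"deferred to large model on {deferred.item():.0%} of samples")
```

Under this routing scheme, the deferral rate directly measures the saved large-model calls: every sample the small model answers confidently is one fewer expensive inference, which is the cost saving the abstract quantifies.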