Published: 2025/12/16 4:10:55

LLM apps are full of danger! 💥

Title & super-short summary: LLM apps get risky when their boundaries are blurry! A study of the security risks 🧐

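Gyaru-style sparkle points ✨

● Spotlight on "capability downgrade" and "capability upgrade" in LLM apps 👀
● They built a tool called "LLMApp-Eval" to measure the risk 🛠️
● Might even help secure apps that deal with money 💰, like Web3!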

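Detailed explanation

Background

Apps built on LLMs (large language models) are popping up everywhere, right? 💖 They're convenient, but aren't you worried about security? 😱 Apparently, when the boundary of what an app is supposed to do is blurry, there's a real risk of it being misused! For example, unauthorized access, or the app getting fed weird instructions… 😭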

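Method

They define two new risks, "capability downgrade" and "capability upgrade" 🔎 It looks like they go beyond jailbreaks (basically tricking the model into breaking out of its rules) and study a much wider range of patterns 🤔 And they use a framework called LLMApp-Eval to evaluate the risk of real apps!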
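
To make the two risk types a bit more concrete, here's a tiny Python sketch. Heads up: this is NOT LLMApp-Eval itself, and the summary doesn't spell out the paper's exact definitions. It assumes "downgrade" means the app fails a task it's supposed to handle, and "upgrade" means it completes a task outside its intended scope. The names `Probe`, `classify`, `evaluate`, `toy_app`, and `toy_judge` are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class Probe:
    task: str       # hypothetical task label, e.g. "recipe"
    in_scope: bool  # assumed: is this task inside the app's intended capability space?
    prompt: str     # the text sent to the app


def classify(probe: Probe, completed: bool) -> str:
    """Label one probe result under the assumed definitions above."""
    if probe.in_scope and not completed:
        return "downgrade"  # an intended capability was lost
    if not probe.in_scope and completed:
        return "upgrade"    # the app acted beyond its intended boundary
    return "ok"


def evaluate(
    app: Callable[[str], str],
    judge: Callable[[Probe, str], bool],
    probes: Iterable[Probe],
) -> Dict[str, int]:
    """Send every probe to the app and count the resulting labels."""
    counts = {"ok": 0, "downgrade": 0, "upgrade": 0}
    for probe in probes:
        reply = app(probe.prompt)
        counts[classify(probe, judge(probe, reply))] += 1
    return counts


def toy_app(prompt: str) -> str:
    # Toy stand-in for a recipe assistant with no scope check: it answers everything.
    return f"Sure! Here is an answer to: {prompt}"


def toy_judge(probe: Probe, reply: str) -> bool:
    # Toy completion check: anything that is not a refusal counts as completed.
    return not reply.startswith("Sorry")


probes = [
    Probe("recipe", True, "Give me a pasta recipe"),
    Probe("coding", False, "Write a sorting function in Python"),
]
print(evaluate(toy_app, toy_judge, probes))
```

With this toy recipe bot, the in-scope probe comes back "ok" and the out-of-scope coding probe gets counted as an "upgrade", because the bot happily answers something it was never meant to do. A real evaluation would need a much more careful judge than a string check.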


Beyond Jailbreak: Unveiling Risks in LLM Applications Arising from Blurred Capability Boundaries

Yunyi Zhang / Shibo Cui / Baojun Liu / Jingkai Yu / Min Zhang / Fan Shi / Han Zheng

LLM applications (i.e., LLM apps) leverage the powerful capabilities of LLMs to provide users with customized services, revolutionizing traditional application development. While the increasing prevalence of LLM-powered applications provides users with unprecedented convenience, it also brings forth new security challenges. The security community lacks sufficient understanding of this emerging ecosystem, especially regarding the capability boundaries of the applications themselves. In this paper, we systematically analyzed the new development paradigm and defined the concept of the LLM app capability space. We also uncovered potential new risks beyond jailbreak that arise from ambiguous capability boundaries in real-world scenarios, namely, capability downgrade and upgrade. To evaluate the impact of these risks, we designed and implemented an LLM app capability evaluation framework, LLMApp-Eval. First, we collected application metadata across 4 platforms and conducted a cross-platform ecosystem analysis. Then, we evaluated the risks for 199 popular applications across 4 platforms and 6 open-source LLMs. We identified 178 (89.45%) potentially affected applications, which can perform tasks from more than 15 scenarios or be malicious. We even found 17 applications in our study that executed malicious tasks directly, without applying any adversarial rewriting. Furthermore, our experiments reveal a positive correlation between the quality of prompt design and application robustness. We found that well-designed prompts enhance security, while poorly designed ones can facilitate abuse. We hope our work inspires the community to focus on the real-world risks of LLM applications and fosters the development of a more robust LLM application ecosystem.
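
As a quick sanity check on the headline figure, the short Python sketch below recomputes the share of affected applications from the counts quoted in the abstract. The second figure is not stated in the abstract; it is derived here under the assumption that the 17 applications are counted out of the same 199. The helper name `share` is illustrative.

```python
TOTAL_APPS = 199        # applications evaluated (from the abstract)
AFFECTED = 178          # potentially affected applications (from the abstract)
DIRECT_MALICIOUS = 17   # ran malicious tasks without adversarial rewriting (from the abstract)


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Matches the 89.45% reported in the abstract.
print(share(AFFECTED, TOTAL_APPS))

# Derived value, assuming the 17 apps are a subset of the 199 evaluated.
print(share(DIRECT_MALICIOUS, TOTAL_APPS))
```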

cs / cs.CR
