超要約: ウェブのチャットボット、悪意のある命令で乗っ取られる危険性😱 セキュリティ対策しよ!
🌟 ギャル的キラキラポイント✨ ● チャットボット(AIがおしゃべりするやつ)のセキュリティって、意外と穴だらけなの!😭 ● サードパーティ製のプラグイン(追加機能)が、攻撃の入り口になるみたい😳 ● ビジネスチャンスもいっぱい!セキュリティ診断とか、プラグイン開発とか、アツい🔥
詳細解説 ● 背景 AIチャットボット、ウェブサイトにめっちゃ増えたよね!🎉 でも、悪意のある人が命令(プロンプトインジェクション)を仕込むと、変なこと言っちゃう可能性があるんだって!😱 サードパーティ製のプラグインが、その危険性を高めるみたい。
● 方法 17種類のプラグインを調べて、1万以上のサイトを調査した結果、チャットボットの脆弱性(ぜいじゃくせい)を特定したみたい。具体的には、会話履歴を改ざんされたり、変な情報が連携(れんけい)されたりするらしい!
続きは「らくらく論文」アプリで
Prompt injection attacks pose a critical threat to large language models (LLMs), with prior work focusing on cutting-edge LLM applications like personal copilots. In contrast, simpler LLM applications, such as customer service chatbots, are widespread on the web, yet their security posture and exposure to such attacks remain poorly understood. These applications often rely on third-party chatbot plugins that act as intermediaries to commercial LLM APIs, offering non-expert website builders intuitive ways to customize chatbot behaviors. To bridge this gap, we present the first large-scale study of 17 third-party chatbot plugins used by over 10,000 public websites, uncovering previously unknown prompt injection risks in practice. First, 8 of these plugins (used by 8,000 websites) fail to enforce the integrity of the conversation history transmitted in network requests between the website visitor and the chatbot. This oversight amplifies the impact of direct prompt injection attacks by allowing adversaries to forge conversation histories (including fake system messages), boosting their ability to elicit unintended behavior (e.g., code generation) by 3 to 8x. Second, 15 plugins offer tools, such as web-scraping, to enrich the chatbot's context with website-specific content. However, these tools do not distinguish the website's trusted content (e.g., product descriptions) from untrusted, third-party content (e.g., customer reviews), introducing a risk of indirect prompt injection. Notably, we found that ~13% of e-commerce websites have already exposed their chatbots to third-party content. We systematically evaluate both vulnerabilities through controlled experiments grounded in real-world observations, focusing on factors such as system prompt design and the underlying LLM. Our findings show that many plugins adopt insecure practices that undermine the built-in LLM safeguards.