Published: 2025/12/3 19:30:07

Healthcare AI: Balancing Safety and Helpfulness ✨

Super-short summary: research on making AI assistants both safe and easy to use!

🌟 Sparkle Points
● A future where AI plays a big role in healthcare (health management) might be coming 💖
● They're using all kinds of tricks to build safe AI. Amazing ✨
● They're aiming for an AI that makes both patients and doctors happy. Divine 🥺

Detailed Explanation
● Background: Healthcare AI can give patients helpful information, but it's a big problem if it says something wrong, right? 😱 That's why researchers are working on AI that is both safe and genuinely useful. The goal is to raise the quality of medical care 👍

● Method: They repeatedly teach the AI "this is good" and "this is bad" so it keeps getting smarter! ✨ They also fold in feedback from real-world users to raise an even better AI. Truly hands-on AI parenting 👨‍⚕️

Read the rest in the 「らくらく論文」 app

Balancing Safety and Helpfulness in Healthcare AI Assistants through Iterative Preference Alignment

Huy Nghiem / Swetasudha Panda / Devashish Khatwani / Huy V. Nguyen / Krishnaram Kenthapadi / Hal Daumé III

Large Language Models (LLMs) are increasingly used in healthcare, yet ensuring their safety and trustworthiness remains a barrier to deployment. Conversational medical assistants must avoid unsafe compliance without over-refusing benign queries. We present an iterative post-deployment alignment framework that applies Kahneman-Tversky Optimization (KTO) and Direct Preference Optimization (DPO) to refine models against domain-specific safety signals. Using the CARES-18K benchmark for adversarial robustness, we evaluate four LLMs (Llama-3B/8B, Meditron-8B, Mistral-7B) across multiple cycles. Our results show up to 42% improvement in safety-related metrics for harmful query detection, alongside interesting trade-offs against erroneous refusals, thereby exposing architecture-dependent calibration biases. We also perform ablation studies to identify when self-evaluation is reliable and when external or finetuned judges are necessary to maximize performance gains. Our findings underscore the importance of adopting best practices that balance patient safety, user trust, and clinical utility in the design of conversational medical assistants.
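The abstract's alignment step builds on Direct Preference Optimization. As a rough, minimal sketch of the standard DPO pairwise objective (not the authors' implementation — the function name and scalar log-probability inputs are illustrative assumptions), the loss for one (chosen, rejected) response pair can be written as:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed token log-probability of a response
    under the trainable policy or the frozen reference model.
    beta scales the implicit reward; higher beta keeps the policy
    closer to the reference.
    """
    # Implicit reward margins relative to the reference model
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log sigmoid(logits): shrinks as the policy prefers the
    # chosen response more strongly than the reference does
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When policy and reference agree exactly, the margins cancel and the loss sits at log 2 ≈ 0.693; training pushes it below that by widening the chosen-over-rejected gap. KTO replaces this paired formulation with per-example "desirable/undesirable" labels, which is why the paper can use real-world user signals that don't come in neat pairs.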

cs / cs.AI / cs.CL / cs.CY