Ultra-short summary: AI can now write Vietnamese PET/CT imaging reports! Medical IT is seriously next-level!
✨ Sparkly Highlights ✨
● Super-fast generation of PET/CT reports, even in Vietnamese — amazing! 🌟
● Medical-imaging AI is entering an era where it supports many languages, not just English — doesn't that hit you right in the feels? 🥺
● You can feel the potential to revolutionize the medical IT industry! 🔥
Detailed Explanation
Background: Existing AI was great at analyzing English-language CT and MRI images, but it struggled with PET/CT and with Vietnamese. This study managed to smash through that wall! 🥳
Method: The researchers built a set of Vietnamese PET/CT images paired with reports and trained the AI on it thoroughly! By combining images and text, the model learned to produce smarter, more accurate reports! ✍️
Vision-Language Foundation Models (VLMs), trained on large-scale multimodal datasets, have driven significant advances in Artificial Intelligence (AI) by enabling rich cross-modal reasoning. Despite their success in general domains, applying these models to medical imaging remains challenging due to the limited availability of diverse imaging modalities and multilingual clinical data. Most existing medical VLMs are trained on a subset of imaging modalities and focus primarily on high-resource languages, limiting their generalizability and clinical utility. To address these limitations, we introduce a novel Vietnamese-language multimodal medical dataset consisting of 2,757 whole-body PET/CT volumes from independent patients and their corresponding full-length clinical reports. This dataset is designed to fill two pressing gaps in medical AI development: (1) the lack of PET/CT imaging data in existing VLM training corpora, which hinders the development of models capable of handling functional imaging tasks; and (2) the underrepresentation of low-resource languages, particularly Vietnamese, in medical vision-language research. To the best of our knowledge, this is the first dataset to provide comprehensive PET/CT-report pairs in Vietnamese. We further introduce a training framework to enhance VLM learning, including data augmentation and expert-validated test sets. We conduct comprehensive experiments benchmarking state-of-the-art VLMs on downstream tasks. The experimental results show that incorporating our dataset significantly improves the performance of existing VLMs. We believe this dataset and benchmark will serve as a pivotal step in advancing the development of more robust VLMs for medical imaging, especially for low-resource languages and clinical use in Vietnamese healthcare. The source code is available at https://github.com/AIoT-Lab-BKAI/ViPET-ReportGen.
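The abstract describes a corpus of paired whole-body PET/CT volumes and Vietnamese reports, plus a training framework with data augmentation. As a rough illustration only (the paper's actual data format, class names, and augmentation strategy are not specified here), a paired volume/report dataset with a toy text-side augmentation might be sketched like this:

```python
# Hypothetical sketch of a paired PET/CT volume + Vietnamese report dataset.
# All names (PETCTSample, PairedPETCTDataset) and the sentence-shuffle
# augmentation are illustrative assumptions, not the paper's implementation.
from dataclasses import dataclass
from typing import List, Tuple
import random


@dataclass
class PETCTSample:
    volume_id: str   # identifier of one whole-body PET/CT volume
    report_vi: str   # full-length Vietnamese clinical report


class PairedPETCTDataset:
    """Pairs each PET/CT volume with its report; optionally augments the text."""

    def __init__(self, samples: List[PETCTSample], augment: bool = False, seed: int = 0):
        self.samples = samples
        self.augment = augment
        self.rng = random.Random(seed)

    def __len__(self) -> int:
        return len(self.samples)

    def __getitem__(self, idx: int) -> Tuple[str, str]:
        s = self.samples[idx]
        report = s.report_vi
        if self.augment:
            # Toy augmentation: shuffle sentence order. A stand-in for the
            # (unspecified) augmentation used in the paper's framework.
            sentences = [t.strip() for t in report.split(".") if t.strip()]
            self.rng.shuffle(sentences)
            report = ". ".join(sentences) + "."
        return s.volume_id, report


samples = [
    PETCTSample(
        "vol_0001",
        "Không thấy tổn thương tăng chuyển hóa bất thường. Kết luận: bình thường.",
    )
]
ds = PairedPETCTDataset(samples, augment=True)
vid, rpt = ds[0]
```

In a real pipeline, `volume_id` would resolve to the voxel data (e.g. a DICOM series) and the pairs would feed a VLM fine-tuning loop; this sketch only shows the pairing and augmentation structure.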