TL;DR: Analyzing code reviews with AI! It's about catching architecture erosion (drift between design and implementation) early 💖
● Code review comments 📝 are analyzed with AI (machine learning & deep learning models) — so smart!
● The goal is to spot architecture erosion early 👀 and raise software quality!
● Unlike existing tools, it focuses on the *words* in code reviews — a fresh angle 💎
Background
A software system's design blueprint (its architecture) really matters, right? 💻 But as development goes on, "architecture erosion" creeps in: the implementation drifts away from the intended design. Left unchecked, it degrades system quality, so you want to catch it early! 🤔
Method
Code review comments 📝 are full of hints about architecture violations! So the authors built AI models that pick out signs of violations from those comments. They tried lots of different models to find the best-performing one 🌟
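The idea above can be sketched roughly like this: turn each review comment into a vector by averaging its word embeddings, then train an SVM to flag violation symptoms. This is a minimal toy illustration, not the paper's actual pipeline — the vocabulary, sample comments, and random stand-in vectors (in place of real pre-trained word2vec) are all made up here.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
DIM = 200  # the 200-dimensional setting the paper reports as strongest

# Toy vocabulary with random vectors standing in for pre-trained word2vec.
vocab = {w: rng.normal(size=DIM) for w in
         ("this change violates the intended layering module must not "
          "import ui looks good to me nice cleanup thanks").split()}

def embed(comment: str) -> np.ndarray:
    """Average the embeddings of known words; zeros if none are known."""
    vecs = [vocab[w] for w in comment.lower().split() if w in vocab]
    return np.mean(vecs, axis=0) if vecs else np.zeros(DIM)

# Tiny labeled sample: 1 = violation symptom, 0 = unrelated review chatter.
comments = [
    "this change violates the intended layering",
    "module must not import ui",
    "looks good to me",
    "nice cleanup thanks",
]
labels = [1, 1, 0, 0]

X = np.stack([embed(c) for c in comments])
clf = SVC(kernel="linear").fit(X, labels)
print(clf.score(X, labels))  # training accuracy on the toy sample
```

With real data you would swap the toy vocabulary for embeddings trained on a large corpus and evaluate on held-out comments rather than the training set.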
Architecture erosion has a detrimental effect on maintenance and evolution, as the implementation deviates from the intended architecture. Detecting symptoms of erosion, particularly architectural violations, at an early stage is crucial. This paper explores the automated identification of violation symptoms from developer discussions in code reviews. We developed 15 machine learning-based and 4 deep learning-based classifiers using three pre-trained word embeddings, and evaluated them on code review comments from four large open-source projects (OpenStack Nova/Neutron and Qt Base/Creator). To validate practical value, we conducted surveys and semi-structured interviews with developers involved in these discussions. We further compared traditional ML/DL classifiers with Large Language Models (LLMs) such as GPT-4o, Qwen-2.5, and DeepSeek-R1. Results show that SVM with word2vec achieved the best ML/DL performance with an F1-score of 0.779, while fastText embeddings also yielded strong results. Ensemble voting strategies enhanced traditional classifiers, and 200-dimensional embeddings generally outperformed 100/300-dimensional ones. LLM-based classifiers consistently surpassed ML/DL models, with GPT-4o achieving the best F1-score of 0.851, though ensembles added no further benefits. Overall, our study provides an automated approach to identify architecture violation symptoms, offers systematic comparisons of ML/DL and LLM methods, and delivers practitioner insights, contributing to sustainable architectural conformance in software systems.
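The abstract mentions that ensemble voting strategies improved the traditional classifiers. A hard-voting ensemble can be sketched with scikit-learn as below — a minimal illustration on invented sample comments, using TF-IDF features for self-containment rather than the paper's word2vec/fastText embeddings:

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Invented toy data: 1 = violation symptom, 0 = ordinary review chatter.
comments = [
    "this violates the intended layering",
    "please do not bypass the api layer",
    "this dependency violates module boundaries",
    "lgtm nice work",
    "thanks looks good",
    "minor typo fix lgtm",
]
labels = [1, 1, 1, 0, 0, 0]

# Hard voting: each base classifier casts one vote, majority label wins.
vote = make_pipeline(
    TfidfVectorizer(),
    VotingClassifier(
        estimators=[
            ("svm", SVC(kernel="linear")),
            ("lr", LogisticRegression()),
            ("nb", MultinomialNB()),
        ],
        voting="hard",
    ),
)
vote.fit(comments, labels)
print(vote.score(comments, labels))  # training accuracy on the toy sample
```

Majority voting can mask the occasional mistake of any single base model, which is one plausible reason ensembles helped the traditional classifiers here.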