The Ultimate API Testing Is Here! LLMs Send Coverage Through the Roof 🚀

Published: 2026/1/7 7:02:36

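Super summary: MioHint makes API testing smarter with LLMs (AI)! The story is that coverage (how much of the code the tests reach) goes up ✨
Gal-style sparkle points ✨
- ● The API testing headache 💥 known as the "fitness plateau" gets solved by LLMs!
- ● LLMs are great at understanding code, so they crank out test cases super fast 💨
- ● Cloud services (services on the internet) look set for a big quality boost 💖
Detailed explanation
- Background: Cloud services keep growing, so testing APIs (how programs talk to each other) is super important! But existing tests struggled to reach branches with strict conditions, because coverage metrics give the search no gradient to follow 😭
- Method: Have an LLM read the API's code and come up with test cases! Static analysis (def-use analysis plus function expansion) picks out just the relevant code, so the hard-to-test parts get the most attention!
- Results: Across 16 real-world REST API services, line coverage rose by 4.95 percentage points on average over the baseline, EvoMaster, and MioHint covered over 57% of hard-to-cover targets (under 10% for the baseline) 🥳
- Significance (the seriously amazing ♡ point): Service quality goes up and development gets smoother ✨ It might even boost competitiveness!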
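To make the "fitness plateau" idea a bit more concrete, here is a tiny sketch. Everything in it is an assumption made for illustration: the `handler` function, the coupon string, and the 0/1 fitness score are hypothetical and are not taken from MioHint or EvoMaster. It only shows why a coverage-only score gives random mutation nothing to climb when a branch needs one exact value.

```python
import random


def handler(code: str) -> str:
    """Hypothetical API handler with one strict, hard-to-reach branch."""
    if code == "VIP-2026":         # strict equality: exactly one input passes
        return "discount applied"  # the hard-to-cover target
    return "no discount"


def coverage_fitness(code: str) -> int:
    """Coverage-style fitness: 1 if the target branch ran, otherwise 0."""
    return 1 if handler(code) == "discount applied" else 0


def random_code(rng: random.Random, length: int = 8) -> str:
    """Blind mutation stand-in: a random 8-character string."""
    alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-"
    return "".join(rng.choice(alphabet) for _ in range(length))


rng = random.Random(0)
hits = sum(coverage_fitness(random_code(rng)) for _ in range(10_000))

# A near miss scores exactly the same as junk, so there is no slope to follow.
near_miss = coverage_fitness("VIP-2025")
junk = coverage_fitness("????????")
print(hits, near_miss, junk)
```

With 37 possible characters in 8 positions, a blind hit is astronomically unlikely, and "VIP-2025" earns no more credit than nonsense. Reading the condition in the source code, which is the kind of hint an LLM could supply, gives the required value directly.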
Ideas for real-world use 💡
- 💡 Make API testing easier for dev teams at IT companies!
- 💡 If you're a programmer, put the APIs you built through the ultimate test!

LLM-assisted Mutation for Whitebox API Testing

Jia Li / Jiacheng Shen / Yuxin Su / Michael R. Lyu

Cloud applications heavily rely on APIs to communicate with each other and exchange data. To ensure the reliability of cloud applications, cloud providers widely adopt API testing techniques. Unfortunately, existing API testing approaches are insufficient to reach strict conditions, a problem known as fitness plateaus, due to the lack of gradient provided by coverage metrics. To address this issue, we propose MioHint, a novel white-box API testing approach that leverages the code comprehension capabilities of Large Language Model (LLM) to boost API testing. The key challenge of LLM-based API testing lies in system-level testing, which emphasizes the dependencies between requests and targets across functions and files, thereby making the entire codebase the object of analysis. However, feeding the entire codebase to an LLM is impractical due to its limited context length and short memory. MioHint addresses this challenge by synergizing static analysis with LLMs. We retrieve relevant code with data-dependency analysis at the statement level, including def-use analysis for variables used in the target and function expansion for subfunctions called by the target. To evaluate the effectiveness of our method, we conducted experiments across 16 real-world REST API services. The findings reveal that MioHint achieves an average increase of 4.95% absolute in line coverage compared to the baseline, EvoMaster, alongside a remarkable factor of 67x improvement in mutation accuracy. Furthermore, our method successfully covers over 57% of hard-to-cover targets while in baseline the coverage is less than 10%.
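The retrieval step described in the abstract (def-use analysis for variables used in the target, plus function expansion for called subfunctions) can be sketched roughly as follows. This is a minimal illustration using Python's standard `ast` module, not MioHint's actual implementation: the sample `SOURCE`, the helper names `names` and `retrieve_context`, and the choice to identify the target by line number are all assumptions. It handles straight-line code only and ignores control flow, aliasing, and multi-line statements.

```python
import ast

# Hypothetical service code; line 9 holds the hard-to-cover condition.
SOURCE = '''
def normalize(s):
    return s.strip().upper()

def handler(request):
    raw = request["coupon"]
    unused = request["user"]
    code = normalize(raw)
    if code == "VIP-2026":
        return "discount applied"
    return "no discount"
'''


def names(node, ctx):
    """Names under `node` that are read (ast.Load) or written (ast.Store)."""
    return {n.id for n in ast.walk(node)
            if isinstance(n, ast.Name) and isinstance(n.ctx, ctx)}


def retrieve_context(source, func_name, target_lineno):
    tree = ast.parse(source)
    funcs = {f.name: f for f in tree.body if isinstance(f, ast.FunctionDef)}
    body = funcs[func_name].body
    target = next(s for s in body if s.lineno == target_lineno)
    # For an `if`, only its condition counts as the target expression.
    cond = target.test if isinstance(target, ast.If) else target
    needed = names(cond, ast.Load)
    kept = [target_lineno]
    # Def-use: walk backwards, keeping statements that define a needed name.
    for stmt in reversed([s for s in body if s.lineno < target_lineno]):
        defs = names(stmt, ast.Store)
        if defs & needed:
            kept.append(stmt.lineno)
            needed = (needed - defs) | names(stmt, ast.Load)
    # Function expansion: pull in the source of any called local function.
    callees = sorted(n for n in needed if n in funcs and n != func_name)
    src_lines = source.splitlines()
    parts = [ast.get_source_segment(source, funcs[c]) for c in callees]
    parts += [src_lines[n - 1].strip() for n in sorted(kept)]
    return {"lines": sorted(kept), "callees": callees,
            "snippet": "\n".join(parts)}


result = retrieve_context(SOURCE, "handler", 9)
print(result["snippet"])
```

The resulting snippet keeps the statements that feed the condition and the body of `normalize`, while the unrelated `unused` assignment is dropped. A prompt built from such a slice stays far smaller than the whole codebase, which is the motivation the abstract gives for combining static analysis with the LLM.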

cs / cs.SE
