Super-summary: AI rates the quality of your movements! And it even tells you what to fix. How cool is that? 😍
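✨ Gal-style sparkle points ✨
● AI analyzes your fitness and martial arts movements! 💖
● It explains why a movement is off, so it's super easy to understand! 💡
● Looks like it could be used for AR/VR and healthcare too! So futuristic 💕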
Here comes the detailed breakdown!
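Background: This research gets AI to watch a video and tell whether a movement is good or bad! Until now, AI could work out what you were doing, but not how well you were doing it. If it could evaluate movements in fitness and martial arts, that would be useful for so many more things, right? 💪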
Evaluating whether a human action is performed in a standard way, and providing reasonable feedback to improve it, is crucial but challenging in real-world scenarios. However, current video understanding methods are mainly concerned with what the action is and where it occurs, which does not meet these requirements. Meanwhile, most existing datasets lack labels indicating the degree of action standardization, and action quality assessment datasets lack explainability and detailed feedback. Therefore, we define a new Human Action Form Assessment (AFA) task and introduce a new, diverse dataset, CoT-AFA, which contains a large collection of fitness and martial arts videos with multi-level annotations for comprehensive video analysis. We enrich the CoT-AFA dataset with a novel Chain-of-Thought explanation paradigm. Instead of offering isolated feedback, our explanations provide a complete reasoning process: identifying an action step, analyzing its outcome, and proposing a concrete solution. Furthermore, we propose a framework named Explainable Fitness Assessor, which can not only judge an action but also explain why and provide a solution. This framework employs two parallel processing streams and a dynamic gating mechanism to fuse visual and semantic information, thereby boosting its analytical capabilities. The experimental results demonstrate that our method achieves improvements in explanation generation (e.g., +16.0% in CIDEr), action classification (+2.7% in accuracy), and quality assessment (+2.1% in accuracy), revealing the great potential of CoT-AFA for future studies. Our dataset and source code are available at https://github.com/MICLAB-BUPT/EFA.
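The abstract mentions a dynamic gating mechanism that fuses visual and semantic information but does not give its form. As a rough illustration only, the sketch below shows one common, generic way to gate two feature vectors: a sigmoid gate computed from their concatenation decides, per dimension, how much of each stream to keep. The function names, the weight layout, and the toy inputs are all assumptions for illustration and are not taken from the Explainable Fitness Assessor code.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fusion(visual, semantic, weights, bias):
    """Generic gated fusion of two equal-length feature vectors.

    Hypothetical formulation (not the paper's):
        g     = sigmoid(W @ [visual; semantic] + b)
        fused = g * visual + (1 - g) * semantic
    `weights` has one row per output dimension, each row spanning the
    concatenated input.
    """
    concat = list(visual) + list(semantic)
    fused, gates = [], []
    for i, (v, s) in enumerate(zip(visual, semantic)):
        logit = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        g = sigmoid(logit)
        gates.append(g)
        fused.append(g * v + (1.0 - g) * s)
    return fused, gates


# Toy inputs (made up). With all-zero weights and bias, every gate is 0.5,
# so the fused vector is the element-wise average of the two streams.
visual = [0.9, 0.1, 0.4]
semantic = [0.2, 0.8, 0.4]
zero_weights = [[0.0] * 6 for _ in range(3)]
fused, gates = gated_fusion(visual, semantic, zero_weights, [0.0] * 3)
```

In a trained model the weights would be learned, so the gate could lean on the visual stream for some dimensions and on the semantic stream for others; here they are fixed at zero only to make the behavior easy to check by hand.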