Published: 2025/10/23 6:25:10

Emotion recognition, leveling way up! Calibrate those feelings with the CMC model ✨

Ultra-summary: Boosting multimodal emotion recognition accuracy with the CMC model! Explained so even a gal can get it 💖

✨ Gal-Style Sparkle Points ✨

● Reading emotions from all kinds of info (text, audio, images...), isn't that amazing? 😎
● The mechanism that keeps the models from fighting and gets them sharing info nicely is genius 💖
● An era where AI actually understands our feelings? That's seriously the future, right? 🤩

Now for the detailed explanation~!


Calibrating Multimodal Consensus for Emotion Recognition

Guowei Zhong / Junjie Li / Huaiyu Zhu / Ruohong Huan / Yun Pan

In recent years, Multimodal Emotion Recognition (MER) has made substantial progress. Nevertheless, most existing approaches neglect the semantic inconsistencies that may arise across modalities, such as conflicting emotional cues between text and visual inputs. Moreover, current methods are often dominated by the text modality due to its strong representational capacity, which can compromise recognition accuracy. To address these challenges, we propose a model termed Calibrated Multimodal Consensus (CMC). CMC introduces a Pseudo Label Generation Module (PLGM) to produce pseudo unimodal labels, enabling unimodal pretraining in a self-supervised fashion. It then employs a Parameter-free Fusion Module (PFM) and a Multimodal Consensus Router (MCR) for multimodal fine-tuning, thereby mitigating text dominance and guiding the fusion process toward a more reliable consensus. Experimental results demonstrate that CMC achieves performance on par with or superior to state-of-the-art methods across four datasets (CH-SIMS, CH-SIMS v2, CMU-MOSI, and CMU-MOSEI), and exhibits notable advantages in scenarios with semantic inconsistencies on CH-SIMS and CH-SIMS v2. The implementation of this work is publicly accessible at https://github.com/gw-zhong/CMC.
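The abstract describes the architecture only at a high level, so here is a minimal, hypothetical sketch of how the Parameter-free Fusion Module and the Multimodal Consensus Router might combine unimodal branches during fine-tuning. Everything below (the feature dimensions, the agreement-based routing weights, the regression-style heads) is an illustrative assumption, not the authors' code; the real implementation is in the GitHub repository linked above.

```python
# Hypothetical sketch of a CMC-style forward pass (assumptions, not the
# paper's implementation). Unimodal encoders/heads are assumed to have
# been pretrained in stage one with PLGM pseudo unimodal labels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CMCSketch(nn.Module):
    def __init__(self, dims=None, embed_dim=128):
        super().__init__()
        # Illustrative input dimensions per modality (assumed values).
        dims = dims or {"text": 768, "audio": 74, "vision": 35}
        self.encoders = nn.ModuleDict(
            {m: nn.Linear(d, embed_dim) for m, d in dims.items()})
        self.heads = nn.ModuleDict(
            {m: nn.Linear(embed_dim, 1) for m in dims})
        self.fusion_head = nn.Linear(embed_dim, 1)

    def forward(self, inputs):
        feats = {m: torch.tanh(self.encoders[m](x)) for m, x in inputs.items()}
        preds = {m: self.heads[m](h) for m, h in feats.items()}

        # Multimodal Consensus Router (assumed form): weight each modality
        # by how closely its prediction agrees with the average prediction,
        # so no single modality (e.g. text) can dominate the fusion.
        stacked = torch.cat(list(preds.values()), dim=-1)   # (B, 3)
        consensus = stacked.mean(dim=-1, keepdim=True)      # (B, 1)
        agreement = -(stacked - consensus).abs()            # closer => larger
        weights = F.softmax(agreement, dim=-1)              # (B, 3)

        # Parameter-free Fusion Module (assumed form): a weighted average
        # of unimodal embeddings, with no learnable fusion parameters.
        hs = torch.stack(list(feats.values()), dim=1)       # (B, 3, D)
        fused = (weights.unsqueeze(-1) * hs).sum(dim=1)     # (B, D)
        return self.fusion_head(fused), preds

model = CMCSketch()
batch = {m: torch.randn(4, d)
         for m, d in {"text": 768, "audio": 74, "vision": 35}.items()}
fused_pred, unimodal_preds = model(batch)
print(fused_pred.shape)  # torch.Size([4, 1])
```

The key design idea the abstract emphasizes, rendered here as a softmax over per-modality agreement scores, is that fusion weights come from cross-modal consensus rather than from learned parameters, which is what lets the router down-weight a dominant but semantically inconsistent modality.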

cs / cs.CV / cs.CL / cs.LG / cs.MM