Ultra-short summary: Facial expressions matter too! It reads emotions for way more natural sign language translation~!
✨ Gal-Style Sparkle Points ✨ ● Boosts sign language's facial-expression game big time! Expressions matter, y'know 💖 ● Emotion becomes the star of translation! An emotion encoder, isn't that amazing? 😍 ● IT companies will be thrilled too! Nothing but business opportunities ahead 🎶
Detailed Explanation — Background: Sign language isn't just hands, you know! Facial expressions (NMS) are super important too. Conventional translation only looked at hand movements (MS), so the emotion got lost 💦 But the same movement can mean totally different things depending on the face, right? Like, "Wait, seriously?" 😂
Method: They built an emotion-reading AI, the "emotion encoder"! It catches shifts in facial expression and feeds them into the sign language translation ✨ They call emotion the "anchor" of translation, which sounds kinda cool, right? 😎 They also use a large language model (LLM) to aim for natural-sounding translations!
Sign Language Translation (SLT) is a complex cross-modal task requiring the integration of Manual Signals (MS) and Non-Manual Signals (NMS). While recent gloss-free SLT methods have made strides in translating manual gestures, they frequently overlook the semantic importance of facial expressions, resulting in ambiguity when distinct concepts share identical manual articulations. To address this, we present **EASLT** (**E**motion-**A**ware **S**ign **L**anguage **T**ranslation), a framework that treats facial affect not as auxiliary information, but as a robust semantic anchor. Unlike methods that relegate facial expressions to a secondary role, EASLT incorporates a dedicated emotional encoder to capture continuous affective dynamics. These representations are integrated via a novel *Emotion-Aware Fusion* (EAF) module, which adaptively recalibrates spatio-temporal sign features based on affective context to resolve semantic ambiguities. Extensive evaluations on the PHOENIX14T and CSL-Daily benchmarks demonstrate that EASLT achieves leading performance among gloss-free methods, with BLEU-4 scores of 26.15 and 22.80, and BLEURT scores of 61.0 and 57.8, respectively. Ablation studies confirm that explicitly modeling emotion effectively decouples affective semantics from manual dynamics, significantly enhancing translation fidelity. Code is available at https://github.com/TuGuobin/EASLT.
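The abstract does not spell out the fusion mechanism, so the following is a minimal, hypothetical sketch of what "adaptively recalibrating spatio-temporal sign features based on affective context" could look like in PyTorch. The FiLM-style gating, the `EmotionAwareFusion` class name, and the feature dimensions are illustrative assumptions, not the authors' actual design; the real implementation lives in the linked repository.

```python
# Hypothetical sketch of an emotion-conditioned recalibration module.
# Assumptions (not from the paper): FiLM-style scale/shift gating,
# sign_dim=512, emo_dim=256, per-frame emotion features.
import torch
import torch.nn as nn

class EmotionAwareFusion(nn.Module):
    def __init__(self, sign_dim: int = 512, emo_dim: int = 256):
        super().__init__()
        # Project emotion features into per-channel scale and shift terms.
        self.to_scale = nn.Linear(emo_dim, sign_dim)
        self.to_shift = nn.Linear(emo_dim, sign_dim)
        self.norm = nn.LayerNorm(sign_dim)

    def forward(self, sign_feats: torch.Tensor, emo_feats: torch.Tensor) -> torch.Tensor:
        # sign_feats: (batch, time, sign_dim) spatio-temporal sign features
        # emo_feats:  (batch, time, emo_dim) continuous affective dynamics
        scale = torch.sigmoid(self.to_scale(emo_feats))  # bounded gate in (0, 1)
        shift = self.to_shift(emo_feats)
        # Recalibrate sign features with affective context; the residual
        # connection preserves manual information when emotion is uninformative.
        return sign_feats + self.norm(scale * sign_feats + shift)

# Usage: fuse per-frame sign and emotion features before a text decoder (e.g. an LLM).
fusion = EmotionAwareFusion()
sign = torch.randn(2, 100, 512)   # e.g. visual (manual) encoder output
emo = torch.randn(2, 100, 256)    # e.g. emotional encoder output
fused = fusion(sign, emo)         # (2, 100, 512)
```

In this sketch the sigmoid gate keeps the recalibration bounded and the residual path guarantees the manual signal is never fully overwritten, which matches the abstract's framing of emotion as an anchor that disambiguates, rather than replaces, the manual features.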