Published: 2025/12/17 14:33:49

Video communication that nails both energy efficiency and picture quality! The ultimate secret weapon is born☆

I. Research Overview

  1. Ultra-short summary: Streamlining video communication with semantics (meaning)! A new technique that delivers low power & high image quality!

  2. Gal-style sparkle points✨

    • Sending only the "meaning" of a video is such a fresh idea! Cutting out the waste makes it eco-friendly♪
    • It handles intense motion too! Being tested on all kinds of videos is amazing💖
    • Less delay (lag) means watching video gets way more comfortable — how great is that!
  3. Detailed explanation

    • Background: Video communication carries tons of data and burns through energy… The conventional approach (sending everything at the pixel level) wastes a lot and clogs up the network💦
    • Method: It adopts "semantic communication," which breaks a video down by meaning before sending! The new method (PENME) efficiently transmits only the information that's actually needed!
    • Result: Less data and lower power consumption! Plus it achieves high image quality & low latency — absolutely divine✨
    • Significance (the killer♡ point): Usable for video streaming services, VR/AR, telemedicine, and more! The IT world is about to level up big time!

Read the rest in the 「らくらく論文」 app

GenAI-enabled Residual Motion Estimation for Energy-Efficient Semantic Video Communication

Shavbo Salehi / Pedro Enrique Iturria-Rivera / Medhat Elsayed / Majid Bavand / Yigit Ozcan / Melike Erol-Kantarci

Semantic communication addresses the limitations of the Shannon paradigm by focusing on transmitting meaning rather than exact representations, thereby reducing unnecessary resource consumption. This is particularly beneficial for video, which dominates network traffic and demands high bandwidth and power, making semantic approaches ideal for conserving resources while maintaining quality. In this paper, we propose a Predictability-aware and Entropy-adaptive Neural Motion Estimation (PENME) method to address challenges related to high latency, high bitrate, and power consumption in video transmission. PENME makes per-frame decisions to select a residual motion extraction model (convolutional neural network, vision transformer, or optical flow) using a five-step policy based on motion strength, global motion consistency, peak sharpness, heterogeneity, and residual error. The residual motions are then transmitted to the receiver, where the frames are reconstructed via motion-compensated updates. Next, a selective diffusion-based refinement, the Latent Consistency Model (LCM-4), is applied to frames that trigger refinement due to low predictability or large residuals, while predictable frames skip refinement. PENME also allocates radio resource blocks with awareness of residual motion and channel state, reducing power consumption and bandwidth usage while maintaining high semantic similarity. Our simulation results on the Vimeo90K dataset demonstrate that the proposed PENME method handles various types of video, outperforming traditional communication, hybrid, and adaptive bitrate semantic communication techniques, achieving 40% lower latency, 90% less transmitted data, and 35% higher throughput. For semantic communication metrics, PENME improves PSNR by about 40%, increases MS-SSIM by roughly 19%, and reduces LPIPS by nearly 35%, compared with the baseline methods.
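The abstract's per-frame logic can be sketched as a simple decision policy. This is a minimal illustrative sketch, not the authors' implementation: the metric names come from the abstract, but all thresholds, the check ordering, and the helper names (`FrameStats`, `select_motion_model`, `needs_refinement`) are hypothetical assumptions.

```python
# Hypothetical sketch of PENME's per-frame model selection and
# refinement trigger. Metric names follow the abstract; every
# threshold value below is an illustrative assumption.

from dataclasses import dataclass


@dataclass
class FrameStats:
    motion_strength: float     # overall magnitude of inter-frame motion
    global_consistency: float  # how well local motion agrees with a global model
    peak_sharpness: float      # how well-localized the dominant motion is
    heterogeneity: float       # spatial variability of motion across the frame
    residual_error: float      # error remaining after motion compensation


def select_motion_model(s: FrameStats) -> str:
    """Five-step policy: test each metric in turn, pick an extractor."""
    if s.motion_strength < 0.1:
        return "skip"          # near-static frame: reuse previous motion
    if s.global_consistency > 0.8:
        return "optical_flow"  # coherent global motion: flow is cheap and accurate
    if s.peak_sharpness > 0.6:
        return "cnn"           # well-localized motion: lightweight CNN suffices
    if s.heterogeneity > 0.5 or s.residual_error > 0.3:
        return "vit"           # complex, heterogeneous motion: vision transformer
    return "cnn"


def needs_refinement(s: FrameStats, predictability: float) -> bool:
    """Diffusion-based (LCM-4) refinement only for hard frames;
    predictable, low-residual frames skip it."""
    return predictability < 0.5 or s.residual_error > 0.3
```

In this sketch, cheaper extractors are tried first and the transformer is reserved for frames the simpler models handle poorly, mirroring the paper's goal of spending compute (and radio resources) only where the residual motion demands it.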

cs / cs.NI