The Ultimate! Minimizing Neural Network Width ✨

Published: 2025/12/25 18:34:46


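Title & Super-Short Summary: Minimal-width neural nets 🚀 Changing the future of IT companies!
Gyaru-style sparkle points ✨ ● Minimal width but top-tier expressive power, just like a makeup product with a killer face-slimming effect 💄! ● An activation function called Leaky ReLU takes the stage! Even the name is cute 🩷 ● It could be applied to edge AI and IoT, and honestly it feels like nothing but the future 💖
Detailed explanation
- Background: Neural nets (brain imitation 🧠) are amazing, but when they get too big the computation gets heavy, right? 🤔 This research asks how narrow a net can get while still being able to approximate any target function. Cut the waste, keep the smarts!
- Method: Squeeze the network's "width" (the number of neurons per layer) as far as it will go, and prove with explicit constructions that leaky ReLU-type activation functions can still approximate all sorts of functions! 📐 This is a theory paper, so the check on minimal width is done with proofs, not experiments!
- Results: Even at minimal width, it totally works! ✨ With two leaky ReLUs, width $\max\{d_x, d_y\}$ is enough for $L^p$ approximation, and a single leaky ReLU gets by with $\max\{2, d_x, d_y\}$. Narrower nets might cut compute costs and maybe even run on phones 📱! Expressive power stays universal, which is seriously godlike!
- Significance (this is the crazy ♡ point): For IT companies, lighter models could mean AI running on all kinds of devices! 💸 Cost cuts and real-time processing might be possible too! It feels like new services are about to be born...!
Real-world use ideas 💡
- An image recognition app that runs smoothly on your phone 📸!
- AI chatbots that get smarter and make chatting smoother! 🗣️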


New advances in universal approximation with neural networks of minimal width

Dennis Rochau / Robin Chan / Hanno Gottschalk

We prove several universal approximation results at minimal or near-minimal width for approximation of $L^p(\mathbb{R}^{d_x}, \mathbb{R}^{d_y})$ and $C^0(\mathbb{R}^{d_x}, \mathbb{R}^{d_y})$ on compact sets. Our approach uses a unified coding scheme that yields explicit constructions relying only on standard analytic tools. We show that feedforward neural networks with two leaky ReLU activations $\sigma_\alpha$, $\sigma_{-\alpha}$ achieve the optimal width $\max\{d_x, d_y\}$ for $L^p$ approximation, while a single leaky ReLU $\sigma_\alpha$ achieves width $\max\{2, d_x, d_y\}$, providing an alternative proof of the results of Cai et al. (2023). By generalizing to stepped leaky ReLU activations, we extend these results to uniform approximation of continuous functions while identifying sets of activation functions compatible with gradient-based training. Since our constructions pass through an intermediate dimension of one, they imply that autoencoders with a one-dimensional feature space are universal approximators. We further show that squashable activations combined with FLOOR achieve width $\max\{3, d_x, d_y\}$ for uniform approximation. We also establish a lower bound of $\max\{d_x, d_y\} + 1$ for networks when all activations are continuous and monotone and $d_y \leq 2d_x$. Moreover, we extend our results to invertible LU-decomposable networks, proving distributional universal approximation for LU-Net normalizing flows and providing a constructive proof of the classical theorem of Brenier and Gangbo on $L^p$ approximation by diffeomorphisms.
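To make the width statements in the abstract concrete, here is a minimal Python sketch. It assumes the standard definition of leaky ReLU (identity for non-negative inputs, slope $\alpha$ for negative inputs), which the abstract does not spell out. The names `leaky_relu`, `stated_widths`, and `forward` are hypothetical helpers for illustration, not code from the paper. The sketch only tabulates the widths quoted in the abstract and runs a plain forward pass; it does not reproduce the paper's coding-scheme construction.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# A layer is (weight matrix W, bias vector b); this representation is an assumption.
Layer = Tuple[List[List[float]], List[float]]


def leaky_relu(x: float, alpha: float) -> float:
    """sigma_alpha(x): x for x >= 0, alpha * x otherwise (assumed standard definition)."""
    return x if x >= 0 else alpha * x


def stated_widths(d_x: int, d_y: int) -> Dict[str, Optional[int]]:
    """Tabulate the widths quoted in the abstract for input dim d_x and output dim d_y."""
    return {
        # L^p approximation with two activations sigma_alpha and sigma_{-alpha}
        "lp_two_leaky_relus": max(d_x, d_y),
        # L^p approximation with a single leaky ReLU
        "lp_single_leaky_relu": max(2, d_x, d_y),
        # uniform approximation with squashable activations combined with FLOOR
        "uniform_squashable_plus_floor": max(3, d_x, d_y),
        # lower bound for continuous, monotone activations; stated only for d_y <= 2 * d_x
        "lower_bound_continuous_monotone": max(d_x, d_y) + 1 if d_y <= 2 * d_x else None,
    }


def forward(x: Sequence[float], layers: Sequence[Layer], alpha: float) -> List[float]:
    """Run a feedforward pass: affine map followed by elementwise leaky ReLU, per layer."""
    h = list(x)
    for W, b in layers:
        z = [sum(w * v for w, v in zip(row, h)) + b_i for row, b_i in zip(W, b)]
        h = [leaky_relu(v, alpha) for v in z]
    return h


if __name__ == "__main__":
    print(stated_widths(d_x=3, d_y=2))
    identity_layer: Layer = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
    print(forward([1.0, -2.0], [identity_layer], alpha=0.5))
```

Under this definition and for $\alpha > 0$, $\sigma_\alpha$ is invertible with inverse $\sigma_{1/\alpha}$, so a narrow layer of this kind does not lose information through the activation itself.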

cs / cs.NE / math.FA
