Explaining how deep learning learns, through thermodynamics! ✨

Published: 2026/1/2 21:48:47

I. Research Overview

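Research goals
- Understand the training of deep learning through thermodynamics
- Focus on the Deep Linear Network (DLN)
- Propose metrics for evaluating the efficiency and stability of learning
Research background
- Deep learning is impressive, but much about it is still a mystery
- What are overparametrization and implicit bias?
- Clarify how the structure of the DLN relates to learning
- Understand the behavior of learning through thermodynamics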

II. Research Details

An entropy formula for the Deep Linear Network

Govind Menon / Tianmin Yu

We study the Riemannian geometry of the Deep Linear Network (DLN) as a foundation for a thermodynamic description of the learning process. The main tools are the use of group actions to analyze overparametrization and the use of Riemannian submersion from the space of parameters to the space of observables. The foliation of the balanced manifold in the parameter space by group orbits is used to define and compute a Boltzmann entropy. We also show that the Riemannian geometry on the space of observables defined in [2] is obtained by Riemannian submersion of the balanced manifold. The main technical step is an explicit construction of an orthonormal basis for the tangent space of the balanced manifold using the theory of Jacobi matrices.
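To make "overparametrization" and "group orbits" concrete, here is a minimal NumPy sketch. It is not the paper's construction. It assumes square, full-rank factors and the balanced condition as it is commonly stated in the DLN literature, W_{p+1}^T W_{p+1} = W_p W_p^T. The helper names (`balanced_factors`, `end_to_end`, `balance_defect`) are hypothetical. The sketch builds two different balanced parameter points that map to the same observable matrix W, which is the kind of redundancy that the group-orbit foliation describes.

```python
import numpy as np


def balanced_factors(W, depth, rng):
    """Return [W_1, ..., W_depth] with W_depth @ ... @ W_1 == W (up to rounding).

    Assumption: W is square. Each factor is Q_p @ S**(1/depth) @ Q_{p-1}.T,
    where W = U S V^T, Q_0 = V, Q_depth = U, and the intermediate Q_p are
    random orthogonal matrices (the freedom left by overparametrization).
    """
    U, s, Vt = np.linalg.svd(W)
    root = np.diag(s ** (1.0 / depth))
    d = W.shape[0]
    Q = [Vt.T]
    for _ in range(depth - 1):
        q, _r = np.linalg.qr(rng.standard_normal((d, d)))
        Q.append(q)
    Q.append(U)
    return [Q[p + 1] @ root @ Q[p].T for p in range(depth)]


def end_to_end(factors):
    """Multiply the factors in network order: W_depth @ ... @ W_1."""
    W = np.eye(factors[0].shape[1])
    for F in factors:
        W = F @ W
    return W


def balance_defect(factors):
    """Largest Frobenius norm of W_{p+1}^T W_{p+1} - W_p W_p^T over all p."""
    return max(
        np.linalg.norm(factors[p + 1].T @ factors[p + 1] - factors[p] @ factors[p].T)
        for p in range(len(factors) - 1)
    )


rng = np.random.default_rng(0)
W = rng.standard_normal((3, 3))      # the observable (end-to-end matrix)
A = balanced_factors(W, 4, rng)      # one balanced parameter point
B = balanced_factors(W, 4, rng)      # another point with the same observable

print(np.allclose(end_to_end(A), W), np.allclose(end_to_end(B), W))
print(balance_defect(A), balance_defect(B))
```

Both `A` and `B` reproduce `W` and satisfy the balanced condition up to rounding error, yet their individual factors differ. The sketch only exhibits this redundancy; it does not compute the Boltzmann entropy, the orthonormal basis, or the Riemannian submersion described in the abstract.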

cs / cs.LG / math.DG / math.DS
