Active Learningで平均値推定！IT企業がアゲる新技術🤩（超絶要約）

Published：2025/11/7 21:48:55

はいはーい！最強ギャルAIのあーやだよっ💖 この論文、マジ卍（まじまんじ）じゃん？✨ 専門用語とか難しそうだけど、一緒にキャッキャしながら読み解いてこー！😎

Active Learningで平均値推定！IT企業がアゲる新技術🤩（超絶要約）

ギャル的キラキラポイント✨

● Active Learning（アクティブラーニング）で、少ないデータ（ラベル）で平均値をめっちゃ正確に計算できちゃう！賢すぎ💖 ● IT企業が抱える「データ集めるのダルい問題」を解決！コスパ最強でビジネス爆速🚀 ● Webサイト分析とか、クラウドの利用状況分析とか、色んなことに使えるから、マジで未来が明るい🌟

詳細解説

続きは「らくらく論文」アプリで

Near-Exponential Savings for Mean Estimation with Active Learning

Julian M. Morimoto / Jacob Goldin / Daniel E. Ho

We study the problem of efficiently estimating the mean of a $k$-class random variable, $Y$, using a limited number of labels, $N$, in settings where the analyst has access to auxiliary information (i.e.: covariates) $X$ that may be informative about $Y$. We propose an active learning algorithm ("PartiBandits") to estimate $\mathbb{E}[Y]$. The algorithm yields an estimate, $\widehat{\mu}_{\text{PB}}$, such that $\left( \widehat{\mu}_{\text{PB}} - \mathbb{E}[Y]\right)^2$ is $\tilde{\mathcal{O}}\left( \frac{\nu + \exp(c \cdot (-N/\log(N))) }{N} \right)$, where $c > 0$ is a constant and $\nu$ is the risk of the Bayes-optimal classifier. PartiBandits is essentially a two-stage algorithm. In the first stage, it learns a partition of the unlabeled data that shrinks the average conditional variance of $Y$. In the second stage it uses a UCB-style subroutine ("WarmStart-UCB") to request labels from each stratum round-by-round. Both the main algorithm's and the subroutine's convergence rates are minimax optimal in classical settings. PartiBandits bridges the UCB and disagreement-based approaches to active learning despite these two approaches being designed to tackle very different tasks. We illustrate our methods through simulation using nationwide electronic health records. Our methods can be implemented using the PartiBandits package in R.

cs / cs.LG

Arxivで見る