ML fairness evaluation, reborn through comparative judgment 💖

Published: 2026/1/11 3:39:45

ML fairness evaluation, reborn through comparative judgment 💖 (super-short summary)

1. Title & super summary

Evaluate fairness with comparative judgments! Lower cost & high accuracy ✨

2. Gyaru-style sparkle points ✨

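● No labeling (that is, rating each item) needed, you just compare! The workload drops like crazy! 🥺
● Thanks to a technique from psychology, the cognitive burden on humans gets lighter too 🎵
● It works not only for binary classification but also for regression problems (the predict-a-number kind), so isn't that the best? 😍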


Comparative Separation: Evaluating Separation on Comparative Judgment Test Data

Xiaoyin Xi / Neeku Capak / Kate Stockwell / Zhe Yu

This research seeks to benefit the software engineering community by proposing comparative separation, a novel group fairness notion for evaluating the fairness of machine learning software on comparative judgment test data. Fairness issues have attracted growing attention as machine learning software is increasingly used for high-stakes and high-risk decisions. It is the responsibility of all software developers to make their software accountable by ensuring that the machine learning software does not perform differently on different sensitive groups, that is, by satisfying the separation criterion. However, evaluating separation requires a ground truth label for each test data point. This motivates our work on analyzing whether separation can be evaluated on comparative judgment test data. Instead of asking humans to provide ratings or categorical labels for each test data point, comparative judgments are made between pairs of data points, such as "A is better than B". According to the law of comparative judgment, providing such comparative judgments imposes a lower cognitive burden on humans than providing ratings or categorical labels. This work first defines the novel fairness notion of comparative separation on comparative judgment test data, together with the metrics to evaluate it. Then, both theoretically and empirically, we show that in binary classification problems, comparative separation is equivalent to separation. Lastly, we analyze the number of test data points and test data pairs required to achieve the same level of statistical power in the evaluation of separation and comparative separation, respectively. This work is the first to explore fairness evaluation on comparative judgment test data. It shows the feasibility and the practical benefits of using comparative judgment test data for model evaluations.
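To make the two kinds of test data concrete, here is a minimal Python sketch. The first function measures separation in the usual binary-classification sense: the gaps in true positive rate and false positive rate between two sensitive groups, which requires a ground truth label for every point. The second function is only a hypothetical pairwise analogue, written for illustration: it assumes a judgment "A is better than B" whenever A's label is higher than B's, and compares how often the model agrees with that judgment depending on which group A comes from. The function names, the toy data, and the pairwise metric itself are assumptions, not the definitions or metrics from the paper.

```python
from itertools import permutations


def separation_gaps(y_true, y_pred, group):
    """Return (TPR gap, FPR gap) between group 0 and group 1."""
    def rate(g, label):
        # Estimate P(pred = 1 | true = label, group = g).
        preds = [p for t, p, s in zip(y_true, y_pred, group)
                 if s == g and t == label]
        return sum(preds) / len(preds)

    return rate(0, 1) - rate(1, 1), rate(0, 0) - rate(1, 0)


def pairwise_agreement_gap(y_true, y_pred, group):
    """Hypothetical pairwise analogue (an assumption, not the paper's metric).

    A pair (a, b) is treated as judged "a is better than b" when
    y_true[a] > y_true[b]. The model agrees when y_pred[a] > y_pred[b].
    """
    def agree(ga, gb):
        hits = total = 0
        for a, b in permutations(range(len(y_true)), 2):
            if group[a] == ga and group[b] == gb and y_true[a] > y_true[b]:
                total += 1
                hits += y_pred[a] > y_pred[b]
        return hits / total

    # Agreement when the better item is in group 0 vs. in group 1.
    return agree(0, 1) - agree(1, 0)


# Toy data: four points per sensitive group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

print(separation_gaps(y_true, y_pred, group))         # (-0.5, -0.5)
print(pairwise_agreement_gap(y_true, y_pred, group))  # -0.75
```

In this toy example the model is less sensitive on group 0 (lower TPR and lower FPR), and the pairwise gap is also nonzero, but the sketch makes no claim about the formal equivalence established in the paper.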

cs / cs.SE / cs.LG
