Published:2026/1/7 2:01:27

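Long-essay scoring gets blazing fast and super cute with AI! ✨

Super summary: This research studied how to get AI to score long essays!

● Having AI score long essays for you is super convenient, right? 😍
● They combined several AI models so the scoring comes out more accurate! 😎
● IT companies can put it to use in education and content creation! 💖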


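Detailed explanation

Background: Scoring long essays is rough on teachers, right? 😫 It eats up a ton of time and effort... If AI could take that over, things would get way easier! ✨ This research developed the technology to make that happen 💖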


Empirical Comparison of Encoder-Based Language Models and Feature-Based Supervised Machine Learning Approaches to Automated Scoring of Long Essays

Kuo Wang (Southern Methodist University) / Haowei Hua (Princeton University) / Pengfei Yan (University of Maryland) / Hong Jiao (University of Maryland) / Dan Song (University of Iowa)

Long context may impose challenges for encoder-only language models in text processing, specifically for automated scoring of essays. This study trained several commonly used encoder-based language models for automated scoring of long essays. The performance of these trained models was evaluated and compared with the ensemble models built upon the base language models with a token limit of 512. The models tested include BERT-based models (BERT, RoBERTa, DistilBERT, and DeBERTa), ensemble models integrating embeddings from multiple encoder models, and ensemble models of feature-based supervised machine learning models, including Gradient-Boosted Decision Trees, eXtreme Gradient Boosting, and Light Gradient Boosting Machine. We trained, validated, and tested each model on a dataset of 17,307 essays, with an 80%/10%/10% split, and evaluated model performance using Quadratic Weighted Kappa. This study revealed that an ensemble-of-embeddings model, which combines multiple pre-trained language model representations and uses a gradient-boosting classifier as the ensemble model, significantly outperforms individual language models at scoring long essays.
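
The abstract names Quadratic Weighted Kappa (QWK) as the evaluation metric but does not spell it out. As a rough illustration, QWK can be written as 1 minus the ratio of observed to chance-expected disagreement, where a disagreement between scores i and j is weighted by (i - j)^2 / (K - 1)^2 for K score categories. The sketch below is a minimal pure-Python version of that standard definition, not the authors' evaluation code. The function name `quadratic_weighted_kappa` and the `min_score` / `max_score` parameters are assumptions made for this example, since the abstract does not state the score range of the dataset.

```python
from collections import Counter


def quadratic_weighted_kappa(y_true, y_pred, min_score, max_score):
    """Cohen's kappa with quadratic disagreement weights.

    y_true, y_pred: equal-length sequences of integer scores.
    min_score, max_score: inclusive score range (assumed, not from the paper).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    k = max_score - min_score + 1  # number of score categories
    if k < 2:
        raise ValueError("need at least two score categories")

    n = len(y_true)
    observed = Counter(zip(y_true, y_pred))  # joint counts of (true, predicted)
    true_hist = Counter(y_true)              # marginal counts of true scores
    pred_hist = Counter(y_pred)              # marginal counts of predicted scores

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[(i, j)]
            # Count expected by chance if the two sets of scores were independent.
            weighted_expected += weight * true_hist[i] * pred_hist[j] / n

    if weighted_expected == 0.0:
        # Both sequences use one and the same category: treat as full agreement.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Illustrative scores only; these are not data from the study.
    human = [1, 2, 3, 4, 4, 2]
    model = [1, 2, 4, 4, 3, 2]
    print(quadratic_weighted_kappa(human, model, min_score=1, max_score=4))
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement. The quadratic weights penalize a prediction that is two points off four times as heavily as one that is one point off, which suits ordinal essay scores.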

cs / cs.CL / cs.LG