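Super-quick summary: a study on getting AI to grade long essays!
● AI that grades long essays for you? Super handy, right? 😍
● They combined a bunch of different AI models so the scoring comes out more accurate! 😎
● IT companies could put this to work in education and content creation! 💖
Detailed explanation
Background: Grading long essays is rough on teachers, right? 😫 It takes forever and it's hard work… If AI could handle it, life would get so much easier! ✨ This study developed the technology to do exactly that 💖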
Long context may pose challenges for encoder-only language models in text processing, specifically for automated scoring of essays. This study trained several commonly used encoder-based language models for automated scoring of long essays. The performance of these trained models was evaluated and compared with that of ensemble models built upon base language models with a token limit of 512. The models tested include BERT-based models (BERT, RoBERTa, DistilBERT, and DeBERTa), ensemble models integrating embeddings from multiple encoder models, and ensemble models of feature-based supervised machine learning models, including Gradient-Boosted Decision Trees, eXtreme Gradient Boosting, and Light Gradient Boosting Machine. We trained, validated, and tested each model on a dataset of 17,307 essays, with an 80%/10%/10% split, and evaluated model performance using Quadratic Weighted Kappa. This study revealed that an ensemble-of-embeddings model, which combines multiple pre-trained language model representations and uses a gradient-boosting classifier as the ensemble model, significantly outperforms individual language models at scoring long essays.
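For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of Quadratic Weighted Kappa in plain Python. It is not the authors' code. The function name, the `min_rating`/`max_rating` parameters, and the choice to return 1.0 when both raters use a single identical category are assumptions made for illustration. The metric compares the observed disagreement between human and model scores with the disagreement expected by chance, penalizing each disagreement by the squared distance between the two scores.

```python
def quadratic_weighted_kappa(y_true, y_pred, min_rating, max_rating):
    """Quadratic Weighted Kappa for integer ratings (illustrative sketch)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    n = max_rating - min_rating + 1  # number of rating categories
    if n < 2:
        raise ValueError("need at least two rating categories")

    # Observed confusion matrix: rows = true score, columns = predicted score.
    observed = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        observed[t - min_rating][p - min_rating] += 1

    total = len(y_true)
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    numerator = 0.0    # weighted observed disagreement
    denominator = 0.0  # weighted disagreement expected by chance
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            expected = hist_true[i] * hist_pred[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected

    if denominator == 0.0:
        # Both raters used one identical category; treated here as full
        # agreement (an assumed convention, other libraries may differ).
        return 1.0
    return 1.0 - numerator / denominator


if __name__ == "__main__":
    human = [1, 2, 3, 4, 5, 6]
    model = [1, 2, 3, 4, 5, 5]
    print(quadratic_weighted_kappa(human, model, min_rating=1, max_rating=6))
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement. The rating range of 1 to 6 in the usage example is hypothetical and should be replaced with the actual score scale of the dataset.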