Published:2026/1/2 16:10:08

Grading Handwritten Exams with LLMs! A Super Revolution for the IT Industry ✨

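1. Gal-Style Sparkly Highlights ✨

  • AI grades handwritten answers at lightning speed! Time and cost drop like crazy 💖
  • GPT-5.2 has god-tier accuracy! Maybe even better than humans 😳
  • Circuit diagrams and other drawings are fine too! It can handle even complex answers 😎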

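2. Detailed Explanation

  • Background: Grading handwritten exams in STEM subjects is such a drag, right? 💦 It takes forever, and scores varying from teacher to teacher is totally a thing!
  • Method: A multimodal LLM (an AI that understands both images and text) grades the whole handwritten answer in one go! ✨ The key point is that it doesn't depend on OCR (character recognition) accuracy!
  • Results: An LLM called GPT-5.2 turns out to be seriously impressive, grading with nearly human-level accuracy 🎉 It properly understands circuit diagrams too!
  • Significance: Huge potential to fire up the IT and education industries! Lower grading costs, better-quality education, a boost for AI education... the future is looking way too bright 💖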


Grading Handwritten Engineering Exams with Multimodal Large Language Models

Janez Perš / Jon Muhovič / Andrej Košir / Boštjan Murovec

Handwritten STEM exams capture open-ended reasoning and diagrams, but manual grading is slow and difficult to scale. We present an end-to-end workflow for grading scanned handwritten engineering quizzes with multimodal large language models (LLMs) that preserves the standard exam process (A4 paper, unconstrained student handwriting). The lecturer provides only a handwritten reference solution (100%) and a short set of grading rules; the reference is converted into a text-only summary that conditions grading without exposing the reference scan. Reliability is achieved through a multi-stage design with a format/presence check to prevent grading blank answers, an ensemble of independent graders, supervisor aggregation, and rigid templates with deterministic validation to produce auditable, machine-parseable reports. We evaluate the frozen pipeline in a clean-room protocol on a held-out real course quiz in Slovenian, including hand-drawn circuit schematics. With state-of-the-art backends (GPT-5.2 and Gemini-3 Pro), the full pipeline achieves $\approx$8-point mean absolute difference to lecturer grades with low bias and an estimated manual-review trigger rate of $\approx$17% at $D_{\max}=40$. Ablations show that trivial prompting and removing the reference solution substantially degrade accuracy and introduce systematic over-grading, confirming that structured prompting and reference grounding are essential.
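The aggregation and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that the supervisor takes the median of the ensemble grades, that a blank answer is assigned 0, and that $D_{\max}$ is a threshold on the spread between the highest and lowest grader score; the abstract does not define any of these. The names `Verdict`, `aggregate`, `mean_abs_diff`, and `bias` are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass
class Verdict:
    score: float        # aggregated grade in percent (0-100)
    needs_review: bool  # True when a human should re-check the answer
    reason: str         # short machine-parseable explanation


def aggregate(grades, answer_present=True, d_max=40.0):
    """Combine independent grader scores into one verdict.

    Assumptions (not taken from the paper): the median is the aggregate,
    a blank answer scores 0, and d_max bounds the ensemble spread
    (max - min) before manual review is triggered.
    """
    if not answer_present:
        # Presence check failed: do not grade a blank answer.
        return Verdict(0.0, False, "blank answer")
    # Deterministic validation: every score must be a percentage.
    valid = [g for g in grades if isinstance(g, (int, float)) and 0 <= g <= 100]
    if not valid or len(valid) < len(grades):
        return Verdict(0.0, True, "invalid grader output")
    spread = max(valid) - min(valid)
    if spread > d_max:
        return Verdict(median(valid), True, "grader disagreement")
    return Verdict(median(valid), False, "ok")


def mean_abs_diff(auto, lecturer):
    """Mean absolute difference between automatic and lecturer grades."""
    return mean(abs(a - l) for a, l in zip(auto, lecturer))


def bias(auto, lecturer):
    """Mean signed difference; positive values indicate over-grading."""
    return mean(a - l for a, l in zip(auto, lecturer))
```

For example, `aggregate([70, 80, 90])` accepts the median grade of 80 because the spread of 20 is within the threshold, whereas `aggregate([20, 80, 90])` flags the answer for manual review because the spread of 70 exceeds it.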

cs / cs.CV