Stone inscription character recognition gets a mega boost! Binarizing with context in mind ☆

Published: 2026/1/7 5:37:29

Reading stone inscription text with AI! ✨


Gal-style sparkle points ✨

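● A technique for cleanly extracting characters from images of stone inscriptions 💖
● It apparently adapts to the size and layout of the characters and picks the best way to recognize them! 😳
● The AI ignores the background noise and picks out just the characters 👀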

Detailed explanation

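Background: Stone inscriptions are super important for passing down history, but when you snap a photo the characters can be hard to see, right? 😢 The surface might be all worn down, or the lighting might leave the contrast weak… 💦 Apparently conventional AI just couldn't recognize the characters properly!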

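Method: That's where this research comes in! ✨ It combines a "patching strategy" that takes the information around each character into account with an AI model called "Attention U-Net"! 🧐 The patching strategy analyzes the size and layout of the characters and extracts them in the best possible way!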
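To get a feel for the "match the patch to the character size" idea, here's a tiny Python sketch! It is not the paper's actual algorithm, which this summary doesn't spell out. The function names `choose_patch_size` and `tile_origins`, the "about 4 characters per patch" rule, the size limits, and the 25% overlap are all made-up assumptions for illustration 💡

```python
from statistics import median


def choose_patch_size(char_heights, chars_per_patch=4, min_size=64, max_size=512):
    """Pick a square patch side so roughly `chars_per_patch` characters fit across it.

    All parameters are hypothetical defaults, not values from the paper.
    """
    if not char_heights:
        return min_size  # no size estimate available: fall back to the smallest patch
    side = int(median(char_heights) * chars_per_patch)
    return max(min_size, min(max_size, side))  # clamp to [min_size, max_size]


def tile_origins(width, height, patch, overlap=0.25):
    """Return top-left (x, y) corners of overlapping patches covering the image."""
    stride = max(1, int(patch * (1 - overlap)))

    def axis(length):
        if length <= patch:
            return [0]  # image is smaller than one patch along this axis
        starts = list(range(0, length - patch, stride))
        starts.append(length - patch)  # last patch sits flush with the border
        return starts

    return [(x, y) for y in axis(height) for x in axis(width)]


if __name__ == "__main__":
    # Estimated character heights in pixels (made-up numbers).
    patch = choose_patch_size([30, 32, 34])
    print(patch, tile_origins(224, 128, patch))
```

With character heights around 32 px, this sketch would pick 128 px patches, so each patch holds a handful of characters plus their surroundings. Small characters get small patches and big characters get big ones, which is the general flavor of a size-aware patching scheme.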


Unveiling Text in Challenging Stone Inscriptions: A Character-Context-Aware Patching Strategy for Binarization

Pratyush Jena / Amal Joseph / Arnav Sharma / Ravi Kiran Sarvadevabhatla

Binarization is a popular first step towards text extraction in historical artifacts. Stone inscription images pose severe challenges for binarization due to poor contrast between etched characters and the stone background, non-uniform surface degradation, distracting artifacts, and highly variable text density and layouts. These conditions frequently cause existing binarization techniques to fail or struggle to isolate coherent character regions. Many approaches sub-divide the image into patches to improve text fragment resolution and binarization performance. With this in mind, we present a robust and adaptive patching strategy to binarize challenging Indic inscriptions. The patches from our approach are used to train an Attention U-Net for binarization. The attention mechanism allows the model to focus on subtle structural cues, while our dynamic sampling and patch selection method ensures that the model learns to overcome surface noise and layout irregularities. We also introduce a carefully annotated, pixel-precise dataset of Indic stone inscriptions at the character-fragment level. We demonstrate that our novel patching mechanism significantly boosts binarization performance across classical and deep learning baselines. Despite training only on a single-script Indic dataset, our model exhibits strong zero-shot generalization to other Indic and non-Indic scripts, highlighting its robustness and script-agnostic generalization capabilities. By producing clean, structured representations of inscription content, our method lays the foundation for downstream tasks such as script identification, OCR, and historical text analysis. Project page: https://ihdia.iiit.ac.in/shilalekhya-binarization/
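The abstract notes that many approaches split the image into patches before binarizing. A minimal pure-Python sketch of why that can help is given below. It swaps in classical Otsu thresholding on fixed square tiles, not the paper's Attention U-Net or its adaptive patching, and the toy image, the function names `otsu_threshold` and `binarize_patchwise`, and the convention that text is darker than its local background are all assumptions made for illustration.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold t for 8-bit grey values (pixels <= t form one class)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2  # between-class variance (unnormalized)
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize_patchwise(image, patch):
    """Threshold each patch x patch tile with its own Otsu threshold (1 = dark pixel)."""
    h, w = len(image), len(image[0])
    mask = [[0] * w for _ in range(h)]
    for y0 in range(0, h, patch):
        for x0 in range(0, w, patch):
            rows = range(y0, min(y0 + patch, h))
            cols = range(x0, min(x0 + patch, w))
            t = otsu_threshold([image[y][x] for y in rows for x in cols])
            for y in rows:
                for x in cols:
                    mask[y][x] = 1 if image[y][x] <= t else 0
    return mask


# Toy 4x8 "inscription": bright left half (background 200, stroke 150),
# shadowed right half (background 100, stroke 50).
image = [[200, 150, 200, 200, 100, 50, 100, 100] for _ in range(4)]

global_mask = binarize_patchwise(image, patch=8)  # one tile = global threshold
patch_mask = binarize_patchwise(image, patch=4)   # one threshold per half

print(global_mask[0])  # [0, 0, 0, 0, 1, 1, 1, 1]
print(patch_mask[0])   # [0, 1, 0, 0, 0, 1, 0, 0]
```

In this toy case the single global threshold marks the entire shadowed half as text and misses the stroke on the bright half, while the per-patch thresholds recover both strokes. Real inscriptions are far messier, which is where a learned model and a more careful choice of patches would come in.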

cs / cs.CV
