A Keyword Retrieval System for Historical Mongolian Document Images

Posted: 31 October 2014

Hongxi Wei (1) and Guanglai Gao (1)

(1) School of Computer Science, Inner Mongolia University, Hohhot, 010021, China

Hongxi Wei (Corresponding author)

Email: cswhx@imu.edu.cn

Guanglai Gao

Email: csggl@imu.edu.cn

Received: 14 April 2012

Revised: 8 January 2013

Accepted: 11 February 2013

Published online: 26 February 2013

Abstract

In this paper, we propose a keyword retrieval system for locating words in historical Mongolian document images. Following the word spotting approach, a collection of historical Mongolian document images is converted into a collection of word images by word segmentation, and a number of profile-based features are extracted to represent each word image. For each word image, a fixed-length feature vector is formed by taking an appropriate number of complex coefficients from the discrete Fourier transform of each profile feature. The system supports online image-to-image matching: it calculates the similarity between a query word image and each word image in the collection and returns the results ranked in descending order of similarity. The query word image can be generated at retrieval time by synthesizing a sequence of glyphs. Experimental evaluations confirm the performance of the system.
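The pipeline described in the abstract (profile features, a fixed number of DFT coefficients per profile, similarity ranking) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three per-row profiles, the number of coefficients kept (`count=4`), the 1/N scaling, the distance-based similarity, and all function names. Word images are assumed to be binary grids given as lists of rows, with 1 marking ink.

```python
import cmath
import math


def row_profiles(image):
    """Return three per-row profiles of a binary image (1 = ink).

    The choice of profiles is an assumption for illustration only.
    """
    width = len(image[0])
    projection, left, right = [], [], []
    for row in image:
        ink = [x for x, pixel in enumerate(row) if pixel]
        projection.append(float(len(ink)))
        # Rows without ink get the full width as their edge distance.
        left.append(float(ink[0]) if ink else float(width))
        right.append(float(width - 1 - ink[-1]) if ink else float(width))
    return [projection, left, right]


def dft_coefficients(signal, count):
    """First `count` DFT coefficients of `signal`, scaled by 1/len(signal)."""
    n = len(signal)
    coefficients = []
    for k in range(count):
        total = sum(
            value * cmath.exp(-2j * math.pi * k * t / n)
            for t, value in enumerate(signal)
        )
        coefficients.append(total / n)
    return coefficients


def feature_vector(image, count=4):
    """Fixed-length vector: real and imaginary parts for every profile."""
    vector = []
    for profile in row_profiles(image):
        for c in dft_coefficients(profile, count):
            vector.extend([c.real, c.imag])
    return vector


def similarity(a, b):
    """Similarity in (0, 1]; 1.0 means the two vectors are identical."""
    return 1.0 / (1.0 + math.dist(a, b))


def rank(query_image, collection, count=4):
    """Return (similarity, index) pairs in descending order of similarity."""
    query = feature_vector(query_image, count)
    scored = [
        (similarity(query, feature_vector(image, count)), index)
        for index, image in enumerate(collection)
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Because only the first `count` coefficients of each profile are kept, word images of different heights map to vectors of the same length (3 profiles × `count` coefficients × 2 parts), which is what makes direct vector comparison possible. Calling `rank(query, collection)` returns the indices of the collection images ordered from most to least similar to the query.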

Keywords: Kanjur – Word spotting – Profile features – Discrete Fourier transform – Query image synthesis

Source: see full text: http://link.springer.com/article/10.1007/s10032-013-0203-6/fulltext.html