From nobody@cs.Buffalo.EDU Mon Jul 21 02:10 EDT 1997
From: nobody@cs.Buffalo.EDU
Date: Mon, 21 Jul 1997 02:10:11 -0400 (EDT)
To: techreps@cs.Buffalo.EDU
Subject: techrep: POST request
Content-Type: text
Content-Length: 3554

Comments: the postscript file of my thesis is saved on hadar
ContactPerson: chi@dis.ricoh.com
Remote host: ppp-232.efaxinc.com
Remote ident: unknown
### Begin Citation ### Do not delete this line ###
%R 97-11
%U ~chifang/thesis.ps
%A Fang, Chi
%T Deciphering Algorithms for Degraded Document Recognition
%D July 17, 1997
%I Department of Computer Science, SUNY Buffalo
%K OCR, document recognition, pattern recognition, deciphering
%Y pattern recognition; text processing
%X The research presented in this thesis provides new solutions to two 
fundamental problems in document recognition. The first problem is 
character segmentation. Touching characters and character fragmentation 
caused by image degradation are difficult problems for current document 
recognition systems. The second problem that this thesis addresses is 
the dependence of today's document recognition systems on extensive 
font training. These two problems are shown to be the main causes of 
performance breakdown in most of today's commercial document 
recognition systems.
  Our research addresses both problems by seeking alternative 
approaches that recognize degraded documents with high accuracy and 
robust performance. Reliable performance on degraded documents is 
achieved by recognition approaches that avoid these two difficult 
problems.
  We propose to consider the computational task of document recognition 
as a process of finding the mapping between the visual patterns in the 
input document and their linguistic identities. Deciphering algorithms 
are proposed to decode the mapping by combining the visual constraints 
from the input document and the various levels of linguistic constraints 
from a language base. This deciphering approach to document recognition 
works at different levels of language granularity: the character level 
and the word level. Document recognition can therefore be achieved by 
decoding characters using character-level linguistic constraints, or by 
directly decrypting words using word-level linguistic constraints.
  A modified character-level deciphering algorithm is proposed, based on 
existing research. The modifications extend the original algorithm to 
documents that contain touching characters. They also provide a solution 
to errors in character pattern clustering.
  A word-level deciphering approach to document recognition is also proposed 
in this thesis. Word-level language constraints are used to first decrypt 
the words from a selected portion of the input text that has relatively 
more reliable word n-gram statistics. Word decryption is treated as a 
constraint satisfaction process and is implemented by a probabilistic 
relaxation algorithm. After the words in the selected portion of the 
input text are decrypted, font information is learned and then used to 
re-recognize the rest of the input text, which was not used for 
deciphering. This re-recognition is achieved by a hypothesis testing 
algorithm.
  The applicability of the proposed deciphering approaches to practical 
document recognition tasks is demonstrated by experimental results on 
scanned documents. The test documents contain image degradations 
including touching characters, character fragmentation, and individual 
character deformations.