From nobody@cs.Buffalo.EDU Mon Jul 21 02:10 EDT 1997
From: nobody@cs.Buffalo.EDU
Date: Mon, 21 Jul 1997 02:10:11 -0400 (EDT)
To: techreps@cs.Buffalo.EDU
Subject: techrep: POST request
Content-Type: text
Content-Length: 3554

Comments: the postscript file of my thesis is saved on hadar
ContactPerson: chi@dis.ricoh.com
Remote host: ppp-232.efaxinc.com
Remote ident: unknown
### Begin Citation ### Do not delete this line ###
%R 97-11
%U ~chifang/thesis.ps
%A Fang, Chi
%T Deciphering Algorithms for Degraded Document Recognition
%D July 17, 1997
%I Department of Computer Science, SUNY Buffalo
%K OCR, document recognition, pattern recognition, deciphering
%Y pattern recognition; text processing
%X The research presented in this thesis provides new solutions to two fundamental problems in document recognition. The first problem is character segmentation. Touching characters and character fragmentation caused by image degradation are difficult problems for current document recognition systems. The second problem that this thesis addresses is the dependence of today's document recognition systems on extensive font training. These two problems are shown to be the main causes of performance breakdown in most of today's commercial document recognition systems. Our research addresses both problems by seeking alternative approaches that can recognize degraded documents with high accuracy and robust performance. Reliable performance on degraded documents is achieved by recognition approaches that avoid these two difficult problems. We propose to treat the computational task of document recognition as a process of finding the mapping between the visual patterns in the input document and their linguistic identities. Deciphering algorithms are proposed to decode the mapping by combining the visual constraints from the input document with the various levels of linguistic constraints from a language base.
This deciphering approach to document recognition works at different levels of language granularity, at both the character level and the word level. Document recognition can therefore be achieved by decoding characters using character-level linguistic constraints, or by directly decrypting words using word-level linguistic constraints. A modified character-level deciphering algorithm is proposed, based on existing research. The modifications extend the original character-level deciphering algorithm to documents that contain touching characters and also provide a solution to errors in character pattern clustering. A word-level deciphering approach to document recognition is also proposed in this thesis. Word-level language constraints are first used to decrypt the words in a selected portion of the input text that has relatively reliable word n-gram statistics. Word decryption is treated as a process of constraint satisfaction and is implemented by a probabilistic relaxation algorithm. After the words in the selected portion of the input text have been decrypted, font information is learned and then used to re-recognize the rest of the input text, which was not used for deciphering. This re-recognition is achieved by a hypothesis testing algorithm. The applicability of the proposed deciphering approaches to practical document recognition tasks is demonstrated by experimental results on scanned documents. The test documents used for the experiments contain image degradations including touching characters, character fragmentation, and individual character deformations.