Comments: The submitted document is my PhD dissertation defended on March 28th.
ContactPerson: scha@cse.buffalo.edu
### Begin Citation ### Do not delete this line ###
%R 2001-05
%U /u0/csgrad/scha/ChaThesis.ps.gz
%A Cha, Sung-Hyuk
%T Use of Distance Measures in Handwriting Analysis
%D March 28, 2001
%I Department of Computer Science and Engineering, SUNY Buffalo
%X Algorithmic analysis of human handwriting has many applications,
such as on-line and off-line handwriting recognition and writer
verification.
Each of these tasks involves comparison of different samples of
handwriting. Comparing two samples of handwriting requires a distance
measure. In this dissertation, we present several distance measures,
both new and existing, that are appropriate for handwriting analysis.
They include element, histogram, probability density function,
string, and convex hull distances.
Results comparing newly defined histogram and string distance measures
with conventional measures are given.
We present several theoretical results and describe applications of
the methods to the domains of on-line and off-line character
recognition and writer verification.
The theoretical results pertain to individuality
validation.
In classification problems such as writer, face, fingerprint,
or speaker identification, the number of classes is very large
or unspecified.
To establish the inherent distinctness of the classes,
i.e., to validate individuality, we transform the many-class problem
into a dichotomy by using the ``distance'' between two samples, which
come either from the same class or from two different classes.
Based on conjectures derived from experimental observations, we
present theorems comparing polychotomy in the feature domain and
dichotomy in the distance domain from the viewpoint of tractability
vs. accuracy.
The practical application issues include efficient search, writer
identification, and discovery. First, fast nearest-neighbor algorithms
for distance measures are given. We also discuss the design and
analysis of an algorithm for writer identification for a known number
of writers, and its relationship to handwritten document image
indexing and retrieval. Finally, we describe mining a database of
writer data and features obtained from handwriting samples that are
statistically representative of the US population, both for feature
evaluation and to determine the similarity of a specific group of
people.