Comments: The submitted document is my PhD dissertation defended on March 28th.
ContactPerson: scha@cse.buffalo.edu
### Begin Citation ### Do not delete this line ###
%R 2001-05
%U /u0/csgrad/scha/ChaThesis.ps.gz
%A Cha, Sung-Hyuk
%T Use of Distance Measures in Handwriting Analysis
%D March 28, 2001
%I Department of Computer Science and Engineering, SUNY Buffalo
%X Algorithmic analysis of human handwriting has many applications, such
as <em>on-line & off-line handwriting recognition</em> and <em>writer
verification</em>.
Each of these tasks involves comparison of different samples of 
handwriting. Comparing two samples of handwriting requires distance
measures.  In this dissertation, several new and existing distance
measures appropriate for <em>handwriting analysis</em> are given.
They include element, histogram, probability density function, 
string, and convex hull distances.    
    Results comparing newly defined histogram and string distance measures 
    with conventional measures are given.  
    We present several theoretical results and describe applications of 
    the methods to the domain of <em> on-line & off-line character 
    recognition</em> and <em> writer verification</em>.  
    <br><p>   
    The theoretical results pertain to <em> individuality 
    validation</em>.  
    In classification problems such as <em>writer, face, fingerprint</em>
    or <em>speaker identification</em>, the number of classes is very large
    or unspecified.
    To establish the inherent distinctness of the classes,
    i.e., to validate individuality, we transform the many-class problem into
    a dichotomy by using the ``distance'' between two samples and deciding
    whether they belong to the same class or to two different classes.
    Based on conjectures derived from experimental observations, we
    present theorems comparing polychotomy in the feature domain and dichotomy
    in the distance domain from the viewpoint of tractability vs. accuracy.
    <br><p>   
    The practical application issues include <em> efficient search, writer 
    identification</em> and <em> discovery</em>.  
    First, fast nearest-neighbor algorithms for distance measures are 
    given.  We also discuss the design and analysis of an algorithm for <em>
    writer identification</em> for a known number of writers and its
    relationship to <em> handwritten document image indexing and 
    retrieval</em>.   
    Finally, we describe mining a database of writer data and
    features obtained from a handwriting sample that is statistically
    representative of the US population, both for feature evaluation and
    to determine the similarity of a specific group of people.