CSE 675, Spring 2000
HW #4: MORPHOLOGY AND FINITE-STATE TRANSDUCERS
(revised)
-
J&M, Ch. 3, p. 89: #3.6: Implement the Soundex algorithm by drawing
an appropriate TN.
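For checking a hand-drawn network against concrete inputs, here is a minimal Python sketch of one common simplified Soundex variant. It is an assumption, not a quotation of the textbook's statement of the exercise: keep the first letter, drop later vowels and h, w, y, map the remaining consonants to digits, collapse runs of identical digits, and pad or truncate to one letter plus three digits. The names `CODES` and `soundex` are illustrative only, and the full census version of Soundex has extra rules (for example, about the first letter's own code) that this sketch omits.

```python
# Simplified Soundex sketch (assumed variant; compare with the textbook's wording).
# CODES and soundex are hypothetical names chosen for this illustration.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return a 4-character code: first letter followed by three digits."""
    name = name.lower()
    if not name:
        return ""
    first = name[0].upper()
    # Characters with no digit code (vowels, h, w, y, punctuation) are dropped.
    digits = [CODES[ch] for ch in name[1:] if ch in CODES]
    # Collapse adjacent identical digits into one.
    collapsed = []
    for d in digits:
        if not collapsed or collapsed[-1] != d:
            collapsed.append(d)
    # Pad with zeros, then truncate to letter + three digits.
    return (first + "".join(collapsed) + "000")[:4]


if __name__ == "__main__":
    for n in ("Robert", "Jurafsky", "Lee"):
        print(n, soundex(n))
```

Each step of the function corresponds to a stage that a transition network would have to handle, so the outputs can serve as test cases for the drawing.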
-
Here is the stemming algorithm for `-ly' words from Smith, George W. (1991),
Computers and Human Language (New York: Oxford University Press): 26, Fig. 4:
IF the word ends -ly
THEN remove the -ly and
     IF the remaining string does not end -i
     THEN IF string doesn't match a stem
          THEN add -le and
               IF string matches a stem
               THEN yes!
               ELSE end
          ELSE yes!
     ELSE change -i to y and
          IF string does not match a stem,
          THEN end
          ELSE yes!
ELSE IF string matches a stem
     ELSE end
-
Test it on the following words:
mainly, easily, ably, rely, belly, oily, early, fly
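One way to run the test is to transcribe the procedure literally and trace each word. The Python sketch below is a direct transcription under stated assumptions: `STEMS` is a hypothetical toy lexicon (not taken from Smith or from this assignment), `stem_ly` is an illustrative name, "yes!" is modeled by returning the matched stem, and "end" by returning `None`. The final branch of the procedure as given has no THEN clause, so the sketch assumes that a match there counts as success.

```python
# Literal transcription of the -ly procedure above, for experimentation only.
# STEMS is a hypothetical toy lexicon; replace it with a real stem list.
STEMS = {"main", "easy", "able", "rely", "bell", "oil", "ear", "early", "fly"}


def stem_ly(word, stems=STEMS):
    """Return the matched stem ("yes!") or None ("end")."""
    if word.endswith("ly"):
        rest = word[:-2]                      # remove the -ly
        if not rest.endswith("i"):
            if rest not in stems:
                candidate = rest + "le"       # add -le and retry
                return candidate if candidate in stems else None
            return rest
        candidate = rest[:-1] + "y"           # change -i to y
        return candidate if candidate in stems else None
    # Assumption: the missing THEN clause is read as "yes!".
    return word if word in stems else None


if __name__ == "__main__":
    words = ["mainly", "easily", "ably", "rely", "belly", "oily", "early", "fly"]
    for w in words:
        print(w, "->", stem_ly(w))
```

The results depend entirely on which stems the lexicon lists, so it is worth rerunning the loop with different lexicons before drawing conclusions about the procedure itself.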
-
Describe any problems you find, and write a new algorithm that fixes
them.
-
Play with Englex to find its capabilities and limitations. Discuss your
findings. Hand in sample runs to justify your conclusions.
DUE: MONDAY, FEBRUARY 14
Copyright © 2000 by
William J. Rapaport
(rapaport@cse.buffalo.edu)
file: 675w/hw4.14fb00.html