Natural Language Morphology Representation

dc.contributor.author: Roussillon, Mikael P. A.
dc.date.accessioned: 2023-03-01T18:27:02Z
dc.date.available: 2023-03-01T18:27:02Z
dc.date.issued: 2004
dc.description.abstract: This thesis defines Lightweight Morphology, an alternative to stemming that generates inflections and derivations of an input word using (1) a set of pattern-matching rules that produce morphological variants, (2) rules written in Java, and (3) an exception table that handles a language's exceptions. A language, LiteMorph, was developed to represent natural language morphology specifications for Lightweight Morphology. A French specification was created in LiteMorph, requiring 526 rules, 41 rule sets, and 16,842 exception table words. Exact-match querying, stemming, and Lightweight Morphology were compared. Using a differential recall measure on a collection of 533 documents (Hansard proceedings of the 36th Parliament of Canada), we showed that Lightweight Morphology has, on average, 3.9 times more queries retrieving fewer irrelevant documents than stemming. The French version has, on average, 2.5 times more queries retrieving more relevant documents than stemming. Two new measures of morphological consistency, reflexivity and transitivity, were defined and tested. The English and French LangLMs have reflexivity scores around 0.9 and transitivity scores under 0.09.
dc.description.copyright: Copyright © Mikael P. A. Roussillon, 2004
dc.identifier.uri: https://unbscholar.lib.unb.ca/handle/1882/14681
dc.rights: http://purl.org/coar/access_right/c_abf2
dc.subject.discipline: Computer Science
dc.title: Natural Language Morphology Representation
dc.type: technical report
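
The approach described in the abstract (pattern-matching rules plus an exception table that expand an input word into its morphological variants, rather than reducing it to a stem) can be illustrated with a minimal Python sketch. This is not LiteMorph syntax and not the thesis's implementation: the rule format, the sample English suffix rules, the exception entries, and the function name `variants` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical suffix rules: (pattern on the input word, endings to attach
# to the captured stem). Illustrative English rules, not taken from the thesis.
RULES = [
    (re.compile(r"(.+)ies$"), ["y"]),      # queries   -> query
    (re.compile(r"(.+)y$"), ["ies"]),      # query     -> queries
    (re.compile(r"(.+[^s])s$"), [""]),     # documents -> document
    (re.compile(r"(.+)$"), ["s"]),         # document  -> documents
]

# Hypothetical exception table: irregular words mapped to their variants.
EXCEPTIONS = {
    "mouse": ["mice"],
    "mice": ["mouse"],
}


def variants(word):
    """Return the word together with its generated variants, sorted."""
    word = word.lower()
    # The exception table takes priority over the pattern rules.
    if word in EXCEPTIONS:
        return sorted({word, *EXCEPTIONS[word]})
    # Otherwise apply the first rule whose pattern matches the word.
    for pattern, endings in RULES:
        match = pattern.match(word)
        if match:
            stem = match.group(1)
            return sorted({word, *(stem + ending for ending in endings)})
    return [word]
```

In a retrieval setting, the variants of each query term could be combined with OR, so a query for `query` would also match `queries`; how the thesis integrates variants into search is not stated in this record.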

Files

Original bundle
Name: item.pdf
Size: 903.63 KB
Format: Adobe Portable Document Format
