Determining if this word is used like that word: predicting usage similarity with supervised and unsupervised approaches
dc.contributor.advisor | Cook, Paul | |
dc.contributor.author | King, Milton | |
dc.date.accessioned | 2023-03-01T16:16:20Z | |
dc.date.available | 2023-03-01T16:16:20Z | |
dc.date.issued | 2017 | |
dc.date.updated | 2019-05-17T00:00:00Z | |
dc.description.abstract | Determining the meaning of a word in context is an important task for a variety of natural language processing applications, such as translating between languages, summarizing paragraphs, and completing phrases. Usage similarity (USim) is an approach to describing the meaning of a word in context that does not rely on a sense inventory -- a set of dictionary-like definitions. Instead, pairs of usages of a target word are rated in terms of their similarity on a scale. In this thesis, we evaluate unsupervised approaches to USim based on embeddings for words, contexts, and sentences, and achieve state-of-the-art results on two USim datasets. We further consider supervised approaches to USim, and find that they can improve the performance of our models. We also carry out a more detailed evaluation, examining performance on different parts of speech as well as the change in performance when using different features. Our models also perform competitively on two word sense induction tasks, which involve clustering instances of a word based on its meaning in context. | |
dc.description.copyright | ©Milton King, 2017 | |
dc.description.note | M.C.S. University of New Brunswick, Faculty of Computer Science, 2017. | |
dc.format | text/xml | |
dc.format.extent | viii, 75 pages | |
dc.format.medium | electronic | |
dc.identifier.other | Thesis 10088 | |
dc.identifier.uri | https://unbscholar.lib.unb.ca/handle/1882/13204 | |
dc.language.iso | en_CA | |
dc.publisher | University of New Brunswick | |
dc.rights | http://purl.org/coar/access_right/c_abf2 | |
dc.subject.classification | Word Sense Disambiguation. | |
dc.subject.discipline | Computer Science | |
dc.subject.lcsh | Natural language processing (Computer science) | |
dc.subject.lcsh | Learning classifier systems. | |
dc.subject.lcsh | Semantics -- Data processing. | |
dc.subject.lcsh | Ambiguity -- Data processing. | |
dc.subject.lcsh | Languages, Modern -- Idioms -- Data processing. | |
dc.subject.lcsh | Languages, Modern -- Terms and phrases -- Data processing. | |
dc.subject.lcsh | Word recognition -- Data processing. | |
dc.subject.lcsh | Supervised learning (Machine learning) | |
dc.subject.lcsh | Discourse analysis -- Data processing. | |
dc.subject.lcsh | Neural networks (Computer science) | |
dc.subject.lcsh | Computational linguistics. | |
dc.title | Determining if this word is used like that word: predicting usage similarity with supervised and unsupervised approaches | |
dc.type | master thesis | |
thesis.degree.discipline | Computer Science | |
thesis.degree.fullname | Master of Computer Science | |
thesis.degree.grantor | University of New Brunswick | |
thesis.degree.level | masters | |
thesis.degree.name | M.C.S. | |