Contextualized embeddings encode knowledge of English verb-noun combination idiomaticity

dc.contributor.advisor: Cook, Paul
dc.contributor.author: Fakharian, Samin
dc.date.accessioned: 2023-03-01T16:47:51Z
dc.date.available: 2023-03-01T16:47:51Z
dc.date.issued: 2021
dc.date.updated: 2023-03-01T15:03:17Z
dc.description.abstract: English verb-noun combinations (VNCs) consist of a verb with a noun in its direct object position, and can be used as idioms or as literal combinations (e.g., hit the road). As VNCs are commonly used in language and their meaning is often not predictable, they are an essential topic of research for NLP. In this study, we propose a supervised approach to distinguish idiomatic and literal usages of VNCs in a text based on contextualized representations, specifically BERT and RoBERTa. We show that this model using contextualized embeddings outperforms previous approaches, including when the model is tested on instances of VNC types that were not observed during training. We further consider the incorporation of linguistic knowledge of lexico-syntactic fixedness of VNCs into our model. Our findings indicate that contextualized embeddings capture this information.
dc.description.copyright: © Samin Fakharian, 2021
dc.description.note: Electronic Only.
dc.format: text/xml
dc.format.extent: viii, 59 pages
dc.format.medium: electronic
dc.identifier.uri: https://unbscholar.lib.unb.ca/handle/1882/14495
dc.language.iso: en_CA
dc.publisher: University of New Brunswick
dc.rights: http://purl.org/coar/access_right/c_abf2
dc.subject.discipline: Computer Science
dc.title: Contextualized embeddings encode knowledge of English verb-noun combination idiomaticity
dc.type: master thesis
thesis.degree.discipline: Computer Science
thesis.degree.fullname: Master of Computer Science
thesis.degree.grantor: University of New Brunswick
thesis.degree.level: masters
thesis.degree.name: M.C.S.

Files

Original bundle
Name: item.pdf
Size: 725.29 KB
Format: Adobe Portable Document Format