Contextualized embeddings encode knowledge of English verb-noun combination idiomaticity

Fakharian, Samin

Files

item.pdf (725.29 KB)

Date

2021

Authors

Fakharian, Samin

Publisher

University of New Brunswick

Abstract

English verb-noun combinations (VNCs) consist of a verb with a noun in its direct object position and can be used either as idioms or as literal combinations (e.g., hit the road). Because VNCs are common in language and their meaning is often not predictable, they are an essential topic of research for NLP. In this study, we propose a supervised approach to distinguishing idiomatic from literal usages of VNCs in text based on contextualized representations, specifically BERT and RoBERTa. We show that this model, using contextualized embeddings, outperforms previous approaches, including when it is tested on instances of VNC types that were not observed during training. We further consider incorporating linguistic knowledge of the lexico-syntactic fixedness of VNCs into our model. Our findings indicate that contextualized embeddings capture this information.
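The evaluation on VNC types not observed during training can be pictured as a split in which held-out types appear only in the test set. The following is a minimal sketch of such a split, not the thesis's actual code: the function name `split_by_vnc_type`, the dictionary fields, and the toy sentences are all assumptions made up for illustration.

```python
def split_by_vnc_type(instances, held_out_types):
    """Partition instances so that held-out VNC types occur only in the test set."""
    held_out = set(held_out_types)
    train, test = [], []
    for inst in instances:
        # An instance goes to the test set if and only if its type is held out.
        if inst["vnc_type"] in held_out:
            test.append(inst)
        else:
            train.append(inst)
    return train, test


# Toy, invented instances (hypothetical; not taken from any dataset).
instances = [
    {"vnc_type": "hit_road", "sentence": "We hit the road at dawn.", "label": "idiomatic"},
    {"vnc_type": "hit_road", "sentence": "The hail hit the road hard.", "label": "literal"},
    {"vnc_type": "kick_bucket", "sentence": "He kicked the bucket last year.", "label": "idiomatic"},
    {"vnc_type": "kick_bucket", "sentence": "She kicked the bucket over.", "label": "literal"},
]

train, test = split_by_vnc_type(instances, held_out_types=["kick_bucket"])
train_types = {inst["vnc_type"] for inst in train}
test_types = {inst["vnc_type"] for inst in test}
```

With a split like this, a classifier trained on `train` never sees any instance of the types in `test`, so its test accuracy reflects generalization to new VNC types rather than memorization of specific expressions.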

URI

https://unbscholar.lib.unb.ca/handle/1882/14495

Collections

Open Theses & Dissertations
