Contextualized embeddings encode knowledge of English verb-noun combination idiomaticity
Date
2021
Publisher
University of New Brunswick
Abstract
English verb-noun combinations (VNCs) consist of a verb with a noun in its direct object position and can be used either idiomatically or literally (e.g., hit the road). Because VNCs are common in language and their meaning is often not predictable, they are an essential topic of research for NLP. In this study, we propose a supervised approach that uses contextualized representations, specifically BERT and RoBERTa, to distinguish idiomatic from literal usages of VNCs in text. We show that this model outperforms previous approaches, including when it is tested on instances of VNC types that were not observed during training. We further consider incorporating linguistic knowledge of the lexico-syntactic fixedness of VNCs into our model. Our findings indicate that contextualized embeddings capture this information.