An automatic approach to discover lexical semantic differences in varieties of English


University of New Brunswick


The English language is not uniform. Speakers of English in different parts of the world can use the same word with different meanings. Investigating lexical semantic differences between varieties of English, such as American, Australian, British, and Canadian English, is an interesting area of research in computational linguistics. We use corpora of varieties of English to detect words whose meanings differ from one variety to another. The methods used in this work to automatically identify lexical variation are distributional semantic models, keyword measures, and word embedding models inspired by neural network language models. We determine whether word embedding models can detect lexical semantic differences between varieties of English better than distributional similarity approaches and keyword-based approaches. This study presents an important first step towards a robust application of word embeddings to variational linguistics. Our results indicate that word embeddings outperform the other methods in two out of three cases.
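To make the general idea of a distributional comparison concrete, the following is a minimal sketch and not the method used in this work. It assumes each variety is available as a list of tokenized sentences, builds raw co-occurrence vectors for each variety separately, and ranks shared words by the cosine similarity of their two vectors, so that words with the most dissimilar contexts come first. The function names, the window size, and the toy sentences are all hypothetical; a realistic setup would need large corpora and some form of association weighting.

```python
from collections import Counter
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            context = tokens[start:i] + tokens[i + 1:end]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_difference(corpus_a, corpus_b, window=2):
    """Return (word, similarity) pairs for shared words, least similar first."""
    vec_a = cooccurrence_vectors(corpus_a, window)
    vec_b = cooccurrence_vectors(corpus_b, window)
    shared = vec_a.keys() & vec_b.keys()
    scores = {w: cosine(vec_a[w], vec_b[w]) for w in shared}
    return sorted(scores.items(), key=lambda item: (item[1], item[0]))


# Hypothetical toy data: "biscuit" appears in different contexts in each variety.
variety_a = [["eat", "a", "sweet", "biscuit", "with", "tea"],
             ["drink", "hot", "tea"]]
variety_b = [["eat", "a", "savory", "biscuit", "with", "gravy"],
             ["drink", "hot", "tea"]]

ranking = rank_by_difference(variety_a, variety_b)
for word, similarity in ranking:
    print(f"{word}\t{similarity:.3f}")
```

On data this small, function words can rank as "different" purely by chance, which is one reason such comparisons are usually run on large corpora with frequency thresholds.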