Exploring multilingual and contextual properties in word representations from BERT
Master thesis
Published version
Permanent link: https://hdl.handle.net/11250/3017423
Publication date: 2022
Abstract
Nowadays, contextual language models can solve a wide range of language tasks such as text classification, question answering and machine translation. These tasks often require general language understanding, such as knowledge of how words relate to each other. This understanding is acquired through a pre-training stage where the model learns features from raw text data. However, we do not fully understand all the features the model learns during this pre-training stage. Does there exist information yet to be utilized? Can we make predictions more explainable? This thesis aims to extend our knowledge of which features a language model has acquired. We have chosen the model architecture BERT and have analyzed its word representations from two feature perspectives. The first perspective investigated similarities and dissimilarities between English and Norwegian word representations by evaluating their performance on a word retrieval task and a language detection task. The second perspective analyzed how a word representation changes if the word stands in the wrong context or if it is inferred through the model without any context.
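The abstract only summarizes the second perspective at a high level. As illustration, below is a minimal sketch of how a word's contextual BERT representation could be compared with its context-free representation. The Hugging Face transformers library, the bert-base-multilingual-cased checkpoint, the averaging of sub-token states and the example sentence are assumptions made for illustration, not the procedure used in the thesis.

    import torch
    import torch.nn.functional as F
    from transformers import AutoModel, AutoTokenizer

    # Assumed checkpoint; the thesis's exact BERT variant is not stated in the abstract.
    MODEL_NAME = "bert-base-multilingual-cased"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModel.from_pretrained(MODEL_NAME)
    model.eval()

    def word_representation(text: str, word: str) -> torch.Tensor:
        """Average the last-layer hidden states of the sub-tokens of `word` in `text`."""
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]   # (seq_len, hidden_dim)
        # Position of the word in the whitespace-split sentence (no punctuation assumed).
        word_index = text.split().index(word)
        token_positions = [i for i, w in enumerate(enc.word_ids()) if w == word_index]
        return hidden[token_positions].mean(dim=0)

    # Representation of "bank" inside a sentence (contextual) ...
    in_context = word_representation("She sat on the river bank", "bank")
    # ... versus the same word fed through the model on its own (no context).
    no_context = word_representation("bank", "bank")

    # Cosine similarity indicates how much the representation shifts with context.
    print(F.cosine_similarity(in_context, no_context, dim=0).item())

The same helper could also be applied to a sentence where the word appears in an incongruent context, so that the resulting representation can be compared against both the in-context and the context-free vectors.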