Presentation of Degree project E: Word Embeddings and Gender Stereotypes in Swedish and English

  • Date:
  • Location: Ångströmlaboratoriet, Lägerhyddsvägen 1
  • Lecturer: Rasmus Précenth
  • Contact person: David Sumpter
  • Seminar

Abstract: A word embedding is a representation of words as vectors. After Mikolov et al. introduced the word2vec algorithm in 2013, the popularity of word embeddings exploded. Not only was the new algorithm much more efficient, but it also produced embeddings that exhibited an interesting property allowing for reasoning with analogies such as "he is to king as she is to queen". Later it was discovered that the embeddings contained different types of biases, such as gender bias. We first look at how word embeddings are constructed and then investigate what it means mathematically to create an analogy between words. We then apply techniques previously used for English to Swedish. We find that Swedish can be represented just as well as English in an embedding and exhibits many of the same biases. We try to explain these results by understanding what word embeddings, word analogies and bias are from a mathematical viewpoint.
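
As a rough illustration of the analogy arithmetic mentioned in the abstract, the sketch below answers "he is to king as she is to ?" by computing king - he + she and picking the nearest remaining word by cosine similarity. It also projects words onto the he - she direction, which is one common way to quantify gender bias. The three-dimensional vectors and the helper names (`TOY_VECTORS`, `analogy`, `gender_projection`) are invented for this example; they do not come from the project or from a trained word2vec model, and the project's own method may differ.

```python
import math

# Hypothetical toy embedding: hand-made 3-d vectors, NOT trained word2vec output.
# The second coordinate plays the role of a "gender" axis for illustration only.
TOY_VECTORS = {
    "he":    [0.0,  1.0, 0.2],
    "she":   [0.0, -1.0, 0.2],
    "king":  [1.0,  1.0, 0.3],
    "queen": [1.0, -1.0, 0.3],
    "man":   [0.1,  1.0, 0.5],
    "woman": [0.1, -1.0, 0.5],
    "apple": [0.0,  0.0, 1.0],
}


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def cosine(u, v):
    return dot(u, v) / (norm(u) * norm(v))


def analogy(a, b, c, vectors=TOY_VECTORS):
    """Answer 'a is to b as c is to ?' using the vector b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the three query words, as is usual for analogy queries.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


def gender_projection(word, vectors=TOY_VECTORS):
    """Signed length of the word's projection onto the he - she direction."""
    direction = [h - s for h, s in zip(vectors["he"], vectors["she"])]
    return dot(vectors[word], direction) / norm(direction)


if __name__ == "__main__":
    print(analogy("he", "king", "she"))
    print(gender_projection("king"), gender_projection("apple"))
```

With real embeddings the same idea applies, but the vectors have hundreds of dimensions and the nearest word is only approximately the expected one.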