dc.contributor.author: Rørmann Olsen, Ida
dc.date.accessioned: 2018-12-13T11:15:40Z
dc.date.available: 2018-12-13T11:15:40Z
dc.date.issued: 2018-12-13
dc.identifier.uri: http://hdl.handle.net/2077/58385
dc.description.abstract: This thesis describes an approach to handling word senses in natural language processing. If we want language technologies to handle word ambiguity, machines need proper sense representations. In a case study on ambiguous Danish nouns, we examined the possibility of building an appropriate sense inventory by combining the distributional information of a word from a vector space model with knowledge-based information from a wordnet. We tested three sense representations in a word sense disambiguation task: first, the centroids (averages of word vectors) of selected wordnet synset information and synset members; second, the centroids of the wordnet sample sentences alone; and third, the centroids of unlabelled sample sentences clustered around the wordnet sample sentences. Finally, we tested the features of the cluster members and the evaluation data in supervised machine learning classifiers. In all experiments, the sense representations generally beat the random baseline significantly, but not the most-frequent-sense baseline. The representations built from selected wordnet synset information and synset members generally gave the best results, especially for target words with rich synset information. The machine learning classifiers significantly outperformed the sense representations on the word sense disambiguation task. The best classifiers were those trained and tested on either the clustered data or the evaluation data. We conclude that combining word embeddings with wordnet-associated data to build a proper sense representation seems promising. However, we suggest some improvements for future work, specifically to the information extracted from wordnet sample sentences. [sv]
dc.language.iso: eng [sv]
dc.subject: sense embeddings [sv]
dc.subject: wordnet [sv]
dc.subject: word2vec [sv]
dc.subject: word sense disambiguation [sv]
dc.subject: clustering [sv]
dc.subject: machine learning [sv]
dc.subject: supervised WSD [sv]
dc.title: Dealing with word ambiguity in NLP. Building appropriate sense representations for Danish sense tagging by combining word embeddings with wordnet senses [sv]
dc.type: Text
dc.setspec.uppsok: HumanitiesTheology
dc.type.svep: H2
dc.contributor.department: Göteborgs universitet/Institutionen för filosofi, lingvistik och vetenskapsteori [swe]
dc.contributor.department: Göteborg University/Department of Philosophy, Linguistics and Theory of Science [eng]
dc.type.degree: Student essay
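
The abstract describes sense representations built as centroids of word vectors and used for word sense disambiguation. The following is only a minimal sketch of that general idea, not the thesis's actual implementation. The function names, the cosine-similarity decision rule, the English "bank" example, and the two-dimensional toy vectors are all assumptions made for illustration; real experiments would use trained embeddings (the record lists word2vec as a subject) and Danish wordnet data.

```python
import math

def centroid(vectors):
    """Component-wise average of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("cannot average an empty list of vectors")
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def disambiguate(context_words, sense_words, embeddings):
    """Pick the sense whose centroid is most similar to the context centroid.

    sense_words maps a (hypothetical) sense label to the words that describe
    it, e.g. synset members or the words of a sample sentence.
    """
    def embed(words):
        # Words without an embedding are skipped.
        return centroid([embeddings[w] for w in words if w in embeddings])

    context_vector = embed(context_words)
    return max(
        sense_words,
        key=lambda sense: cosine(context_vector, embed(sense_words[sense])),
    )

# Toy 2-d embeddings, invented for this example (not trained vectors).
toy_embeddings = {
    "money": [1.0, 0.0], "loan": [0.9, 0.1], "account": [0.8, 0.2],
    "river": [0.0, 1.0], "water": [0.1, 0.9], "shore": [0.2, 0.8],
}

# Hypothetical sense inventory for the ambiguous noun "bank".
toy_senses = {
    "bank_finance": ["money", "loan"],
    "bank_river": ["river", "water"],
}

predicted = disambiguate(["account", "loan"], toy_senses, toy_embeddings)
print(predicted)
```

In this sketch each sense is represented by a single averaged vector, so disambiguation reduces to a nearest-centroid lookup. That keeps the example short, but it treats every word in a sense description as equally informative, which is one place where a real system might differ.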

