dc.contributor.author: Lindahl, Anna
dc.date.accessioned: 2018-01-15T10:45:46Z
dc.date.available: 2018-01-15T10:45:46Z
dc.date.issued: 2018-01-15
dc.identifier.uri: http://hdl.handle.net/2077/54947
dc.description.abstract: This work investigates how topic modeling can be applied to study the public discourse on Swedish housing policies. The data used to represent this discourse comes from both the Swedish parliament, the Riksdag, and Swedish news texts. The housing shortage and current housing crisis in Sweden make this a relevant area to study. Topic modeling is an unsupervised probabilistic method for finding topics in large collections of data. It is a popular method for examining public discourse; however, linguistic information is seldom included in its preprocessing steps. Therefore, this work also investigates what effect linguistically informed preprocessing has on topic modeling. Three types of linguistic information are selected and investigated: part of speech, dependency relations, and lemmatization. Based on these, filters are created for the data. The filters are applied to a test set (a subset of the original data), and a topic model is trained on each filtered version of this test set. The resulting topics from each model are evaluated both by humans and by two computational methods, perplexity and semantic coherence, and the results of the evaluation methods are compared. The semantic coherence measure named C_v is found to have a higher correlation with human ratings than the NPMI coherence measure. Perplexity is found not to correlate well with human ratings. Filtering the data based on part of speech is found to improve topic quality the most. Non-lemmatized topics are rated higher than lemmatized topics. Topics from the filters based on dependency relations receive low ratings. Based on the human ratings, an optimal model is chosen for each data set. The selected topic models are applied to the data, and the results are used to exemplify how they can be used for analysis. Topic modeling is found to be a suitable method for the intended analysis. [sv]
dc.language.iso: eng [sv]
dc.subject: topic modeling [sv]
dc.subject: public discourse [sv]
dc.subject: housing policies [sv]
dc.subject: LDA [sv]
dc.subject: semantic coherence measures [sv]
dc.subject: part of speech [sv]
dc.title: TOPIC MODELING FOR ANALYSIS OF PUBLIC DISCOURSE - Enriching topic modeling with linguistic information to analyze Swedish housing policies [sv]
dc.title.alternative: TOPIC MODELING FOR ANALYSIS OF PUBLIC DISCOURSE - Enriching topic modeling with linguistic information to analyze Swedish housing policies [sv]
dc.type: Text
dc.setspec.uppsok: HumanitiesTheology
dc.type.svep: H2
dc.contributor.department: Göteborgs universitet/Institutionen för filosofi, lingvistik och vetenskapsteori [swe]
dc.contributor.department: Göteborg University/Department of Philosophy, Linguistics and Theory of Science [eng]
dc.type.degree: Student essay

