IDENTIFYING HATE SPEECH IN SOCIAL MEDIA THROUGH CONTENT AND SOCIAL CONNECTIONS ANALYSIS

dc.contributor.authorStanišić, Milan
dc.contributor.departmentUniversity of Gothenburg / Department of Philosophy, Linguistics and Theory of Scienceeng
dc.contributor.departmentGöteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteoriswe
dc.date.accessioned2023-06-19T09:09:23Z
dc.date.available2023-06-19T09:09:23Z
dc.date.issued2023-06-19
dc.description.abstractHate speech is a problem that puts its targets at risk of serious harm. Because of the ubiquity of the internet and social media, it spreads fast and has a real influence on society, and so considerable research effort has gone into automatic hate speech detection. Despite major developments in the field, data scarcity and data characteristics often cause solutions reported in previous research to overfit the datasets used to train and test them, which results in dramatic performance losses and failures to generalize. This study addressed this issue: it sought a solution that would mitigate the overfitting effects originating from these challenges and enhance a language-based classifier with extra user information concerning one’s social connections. It compared two single-source models, one based on textual information and the other based on information concerning one’s social connections, and proposed a joint decision engine that selects the model whose class assignment was more certain for a given instance. Although the single-source models’ performance dropped drastically on test data, the joint decision engine succeeded in reducing some of the issues related to overfitting, improving the overall performance. This observation suggests that simple solutions might be effective in reducing model overfit and paves the way towards validating these findings.en
dc.identifier.urihttps://hdl.handle.net/2077/77243
dc.language.isoengen
dc.setspec.uppsokHumanitiesTheology
dc.subjecthate speech, social media, natural language processing, classificationen
dc.titleIDENTIFYING HATE SPEECH IN SOCIAL MEDIA THROUGH CONTENT AND SOCIAL CONNECTIONS ANALYSISen
dc.typeText
dc.type.degreeStudent essay
dc.type.uppsokH2

Files

Original bundle

Name:
Thesis_Milan_Stanisic_LT2215_official.pdf
Size:
363.01 KB
Format:
Adobe Portable Document Format
Description:
Master thesis

License bundle

Name:
license.txt
Size:
4.68 KB
Format:
Item-specific license agreed upon to submission
Description: