dc.contributor.author: Rodriguez, David
dc.contributor.author: Saynova, Denitsa
dc.date.accessioned: 2020-07-08T11:39:08Z
dc.date.available: 2020-07-08T11:39:08Z
dc.date.issued: 2020-07-08
dc.identifier.uri: http://hdl.handle.net/2077/65590
dc.description.abstract: This work examines the role of both cross-lingual zero-shot learning and data augmentation in detecting hate speech online for low resource set-ups. The proposed solutions for situations where the amount of labeled data is scarce are to use a language with more resources during training or to create synthetic data points. Cross-lingual zero-shot results suggest some knowledge transfer is occurring. However, results seem greatly influenced by the specific training data set selected. This is further supported by cross-data set experimentation within the same language, where results were also found to fluctuate based on training data without the need for cross-lingual transfer. Meanwhile, data augmentation methods show an improvement, especially for low amounts of data. Furthermore, a detailed discussion on how the proposed data augmentation techniques impact the data is presented in this work. [sv]
dc.language.iso: eng [sv]
dc.relation.ispartofseries: CSE 20-16 [sv]
dc.subject: machine learning [sv]
dc.subject: natural language processing [sv]
dc.subject: BERT [sv]
dc.subject: cross-lingual zeroshot learning [sv]
dc.subject: data augmentation [sv]
dc.subject: hate speech [sv]
dc.subject: classification [sv]
dc.subject: Twitter [sv]
dc.title: Machine Learning for Detecting Hate Speech in Low Resource Languages [sv]
dc.title.alternative: Machine Learning for Detecting Hate Speech in Low Resource Languages [sv]
dc.type: text
dc.setspec.uppsok: Technology
dc.type.uppsok: H2
dc.contributor.department: Göteborgs universitet/Institutionen för data- och informationsteknik [swe]
dc.contributor.department: University of Gothenburg/Department of Computer Science and Engineering [eng]
dc.type.degree: Student essay

