
dc.contributor.author: Paniskaki, Kyriaki
dc.contributor.author: Harsha Kadam, Sanjit
dc.date.accessioned: 2020-07-08T11:30:41Z
dc.date.available: 2020-07-08T11:30:41Z
dc.date.issued: 2020-07-08
dc.identifier.uri: http://hdl.handle.net/2077/65588
dc.description.abstract: This master’s thesis studies a multi-label text classification task on a small data set of short bilingual (English and Swedish) texts (emails). The data set contains 5800 emails distributed among 107 classes, and the majority of the emails contain both languages. Several models are applied to the task: Support Vector Machines (SVM), Gated Recurrent Units (GRU), Convolutional Neural Networks (CNN), Quasi-Recurrent Neural Networks (QRNN), and Transformers. The experiments show that, in terms of weighted-average F1 score, the SVM outperforms the other models with a score of 0.96, followed by the CNN with 0.89 and the QRNN with 0.80. [sv]
dc.language.iso: eng [sv]
dc.relation.ispartofseries: CSE 20-14 [sv]
dc.subject: natural language processing [sv]
dc.subject: machine learning [sv]
dc.subject: multi label text classification [sv]
dc.subject: deep neural networks [sv]
dc.subject: bilingual texts [sv]
dc.subject: emails [sv]
dc.subject: short texts [sv]
dc.title: Text analysis for email multi label classification [sv]
dc.title.alternative: Text analysis for email multi label classification [sv]
dc.type: text
dc.setspec.uppsok: Technology
dc.type.uppsok: H2
dc.contributor.department: Göteborgs universitet/Institutionen för data- och informationsteknik [swe]
dc.contributor.department: University of Gothenburg/Department of Computer Science and Engineering [eng]
dc.type.degree: Student essay

