Text analysis for email multi label classification

dc.contributor.author: Paniskaki, Kyriaki
dc.contributor.author: Harsha Kadam, Sanjit
dc.contributor.department: Göteborgs universitet/Institutionen för data- och informationsteknik [swe]
dc.contributor.department: University of Gothenburg/Department of Computer Science and Engineering [eng]
dc.date.accessioned: 2020-07-08T11:30:41Z
dc.date.available: 2020-07-08T11:30:41Z
dc.date.issued: 2020-07-08
dc.description.abstract: This master’s thesis studies a multi-label text classification task on a small data set of short bilingual (English and Swedish) texts (emails). Specifically, the data set contains 5800 emails distributed among 107 classes, with the special case that the majority of the emails contain both languages at the same time. To handle this task, different models have been employed: Support Vector Machines (SVM), Gated Recurrent Units (GRU), Convolutional Neural Networks (CNN), Quasi-Recurrent Neural Networks (QRNN), and Transformers. The experiments demonstrate that, in terms of weighted-average F1 score, the SVM outperforms the other models with a score of 0.96, followed by the CNN with 0.89 and the QRNN with 0.80. [sv]
dc.identifier.uri: http://hdl.handle.net/2077/65588
dc.language.iso: eng [sv]
dc.relation.ispartofseries: CSE 20-14 [sv]
dc.setspec.uppsok: Technology
dc.subject: natural language processing [sv]
dc.subject: machine learning [sv]
dc.subject: multi label text classification [sv]
dc.subject: deep neural networks [sv]
dc.subject: bilingual texts [sv]
dc.subject: emails [sv]
dc.subject: short texts [sv]
dc.title: Text analysis for email multi label classification [sv]
dc.title.alternative: Text analysis for email multi label classification [sv]
dc.type: text
dc.type.degree: Student essay
dc.type.uppsok: H2
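
The abstract compares models by weighted-average F1 score. As a rough illustration of what that metric measures in a multi-label setting, the sketch below computes a per-label F1 and averages it with each label's support (number of true occurrences) as the weight. This is an assumption about the usual definition, not a description of the thesis's evaluation code; the function name `weighted_f1` and the label names in the usage example are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-label F1 scores.

    y_true, y_pred: sequences of label sets, one set per email.
    Assumed definition; the thesis may compute its score differently.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for true_labels, pred_labels in zip(y_true, y_pred):
        for label in true_labels & pred_labels:
            tp[label] += 1  # label correctly predicted
        for label in pred_labels - true_labels:
            fp[label] += 1  # label predicted but not present
        for label in true_labels - pred_labels:
            fn[label] += 1  # label present but missed

    labels = set(tp) | set(fp) | set(fn)
    total_support = sum(tp[label] + fn[label] for label in labels)
    if total_support == 0:
        return 0.0

    score = 0.0
    for label in labels:
        support = tp[label] + fn[label]
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1 = 2 * tp[label] / denom if denom else 0.0
        score += support * f1
    return score / total_support


# Hypothetical labels, three emails.
y_true = [{"billing"}, {"billing", "support"}, {"support"}]
y_pred = [{"billing"}, {"billing"}, {"support", "billing"}]
print(round(weighted_f1(y_true, y_pred), 4))
```

Because the weights follow label frequency, frequent classes dominate the score; with 107 classes and 5800 emails, rare classes contribute little to a weighted average.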

Files

Original bundle

Name: gupea_2077_65588_1.pdf
Size: 2.03 MB
Format: Adobe Portable Document Format
Description: Master thesis

License bundle

Name: license.txt
Size: 876 B
Format: Item-specific license agreed upon to submission
