Text analysis for email multi label classification
| dc.contributor.author | Paniskaki, Kyriaki | |
| dc.contributor.author | Harsha Kadam, Sanjit | |
| dc.contributor.department | Göteborgs universitet/Institutionen för data- och informationsteknik | swe |
| dc.contributor.department | University of Gothenburg/Department of Computer Science and Engineering | eng |
| dc.date.accessioned | 2020-07-08T11:30:41Z | |
| dc.date.available | 2020-07-08T11:30:41Z | |
| dc.date.issued | 2020-07-08 | |
| dc.description.abstract | This master’s thesis studies a multi-label text classification task on a small data set of short bilingual texts (emails) in English and Swedish. The data set comprises 5800 emails distributed among 107 classes, and the majority of the emails contain both languages. Several models are applied to the task: Support Vector Machines (SVM), Gated Recurrent Units (GRU), Convolutional Neural Networks (CNN), Quasi-Recurrent Neural Networks (QRNN) and Transformers. The experiments show that, in terms of weighted average F1 score, the SVM outperforms the other models with a score of 0.96, followed by the CNN with 0.89 and the QRNN with 0.80. | sv |
| dc.identifier.uri | http://hdl.handle.net/2077/65588 | |
| dc.language.iso | eng | sv |
| dc.relation.ispartofseries | CSE 20-14 | sv |
| dc.setspec.uppsok | Technology | |
| dc.subject | natural language processing | sv |
| dc.subject | machine learning | sv |
| dc.subject | multi label text classification | sv |
| dc.subject | deep neural networks | sv |
| dc.subject | bilingual texts | sv |
| dc.subject | emails | sv |
| dc.subject | short texts | sv |
| dc.title | Text analysis for email multi label classification | sv |
| dc.title.alternative | Text analysis for email multi label classification | sv |
| dc.type | text | |
| dc.type.degree | Student essay | |
| dc.type.uppsok | H2 | |
Files
Original bundle
- Name: gupea_2077_65588_1.pdf
- Size: 2.03 MB
- Format: Adobe Portable Document Format
- Description: Master thesis