Automated Metadata Extraction for Job Advertisements
Abstract
This thesis is written in collaboration with the Swedish Public Employment Service
and aims to investigate methods and techniques to automatically extract metadata
from unstructured texts. The Swedish Public Employment Service collects job ads
from different private job boards. These ads consist only of a title and a description
and are therefore unstructured. Adding metadata to such job advertisements
will allow individuals to search and filter ads posted on Platsbanken, the Swedish
Public Employment Service’s website that advertises jobs.
The task is phrased as a classification problem with four binary sub-problems,
each capturing a different requirement: Education/No education, Experience/No
experience, Driving license/No driving license, and Work type (Full-time/Part-time).
Three different classification models are implemented and tested:
a baseline dictionary lookup, Support Vector Machine, and BERT. BERT achieves
the highest accuracy for the Education (0.90) and Experience (0.81) sub-problems, while SVM achieves the highest accuracy for Driving license (0.89) and Work type (0.87).
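To illustrate the simplest of the three models, the following is a minimal sketch of how a dictionary-lookup baseline for one sub-problem might look. The keyword lists, the `KEYWORDS` mapping, and the `requires` function are assumptions made for illustration only. The abstract does not specify the dictionary or matching rules used in the thesis.

```python
import re

# Hypothetical keyword lists (Swedish and English); not taken from the thesis.
KEYWORDS = {
    "driving_license": ["körkort", "driving license", "driving licence"],
    "experience": ["erfarenhet", "experience"],
}


def requires(ad_text: str, requirement: str) -> bool:
    """Return True if any keyword for the given requirement occurs in the ad text."""
    text = ad_text.lower()
    return any(
        # Match whole words only, so a keyword inside a longer word is ignored.
        re.search(r"\b" + re.escape(keyword) + r"\b", text) is not None
        for keyword in KEYWORDS[requirement]
    )


if __name__ == "__main__":
    print(requires("Vi söker en chaufför med B-körkort.", "driving_license"))
    print(requires("Kassapersonal sökes till butik.", "driving_license"))
```

A lookup of this kind ignores context, so a phrase such as "no experience required" would still be labelled as requiring experience. This is one plausible reason for comparing it against learned models such as SVM and BERT.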
Degree
Student essay
Date
2022-06-20
Author
Strauss, Evelina
Safdar, Usama
Keywords
Machine learning
NLP
text classification
computer science
project
thesis
Language
eng