Automated Metadata Extraction for Job Advertisements

Strauss, Evelina; Safdar, Usama

Abstract

This thesis, written in collaboration with the Swedish Public Employment Service, investigates methods and techniques for automatically extracting metadata from unstructured text. The Swedish Public Employment Service collects job ads from various private job boards; each ad consists of a title and a description and is therefore unstructured. Adding metadata to these job advertisements will allow individuals to search and filter the ads posted on Platsbanken, the Swedish Public Employment Service’s job-listing website. The task is framed as a set of binary classification sub-problems, each capturing a different requirement: Education/No education, Experience/No experience, Driving license/No driving license, and Full-time/Part-time (work type). Three classification models are implemented and tested: a baseline dictionary lookup, a Support Vector Machine (SVM), and BERT. BERT achieves the highest accuracy on the Education (0.90) and Experience (0.81) sub-problems, while the SVM achieves the highest accuracy on Driving license (0.89) and Work type (0.87).
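To illustrate the simplest of the three approaches, the following is a minimal sketch of what a dictionary-lookup baseline could look like. It is not the thesis's implementation: the keyword lists, the label names, and the `classify` function are hypothetical and chosen only for illustration, since the abstract does not specify the dictionaries used.

```python
import re

# Hypothetical keyword lists (assumption): the actual dictionaries used in
# the thesis are not given in the abstract.
KEYWORDS = {
    "driving_license": ["driving license", "driving licence", "körkort"],
    "experience": ["experience", "erfarenhet"],
    "education": ["degree", "education", "utbildning"],
}


def classify(title, description):
    """Return one boolean per sub-problem for a single job ad.

    A label is True if any of its keywords starts at a word boundary in
    the lowercased title + description, and False otherwise.
    """
    text = f"{title} {description}".lower()
    return {
        label: any(re.search(r"\b" + re.escape(kw), text) for kw in keywords)
        for label, keywords in KEYWORDS.items()
    }


if __name__ == "__main__":
    print(classify("Truck driver", "Körkort krävs."))
```

A lookup of this kind ignores context such as negation, so an ad stating that no experience is needed would still match the experience keyword. Limitations like this are one plausible reason for comparing such a baseline against learned models such as SVM and BERT.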

Degree

Student essay

Date

2022-06-20

Author

Strauss, Evelina

Safdar, Usama

Keywords

Machine learning

NLP

text classification

computer science

project

thesis

Language

eng
