Breaking Barriers: Enhancing Universal Dependency Parsing for Amharic Advancing NLP for A Low-Resource Language
Date
2025-06-19
Abstract
This study advances Amharic dependency parsing by expanding and refining the
existing Universal Dependencies (UD) Treebank (Seyoum, Miyao, and Mekonnen,
2018). As a morphologically rich and under-resourced language, Amharic poses
unique challenges in natural language processing (NLP), particularly in syntactic and
morphological parsing. Leveraging the UD framework and the transformer-based
toolkit, Trankit, this work achieves improved parsing accuracy, outperforming the
results obtained with UDPipe and Turku models by Seyoum, Miyao, and Mekonnen
(2020) across multiple evaluation metrics. This result demonstrates that dataset
augmentation, coupled with rigorous syntactic validation, can substantially enhance
parsing performance and offer a scalable pathway for NLP development in
low-resource languages.
Keywords
Language Technology, Amharic, Universal Dependencies, Low-Resource Language, Tokenization, Dependency Parsing, Treebank Expansion, Natural Language Processing