
Natural Language Processing Model for Maltese Syntax

Natural Language Processing-modell för Maltesisk Syntax

Abstract
The objective of this thesis is to create a Natural Language Processing model for the Maltese language. The ultimate goal is for the model to recognise syntactic features in Maltese, that is, the linguistic features of words and the relationships between the words in a sequence. The performance and accuracy of the Maltese model are compared with those of models for languages that have a great influence on the Maltese language. The output of the dependency parser is analysed linguistically to provide an in-depth account of the results obtained during training and testing. The model is also tested on unseen text to give a further indication of the accuracy of the machine learning algorithm. The syntax annotator is trained on manually annotated data, and its output is syntactic data produced by the dependency parser and the part-of-speech tagger. The model is built with the Python package spaCy. Since every language is unique, the linguistic rules are evaluated in order to teach the model the rules of the language being researched. The model is trained on the MUDTv1 corpus, developed by Slavomír Céplö for his PhD thesis. The results show that the Maltese syntax model achieved 91% part-of-speech tag accuracy, a 74% unlabelled attachment score and a 66% labelled attachment score. When the model was further tested on unseen, non-annotated text, the tag accuracy was 75% and the tokeniser accuracy was 99%.
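The three parser and tagger metrics reported in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions: tag accuracy is the share of tokens with the correct part-of-speech tag, the unlabelled attachment score (UAS) is the share of tokens with the correct head, and the labelled attachment score (LAS) additionally requires the correct dependency label. The function name `score` and the four-token example are hypothetical and are not taken from the thesis or from MUDTv1; the thesis itself may have computed these figures with spaCy's own evaluation tooling.

```python
from typing import Dict, List, Tuple

# Each token is (part-of-speech tag, head index, dependency label).
# Head indices are 1-based; 0 marks the root of the sentence.
Token = Tuple[str, int, str]


def score(gold: List[Token], pred: List[Token]) -> Dict[str, float]:
    """Return tag accuracy, UAS and LAS for one aligned token sequence."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sequences must be aligned")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    n = len(gold)
    pairs = list(zip(gold, pred))
    tag_hits = sum(g[0] == p[0] for g, p in pairs)
    # UAS counts a token as correct when its head matches.
    uas_hits = sum(g[1] == p[1] for g, p in pairs)
    # LAS counts a token as correct when head and label both match.
    las_hits = sum(g[1] == p[1] and g[2] == p[2] for g, p in pairs)
    return {"tag_acc": tag_hits / n, "uas": uas_hits / n, "las": las_hits / n}


# Hypothetical four-token example (illustrative only, not corpus data).
gold = [("DET", 2, "det"), ("NOUN", 3, "nsubj"), ("VERB", 0, "root"), ("NOUN", 3, "obj")]
pred = [("DET", 2, "det"), ("NOUN", 3, "obj"), ("VERB", 0, "root"), ("ADJ", 2, "amod")]

scores = score(gold, pred)
print(scores)
```

Because LAS adds a condition on top of UAS, it can never exceed UAS, which is consistent with the 74% and 66% figures reported above.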
Degree: Student essay
URI: http://hdl.handle.net/2077/69768
Collections: Magisteruppsatser (Master's theses) / Institutionen för filosofi, lingvistik och vetenskapsteori (Department of Philosophy, Linguistics and Theory of Science)
File: thesis (1.300Mb)
Date: 2021-10-08
Author: Attard, Greta
Keywords: natural language processing, syntax, spaCy, universal dependency, dependency parser, part-of-speech tagger, maltese nlp pipeline
Language: eng

DSpace software copyright © 2002-2016  DuraSpace