MULTI-CLASS GRAMMATICAL ERROR DETECTION Data, Benchmarks and Evaluation Metrics for the First Shared Task on Swedish L2 Data

dc.contributor.author: Casademont Moner, Judit
dc.contributor.department: University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science [eng]
dc.contributor.department: Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori [swe]
dc.date.accessioned: 2022-06-20T09:20:23Z
dc.date.available: 2022-06-20T09:20:23Z
dc.date.issued: 2022-06-20
dc.description.abstract: Grammatical Error Detection (GED) is a challenging NLP task that has received little research attention in recent years, especially for Swedish. However, with more L2 (second language) learners than ever before, educational resources for students, such as grammar-checking tools, are needed. With this in mind, this Master’s thesis presents the process of creating the Swedish MuClaGED (Multi-Class Grammatical Error Detection) dataset, which will be part of a Computational SLA (Second Language Acquisition) shared task and will likely be useful for the future development of multilingual grammatical error detection systems. Two main experiments are then performed on Swedish MuClaGED to test its capabilities and obtain baseline results in preparation for the shared task. The project also explores the advantages, disadvantages and functionality of hybrid error detection datasets by training GED models on a combination of original L2 learner data and text corrupted with artificially generated syntactic errors. [en]
dc.identifier.uri: https://hdl.handle.net/2077/72153
dc.language.iso: eng [en]
dc.setspec.uppsok: HumanitiesTheology
dc.subject: Grammatical Error Detection, L2 Swedish dataset, synthetic data, shared task [en]
dc.title: MULTI-CLASS GRAMMATICAL ERROR DETECTION Data, Benchmarks and Evaluation Metrics for the First Shared Task on Swedish L2 Data [en]
dc.type: Text
dc.type.degree: Student essay
dc.type.uppsok: H2

Files

Original bundle

Name: Mutilclass-grammatical-error-detection_Judit.pdf
Size: 496.52 KB
Format: Adobe Portable Document Format
Description: Master thesis
