Argumentation and agreement: Annotating and evaluating Swedish corpora for argumentation mining
| dc.contributor.author | Lindahl, Anna | |
| dc.date.accessioned | 2025-09-12T07:03:23Z | |
| dc.date.available | 2025-09-12T07:03:23Z | |
| dc.date.issued | 2025-09-12 | |
| dc.description.abstract | Argumentation occurs in all parts of life and is therefore studied across disciplines. In natural language processing, the field of argumentation mining aims to develop computational tools that automatically analyze and evaluate argumentation. Such tools have many uses, from automatically grading essays to identifying fallacies. Building such tools requires annotated data for both training and evaluation, especially with large language models (LLMs). Creating annotated datasets, however, presents significant challenges, arising not only from the complexity of argumentation itself but also from methodological questions such as how to represent argumentation and how to evaluate annotation quality. To create more resources and to investigate these challenges, in this thesis I explore several approaches to argumentation annotation. To this end, I also present a comprehensive survey of argumentation annotation. Three annotation approaches of varying complexity are explored: argumentation schemes applied to editorials, argumentative span annotation applied to online forum posts and political debates, and attitude annotation applied to tweets. The datasets thus represent a wide variety of genres and approaches. Attitude annotation in tweets showed the highest agreement among annotators, while annotation of editorials with argumentation schemes was the most challenging. In evaluating the annotations, several types of disagreement were identified. Most saliently, disagreement often occurred where multiple interpretations are possible, challenging agreement as the primary measure of quality. These findings demonstrate the need for more comprehensive evaluation approaches. I therefore present ways to evaluate beyond single agreement measures: agreement analysis from multiple angles, investigation of annotator patterns, and manual inspection of disagreements.
To further explore argumentation annotation, I investigate how two different LLMs annotate argumentation compared to human annotators. While the models exhibit annotation behavior similar to that of the human annotators, with comparable agreement levels and disagreement patterns, they agree more among themselves than the human annotators do. | sv |
| dc.gup.defencedate | 2025-10-07 | |
| dc.gup.defenceplace | Tuesday 7 October 2025, 13:15, lecture hall J330, Humanisten, Renströmsgatan 6 | sv |
| dc.gup.department | Department of Swedish, Multilingualism, Language Technology ; Institutionen för svenska, flerspråkighet och språkteknologi | sv |
| dc.gup.dissdb-fakultet | HF | |
| dc.gup.origin | Göteborgs universitet. Humanistiska fakulteten | swe |
| dc.gup.origin | University of Gothenburg. Faculty of Humanities | eng |
| dc.identifier.isbn | 978-91-8115-302-6 (tryckt) | |
| dc.identifier.isbn | 978-91-8115-303-3 (PDF) | |
| dc.identifier.uri | https://hdl.handle.net/2077/87769 | |
| dc.language.iso | eng | sv |
| dc.relation.haspart | 1. Anna Lindahl & Lars Borin. 2024. Annotation for computational argumentation analysis: issues and perspectives. Language and Linguistics Compass 18(1). e12505. DOI: https://doi.org/10.1111/lnc3.12505 | sv |
| dc.relation.haspart | 2. Anna Lindahl et al. 2019. Towards assessing argumentation annotation - a first step. In Proceedings of the 6th Workshop on Argument Mining, 177–186. Florence: ACL. DOI: 10.18653/v1/W19-4520 | sv |
| dc.relation.haspart | 3. Anna Lindahl. 2020. Annotating argumentation in Swedish social media. In Proceedings of the 7th Workshop on Argument Mining, 100–105. Online: ACL. URL: https://aclanthology.org/2020.argmining-1.11/ | sv |
| dc.relation.haspart | 4. Anna Lindahl. 2025a. Annotating attitude in Swedish political tweets. In Špela Arhar Holdt et al. (eds.), Proceedings of the Third Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2025), 106–110. Tallinn: University of Tartu Library. URL: https://aclanthology.org/2025.resourceful-1.24/ | sv |
| dc.relation.haspart | 5. Anna Lindahl. 2024. Disagreement in argumentation annotation. In Gavin Abercrombie et al. (eds.), Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives), 56–66. Turin: ELRA & ICCL. URL: https://aclanthology.org/2024.nlperspectives-1.6/ | sv |
| dc.relation.haspart | 6. Anna Lindahl. 2022. Do machines dream of artificial agreement? In Harry Bunt (ed.), Proceedings of the 18th Joint ACL - ISO Workshop on Interoperable Semantic Annotation within LREC2022, 71–75. Marseille: European Language Resources Association. URL: https://aclanthology.org/2022.isa-1.9/ | sv |
| dc.relation.haspart | 7. Anna Lindahl. 2025b. LLMs as annotators of argumentation | sv |
| dc.relation.ispartofseries | Data linguistica 33 | sv |
| dc.subject | natural language processing | sv |
| dc.subject | argumentation | sv |
| dc.subject | annotation | sv |
| dc.subject | argumentation mining | sv |
| dc.subject | annotation evaluation | sv |
| dc.subject | large language models | sv |
| dc.subject | machine learning | sv |
| dc.title | Argumentation and agreement: Annotating and evaluating Swedish corpora for argumentation mining | sv |
| dc.type | Text | |
| dc.type.degree | Doctor of Philosophy | sv |
| dc.type.svep | Doctoral thesis | eng |
Files
Original bundle (3 files)
- abstract.pdf (80.19 KB, Adobe Portable Document Format): abstract
- cover.pdf (514.16 KB, Adobe Portable Document Format): cover
- thesis.pdf (1.6 MB, Adobe Portable Document Format): Thesis
License bundle (1 file)
- license.txt (4.68 KB): Item-specific license agreed upon to submission