Creating Synthetic Dialogue Datasets for NLU Training. An Approach Using Large Language Models
| dc.contributor.author | Laszlo, Bogdan | |
| dc.contributor.department | University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science | eng |
| dc.contributor.department | Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori | swe |
| dc.date.accessioned | 2024-06-20T17:06:07Z | |
| dc.date.available | 2024-06-20T17:06:07Z | |
| dc.date.issued | 2024-06-20 | |
| dc.description.abstract | This thesis explores using the GPT-4 large language model to generate high-quality, diverse synthetic dialogue datasets for training Natural Language Understanding (NLU) models in task-oriented dialogue systems. By employing a schema-guided framework and prompt engineering, the study examines whether synthetic data can replace real-world data. The research focuses on three tasks: domain classification, active intent classification, and slot multi-labelling. Results show that while synthetic datasets can moderately match real-world data, issues such as quality and annotation inconsistency persist. | sv |
| dc.identifier.uri | https://hdl.handle.net/2077/81885 | |
| dc.language.iso | eng | sv |
| dc.setspec.uppsok | HumanitiesTheology | |
| dc.subject | Language Technology | sv |
| dc.title | Creating Synthetic Dialogue Datasets for NLU Training. An Approach Using Large Language Models | sv |
| dc.title.alternative | Creating Synthetic Dialogue Datasets for NLU Training. An Approach Using Large Language Models | sv |
| dc.type | Text | |
| dc.type.degree | Student essay | |
| dc.type.uppsok | H2 | |
Files
Original bundle
- Name:
- Creating Synthetic Dialogue Datasets for NLU Training_Revised.pdf
- Size:
- 1.06 MB
- Format:
- Adobe Portable Document Format