Masteruppsatser / Master in Language Technology
Permanent URI for this collection: https://gupea-staging.ub.gu.se/handle/2077/61848
Browsing Masteruppsatser / Master in Language Technology by Issue Date
Now showing 1 - 20 of 44
Fast visual grounding in interaction (2019-10-04)
Cano Santín, José Miguel; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
A big challenge for the development of situated agents is that they need to be capable of grounding real objects in their environment in representations with semantic meaning, so that these can be communicated to human agents in human language. de Graaf (2016) developed the KILLE framework, a static camera-based robot capable of learning objects and spatial relations from very few samples using image processing algorithms suited to few-sample learning. However, this framework has a major shortcoming: the time needed to recognise an object increased greatly as the system learned more objects, which motivates us to design a more efficient object recognition module. This project investigates a way to improve object recognition in the same robot framework using a neural network approach suitable for learning from very few image samples: Matching Networks (Vinyals et al., 2016). Our work also investigates how transfer learning from large datasets could be used to improve object recognition performance and to make learning faster, both of which are very important features for a robot that interacts online with humans. Therefore, we evaluate the performance of our situated agent with transfer learning from pre-trained models and different conversational strategies with a human tutor. Results show that the robot system is capable of training models very quickly and achieves very good object recognition performance for small domains.

Grounding of names in directory enquiries dialogue.
A corpus study of listener feedback behaviour (2019-10-16)
Bondarenko, Anastasia; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
This paper presents a new corpus of dialogues in the domain of directory enquiries. We describe its collection and annotation process and then analyse the feedback strategies employed by the dialogue participants, focusing mainly on grounding instances in the context of the transmission of names. We discuss our findings with regard to their implementation in dialogue systems as well as in comparison to previous corpus studies of feedback. Finally, we present a preliminary formalisation of the grounding process of names, using the finite-state approach to modelling grounding in dialogue proposed by Traum (1994).

IMPLEMENTING PERCEPTUAL SEMANTICS IN TYPE THEORY WITH RECORDS (TTR) (2019-11-18)
Matsson, Arild; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
Type Theory with Records (TTR) provides accounts of a wide range of semantic and linguistic phenomena in a single framework. This work proposes a TTR model of perception and language. Utilizing PyTTR, a Python implementation of TTR, the model is then implemented as an executable script. Compared with pure Python programming, TTR provides a transparent formal specification. The implementation is evaluated in a basic visual question answering (VQA) use case scenario.
The results show that an implementation of a TTR model can account for multi-modal knowledge representation and work in a VQA setting.

Kille: Learning Objects and Spatial Relations with Kinect (2020-08-26)
de Graaf, Erik; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
In order for humans to have meaningful interactions with a robotic system, the system should be capable of grounding semantic representations in their real-world representations, learning spatial relationships and communicating using spoken human language. End users need to be able to query the system about what objects it already has knowledge of, for more efficient learning. Such systems exist, but require large sample sizes, thus not allowing end users to teach the system more objects when needed. To overcome this problem, we developed a non-mobile system dubbed Kille, which uses a 3D camera, SIFT features and machine learning to allow a tutor to teach the system objects and spatial relations. The system is built upon the ROS (Robot Operating System) framework and uses the Opendial software as a dialogue system, for which ROS support was written as part of this project. We describe the hardware of the system, the software used and developed, and we evaluate its performance. Our results show that Kille performs well on small learning sets, considering the low sample size it uses to learn. In contrast to other approaches, we focus on learning by a tutor presenting objects and not by providing a dataset. Recognition of spatial relations works well; however, no definitive conclusions can be drawn.
This is largely due to the small number of participants and the subjective nature of spatial relations.

CORPUS EXPLORATION AND DIALOGUE SYSTEM DESIGN FOR A VIRTUAL LIBRARIAN (2020-09-01)
Li, Xiao; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
This thesis is part of the virtual librarian project for the City Library Gothenburg (Stadsbibliotek Göteborg), a public city library. The objective of the project is to develop a virtual librarian using machine learning and AI approaches to replace the current webchat solution, in order to reduce the workload of human librarians and increase satisfaction among patrons. This thesis offers a systematic approach to development practice based on small existing corpora, aimed at small and medium-sized institutions in which resources, especially technical development resources, are limited. The methods significantly reduce the workload on the side of the principal through requirement analysis with a narrative interview; topic-session based annotation with an expandable tag set and without detailed annotation guidelines, which requires less prior linguistic knowledge and training; and intent identification through corpus analysis with the assignment of priorities. Furthermore, this thesis offers a classification of intents based on patterns of system behavior, which simplifies the formation of a complete intent list. Since Rasa is the preliminarily prioritized platform for the implementation of the virtual librarian, this thesis also includes a short competitive product analysis of the dialogue systems in the Rasa showcase.
In the end, some technical suggestions for the Rasa implementation are given, reflecting the requirements of the City Library Gothenburg.

Determining linguistic predictor for the classification of subjective cognitive impairment and mild cognitive impairment using machine learning (2020-09-01)
Wang, Tian; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
Introduction: Mild Cognitive Impairment (MCI) is a neurological condition characterized by cognitive decline greater than expected for an individual's age and education level. Subjective Cognitive Impairment (SCI) is a self-reported decline in cognitive abilities that is not clinically identified as MCI. Individuals with MCI remain functional in their daily activities (Petersen et al., 1999) and are characterized by different deterioration rates depending on the evaluation methods employed. More than 50% of these individuals will develop Alzheimer’s Disease (AD) within the following five years; however, several will remain stable and never develop AD (Gauthier et al., 2006; Petersen et al., 1999; Petersen et al., 2017). Although there is no cure for AD, the early identification of individuals with MCI can enable treatments to delay the progression of the condition (Zucchella et al., 2018). Therefore, it is of paramount importance to develop reliable, objective diagnostic methods for cognitive impairment that can be conducted at primary care centers and memory clinics to determine whether an individual should seek further professional advice. Methodology: 90 individuals participated in the study: 23 SCI patients, 31 MCI patients and 36 healthy controls (HC).
All participants were between 50 and 79 years old; had Swedish as their first and only language before the age of 5; had a similar length of education; had no stroke or brain tumor; and had recent neuropsychological test results available for assessment. Connected speech data were elicited with the Cookie Theft picture description task (Goodglass & Kaplan, 1983), a standardized test employed in language therapy and evaluation sessions. Participants were recorded and the recordings were manually transcribed into text. The study refined the transcriptions of the recordings, defined several linguistic features, and employed two different annotation tools (Sparv and Parsey Universal) and two statistical measures (accuracy and area under the Receiver Operating Characteristic curve (ROC AUC)) to select the superior feature set for the classification tasks. As a by-product, an open-source Swedish text annotation tool was deployed to benefit the linguistic research community. A novel feature engineering approach called SVC-Randomized Recursive Feature Elimination (SVC-RRFE) was introduced to select the best features using a Support Vector Machine, binary search and group k-fold cross-validation. In the end, the 160, 150 and 98 selected features were applied and evaluated in feed-forward neural networks using group 10-fold cross-validation. Results: Using neural networks (NN) with group 10-fold cross-validation, we reached 76% mean accuracy, 73% mean ROC AUC and 0.47 mean Matthews correlation coefficient for MCI detection; 71% mean accuracy, 71% mean ROC AUC and 0.4 mean Matthews correlation coefficient for SCI detection; and 75% mean accuracy, 71% mean ROC AUC and 0.39 mean Matthews correlation coefficient for differentiating MCI speakers from SCI speakers. The highest validation accuracies for the three models were 83%, 79% and 84%, respectively.
The best features for classifying MCI individuals and HC were mean length of word, words beginning with [mɐ] and words with [ɪp] at the second and third position; the top three features for identifying SCI individuals and HC were words with [ɑːɡ] at the second and third position, words beginning with [mɑː] and words beginning with [jøː]; and MLU, words beginning with [dɛ] and words with [ɑːd] at the second and third position were the most important features for differentiating MCI and SCI individuals. Discussion: Phonology was impaired in patients with MCI and in SCI subjects. Specifically, individuals with MCI showed more self-interruptions, produced more long vowels than those with SCI, more unrounded vowels than rounded ones and more stops followed by back vowels during the picture description task. Individuals with SCI tended to produce longer utterances than HC and MCI individuals, and more nasal consonants followed by close front vowels. Sparv-annotated data performed better during feature selection, while the data analyzed by Parsey Universal reached better results with neural networks. The study showed that feed-forward neural networks can be used to build models to identify people with MCI and people with SCI. By employing phonological features, this study provided improved classification of individuals with MCI and additional objective markers that can be employed to identify these individuals for treatment.

An Experimental Evaluation of Grounding Strategies for Conversational Agents (2020-09-11)
Zou, Yiqian; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
With the continuous development of technology, dialogue system technology is increasingly entering people's lives. Grounding is also becoming more and more important for dialogue systems. It is important to choose a suitable grounding strategy for a conversational agent.
Two grounding strategies are compared in this article: explicit feedback and implicit feedback. The explicit feedback in this article differs from interrogative explicit feedback: it has been modified so that the system says “Ok, x” in response to utterance x. The aim of this paper is to compare the two grounding strategies and find out which one is better. An additional research question is how users respond to false feedback. In order to draw a conclusion, a dialogue system was implemented. This article uses a mix of quantitative and qualitative methods. Questionnaires are used to investigate the subjective judgments of participants, who rate the system on two aspects, naturalness and ease. The system was officially available from June 8th to 14th. The data were analyzed with a t-test and the results are presented in this article with diagrams. Most participants mentioned in the evaluation that they preferred the system with explicit feedback. According to the average scores, the system with explicit feedback is more natural and easier to communicate with than the system with implicit feedback. However, there is no significant difference between the two grounding strategies according to the results of the t-test. This does not mean that there are no differences, but that such differences may not be detectable because of the small sample size. In addition, users' responses to wrong feedback are summarized in this article. Four kinds of reactions are described: hesitation, repetition, pointing out the wrong feedback, and correction.

WANNA BE ON TOP?
The Hyperparameter Search for Semantic Change's Next Top Model (2021-09-22)
Viloria, Kate; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
Lexical semantic change (LSC) detection through the use of diachronic corpora and computational methods continues to be a prevalent research area in language change (Tahmasebi et al., 2018). However, to the best of our knowledge, there has not yet been extensive work examining the models being trained and establishing a foundation for which hyperparameter settings yield the best results. In this thesis, a large-scale hyperparameter search is conducted using the SemEval-2020 Task 1 dataset, which includes English, German, Swedish, and Latin. Alongside model hyperparameters, different algorithms (Word2Vec and FastText) and alignment methods (Orthogonal Procrustes and Incremental Training) were also included. The hyperparameters evaluated are: number of training epochs, vector dimension, frequency threshold, and shared vocabulary size for the Orthogonal Procrustes alignment method. By amalgamating all of the results and assessing how model performance is affected when one hyperparameter is changed, considerations that must be made before training a model were substantiated. This research concludes that improvements in performance decrease significantly after 50 epochs of training and that the typical choice of 300 dimensions for vectors (based on English best practices in NLP) does not necessarily apply to other languages.
It is also shown that choices of vector dimension, frequency threshold, and shared vocabulary size depend on the language in question, corpus size, and text genre composition.

SPEECH SYNTHESIS AND RECOGNITION FOR A LOW-RESOURCE LANGUAGE. Connecting TTS and ASR for mutual benefit (2021-09-23)
Makashova, Liliia; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
Speech synthesis (text-to-speech, TTS) and speech recognition (automatic speech recognition, ASR) are the NLP technologies that are the least available for low-resource and indigenous languages. Lack of computational and data resources is the major obstacle when it comes to the development of linguistic tools for these languages. We present a framework that does not require enormous GPU and target data resources and that guarantees reasonably good performance for the end product. In this work we establish a dual connection between TTS and ASR models and make them learn from each other in a low-resource setup. This project, being the first open-source implementation of such a bidirectional algorithm, leverages the power of open-source projects for the benefit of indigenous languages.
We release the first ever functioning ASR tool for the North Sámi language along with a competitive TTS technology, which meets the demand of the North Sámi community and contributes globally to the further development of AI tools for low-resource languages.

RESPONSIBLE WOMEN AND ANALYTICAL MEN. Developing Swedish Gendered Lexica for Detection of Gender Bias in Job Advertisements (2021-09-27)
Hansson, Saga; Mavromatakis, Konstantinos; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
In this research, we examine gender bias in Swedish job advertisements with the use of gendered lexica. Gendered lexica can be employed to ascertain whether job advertisements are written with words that are associated with more masculine or feminine traits. The main purpose of this study is to investigate translating English gendered lexica using different machine translation methods: Google Translate, frequency-based and word-embedding-based. In the absence of a gold standard, we evaluated the translations by conducting quantitative and qualitative experiments. The embedding-based translation was evaluated as the most consistent method for the development of gendered lexica. Further testing of the embedding-based lexicon showed that Swedish job advertisements seem to be written with more feminine-coded words, regardless of the gender of the majority of the workers in the advertised occupation.
Advertisements for technical universities, specifically, tend to be written with more masculine-coded words, while advertisements for universities that offer a wider range of education contain more feminine-coded words.

EMBODIED QUESTION ANSWERING IN ROBOTIC ENVIRONMENT. Automatic generation of a synthetic question-answer data-set (2021-11-12)
Aruqi, Ali; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
Embodied question answering is the task of asking a robot about objects in a 3D environment. The robot has to navigate the environment, find the entities in question, and then stop to answer the question. The answering system consists of navigation and visual-question-answering components. The agent is trained on a synthetic data-set of question-answers and navigational paths called EQA-MP3D. Each question in the data-set is an executable function that can be run in the environment to yield an answer. EQA-MP3D includes only two types of questions: color and location questions. The types of questions asked could be considered unnatural, and we observe that the question-answers contain biases. Our work extends the data-set by automatically generating size and spatial questions. We generate a total of 19 207 question-answers for training and 3 186 question-answers for validation.
Our data extension is intended to train the system to answer more question types and enhance the system’s overall ability to perform the task.

Prosody and emotion: Towards the development of an emotional agent. Emotional evaluation of news reports: production and perception experiments (2022-01-20)
Tumma, Liina; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
There is a recognised need for more research on emotion recognition from speech, and a clear and well-defined methodology in this area is still lacking. Most studies in the field of emotional speech recognition and classification focus on acted speech as the data source; consequently, other methods that capture more natural speech are left aside. This study presents a novel perspective on corpus collection and emotion classification techniques. Emotion from the perspective of an evaluation device is also highlighted. The aim of the study is to investigate the possibility of evoking happy, neutral and sad emotions with news reports, and to analyse the acoustic predictors that play a crucial role in the prediction of these emotions. The thesis is based on three experiments: i. corpus collection by eliciting sad, happy and neutral emotional speech through news, and subsequent statistical analysis of this data (mixed-effects models); ii. automatic classification of these emotions by training Decision Tree (C5.0) classification models; iii. a perception experiment to verify the findings from the previous experiments. Speech data obtained from 20 native speakers of Swedish is analysed. The participants were asked to summarize and give their personal opinion on 36 news reports about happy and sad events and to read aloud 12 short neutral Wikipedia descriptions.
To investigate emotion as an evaluation device, sad news reports are categorized following the Brandt Line division between the Global North (developed countries) and the Global South (developing countries). Results indicate that news reports are suitable as stimuli for evoking emotional responses in Swedish speakers. The Decision Tree (DT) classifier reached an average accuracy of 70.88% (tested on validation data from 10-fold cross-validation). Final velocity, relative location of the F0 peak, time of the F0 peak and mean intensity are crucial attributes for the classifier. The perception experiment also showed that Swedish speakers are capable of identifying and classifying these emotions, although machine learning outperforms the human evaluation. The findings do not show any clear difference between South and North news reports, and therefore no evidence regarding emotion as an evaluation device in the case of South and North news is found. The findings can contribute to a better understanding of evaluation as a speech device, and the study also explores other possibilities regarding corpus collection and classification methods, such as using news reports as emotion stimuli and a Decision Tree algorithm for classification. The research results represent a further step towards developing an emotional agent.

NEURAL MACHINE TRANSLATION FROM NORTH SÁMI TO SWEDISH (2022-03-08)
Pfau, Merle; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
Neural machine translation is a method used in automatic translation that makes use of artificial neural networks. A single model takes an input sequence and predicts the most likely output sequence of words after being trained on parallel data. In this master thesis, a neural machine translation model for the language pair North Sámi - Swedish was developed and trained.
Since no parallel corpus exists for the two languages, a Norwegian and North Sámi data set of about 225,000 sentences was translated to Swedish and used as training data. The model architecture is based on the transformer of Vaswani et al. (2017), which is the state-of-the-art approach when enough parallel data is available. Following the techniques of Sennrich et al. (2016) for combining methods to lower the amount of necessary data, a BLEU score of 44.11 was achieved. Due to the relatively small amount of available parallel data, techniques for incorporating monolingual bitext and creating additional synthetic data were implemented, but did not result in any further improvements.

MULTI-CLASS GRAMMATICAL ERROR DETECTION. Data, Benchmarks and Evaluation Metrics for the First Shared Task on Swedish L2 Data (2022-06-20)
Casademont Moner, Judit; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
Grammatical Error Detection (GED) is a challenging NLP task that has not received much research attention in recent years, especially for the Swedish language. However, in a world where there are more L2 (second language) learners than ever before, educational resources for students, such as grammar-checking tools, are needed. With this in mind, this Master’s thesis presents the generation process of the Swedish MuClaGED (Multi-Class Grammatical Error Detection) dataset, which is going to be part of a Computational SLA (Second Language Acquisition) shared task and will likely be useful for the future production of multilingual grammatical error detection systems. Once Swedish MuClaGED has been obtained, two main experiments are performed on it to test its capabilities and obtain baseline results in preparation for the aforementioned shared task.
Moreover, this project also aims to explore the advantages, disadvantages and functionalities of creating hybrid error detection datasets by experimenting with GED models trained on a combination of original L2 learners’ data and text corrupted with artificially generated syntactic errors.

THE LINGUISTIC STRUCTURE OF WIKIPEDIA. A multilingual analysis and comparison of the language used in Wikipedia articles (2022-06-20)
Grau Francitorra, Patricia; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
Wikipedia is a great source of knowledge, but due to its open-collaboration nature, it has some limitations, namely the uneven distribution of content, the low overlap in topic coverage, the differences in the comprehensiveness of articles, and the low number of editors. For this reason, the Abstract Wikipedia project has been created; its objective is to construct language-independent (abstract) articles that can be rendered in any language. In this thesis, we have computationally analysed the language used in Wikipedia in order to find similarities between the language used in different articles. To do so, we have syntactically parsed Wikipedia articles in different languages using UDPipe 2.0 and gathered the languages’ recurrent syntactic patterns using Grammatical Framework’s GF-UD. Then, we have compared the analyses with cosine similarity in two ways: based on dependency relations and based on linguistic patterns. We have seen that there is a basis for the Abstract Wikipedia project: there are syntactic similarities not only within one language, but also across multiple languages. In addition, we have found that semantically related topics have a higher similarity than those which are not.
Finally, we have gathered the syntactic patterns of every language and compared them, which can constitute the basis for the creation of the Renderers for Abstract Wikipedia.

LAUGHTER PREDICTION IN TEXT BASED DIALOGUES. Predicting Laughter using Transformer-Based Models (2022-06-20)
Kumar Battula, Hemanth; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
In this paper we attempt to predict laughter, and assess prediction performance, using a BERT model (Devlin et al., 2019) and a BERT model fine-tuned on the OpenSubtitles dataset, with and without considering dialogue-act classes as well as a sliding window of dialogues. We hypothesize that fine-tuning BERT on OpenSubtitles might increase performance. Our results will be compared with those of Maraev et al. (2021a), who address the prediction of actual laughs in dialogue with various deep learning models, namely recurrent neural networks (RNN), convolutional neural networks (CNN) and combinations of these. The Switchboard Dialogue Act Corpus (SWDA; Jurafsky et al., 1997a), which consists of US English phone conversations in which two participants who are not familiar with each other discuss a potentially controversial subject, such as gun control or the school system, is first processed to make it suitable for the BERT model. We then analyze dialogue acts within the Switchboard Dialogue Act Corpus and their collocation with laughter, and supply some qualitative insights. SWDA is tagged with a collection of 220 dialogue act tags which, following Jurafsky et al. (1997b), we cluster into a smaller set of 42 tags.
The major purpose of this research is to show that a BERT model would outperform the Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) models presented in the IWSDS publication.

EVALUATING CONFIDENCE ESTIMATION IN NLU FOR DIALOGUE SYSTEMS (2022-06-20)
Khojah, Ranim; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
Background: Natural Language Understanding (NLU) is an important component of Dialogue Systems (DS) that makes human utterances understandable to machines. A central aspect of NLU is intent classification. In intent classification, an NLU receives a user utterance and outputs a list of N ranked hypotheses (an N-best list) of the predicted intent, along with a confidence estimate (a real number between 0 and 1) assigned to each hypothesis. Objectives: In this study, we perform an in-depth evaluation of the confidence estimation of 5 NLUs, namely Watson Assistant, Language Understanding Intelligent Service (LUIS), Snips.ai and Rasa in two different configurations (Sklearn and DIET). We measure calibration on two levels: rank level (results for specific ranks) and model level (aggregated results across ranks), as well as performance on the model level. Calibration here refers to the relation between confidence estimates and true likelihood, i.e. how useful the confidence estimate associated with a certain hypothesis is for assessing its likelihood of being correct. Methodology: We conduct an exploratory case study on the NLUs. We train the NLUs using a subset of a multi-domain dataset proposed by Liu et al. (2021) on intent classification tasks. We assess the calibration of the NLUs on the model and rank levels using reliability diagrams and the correlation coefficient with respect to instance-level accuracy, while we measure performance through accuracy and F1-score.
Results: The evaluation results show that on the model level, the best calibrated NLU is Rasa-Sklearn and the least calibrated is Snips, while Watson is the best performing NLU and Rasa-Sklearn the worst performing. The rank-level results are consistent with the model-level results. However, on lower ranks, some measures become less informative due to low variation in the confidence estimates. Conclusion: Our findings indicate that when choosing an NLU for a dialogue system, there is a trade-off between calibration and performance; that is, a well-performing NLU is not necessarily well calibrated, and vice versa. While the chosen calibration metrics are clearly useful, we also note some limitations and conclude that further investigation is needed to find the optimal metric of calibration. It should also be noted that, to some extent, our results rest on the assumption that the chosen calibration metrics are suitable for our purposes.

DIALOGUE STRATEGIES FOR VOCABULARY LEARNING. User Initiative in Dialogue Systems for Second Language Learning (2022-10-06)
Carrión del Fresno, Andrea; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
When building efficient dialogue systems, a major challenge is recovering from miscommunication. Analyzing human-human interaction leads to the discovery of repair strategies that contribute to conversational systems able to communicate in a natural and effective way. This thesis aims to identify recurring dialogue strategies (conversational patterns) commonly used by second language (L2) learners when acquiring new vocabulary, by analyzing second language learner corpora.
We further provide a simple theoretical model, along with an implementation of it that reproduces the most frequent patterns observed in our data and was later embedded in a vocabulary training activity designed for the second language classroom. We found instances of production problems and code-switching occurring together, caused by poor linguistic competence in the target language. We showed that learners ask (either explicitly or implicitly) for the L2 word or expression they need and, once it is provided, repeat it as part of the strategy for acquiring new L2 vocabulary. We believe the findings of this thesis can be of value to dialogue systems for second language learning. Future work includes an extended implementation and exploring larger amounts of data.
Item GO BACK TO /R/CONSPIRACY: AN EXPLORATION OF METHODS FOR THE AUTOMATIC DETECTION OF AFFECTIVE POLARIZATION ON REDDIT (2022-10-06) Båstedt, Klara; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
Affective polarization – the tendency to hold negative attitudes towards an out-group and biased, positive attitudes towards an in-group – is a hot topic in research and public debate. There are concerns that news media’s tendency to focus on political conflict rather than issues is causing polarization to increase, but researchers lack methods to automatically assess levels of polarization in online debates and correlate them with news articles. This study examines the appropriateness of using Reddit Karma, word embeddings and existing NLP tools for automatic detection of affective polarization in discussions on Reddit. To achieve this, we collect Reddit discussions, manually annotate them for expressions of affective polarization, and fit multiple logistic regression models on the discussion features and metadata.
We find a strong correlation between the probability of encountering expressions of affective polarization in the data and both word embeddings and the confidence scores of toxicity detection. We also find that patterns in the comment votes are good predictors of disagreement in the discussions. Moreover, we present a data set of Reddit discussions about topics related to the COVID-19 pandemic, which can be used in further attempts to automatically detect affective polarization in interactive discourse on social media.
Item EVALUATING THE EXTENT OF ETHNIC BIASES IN FINBERT AND EXPLORING DEBIASING TECHNIQUES (2022-10-07) Suvanto, Minerva; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
Language models are becoming increasingly popular. These models can contain social biases about various groups of people. The reproduction of biased beliefs can have harmful impacts on the groups they are about. We explore the extent of ethnic biases in the Finnish language model FinBERT. Our work focuses on biases about minority groups in Finland, and we evaluate the extent of biases about the ethnic groups Roma, Finnish-Swedish, Sámi, Somali and Russian. To quantify the extent of biases, we use a template-based approach that calculates association scores between ethnicities and biased terms. We find that the model produces biased outcomes about the minority groups Roma and Somali. To mitigate the detected biases, we attempt to debias FinBERT using dropout regularization and self-debiasing. These two debiasing techniques do not produce satisfactory results, and we conclude that debiasing ethnic biases and Finnish language models requires further research.