Browsing by Author "Hakkarainen, Stanislav"


Automatic Idiomatic Expression Detection. Comparison Between GPT-4 and Gemini Pro Prompt Engineering & LSTM-RNN Construction
(2024-06-18) Hakkarainen, Stanislav; Engelbrecht, Katharina; University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science; Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
This thesis explores the detection of non-literal phrases using Large Language Models (LLMs), namely GPT-4 and Gemini Pro, and Recurrent Neural Networks (RNNs), in particular LSTM and BiLSTM models. A series of individual experiments and cross-validations showed that both LLMs identified idiomatic expressions satisfactorily, with some variance across sentences. Judged on precision and recall, Gemini Pro slightly outperformed GPT-4 in the separate validation: Gemini Pro's best test scores were 95% precision and 81% recall, while GPT-4's best were 87% precision and 88% recall. During cross-validation, however, GPT-4 improved to 88% precision and 90% recall, whereas Gemini Pro's precision fell to 83% while its recall rose to 95%. Among the RNNs, the BiLSTM-RNN outperformed the LSTM-RNN on the idiom detection task by a significant margin, scoring 95% precision and 90% recall against 79% precision and 25% recall, which indicates that a bidirectional approach is better suited to sequential data such as idiomatic expressions. In summary, the results show that specialized model architectures such as LSTM models are preferable to general-purpose LLMs for idiomatic expression detection.
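For readers unfamiliar with the two metrics quoted above, the following is a minimal Python sketch of how precision and recall are computed for a binary idiom detector, and of how a precision/recall pair can be folded into a single F1 score. The `Counts` class, the function names, and the example counts are hypothetical and are not taken from the thesis. The abstract does not report F1; it is derived here from the quoted percentages purely as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion counts for a binary 'idiomatic vs. literal' detector (hypothetical)."""
    tp: int  # idiomatic sentences correctly flagged as idiomatic
    fp: int  # literal sentences wrongly flagged as idiomatic
    fn: int  # idiomatic sentences the detector missed


def precision(c: Counts) -> float:
    """Share of flagged sentences that really are idiomatic."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: Counts) -> float:
    """Share of truly idiomatic sentences that were flagged."""
    actual = c.tp + c.fn
    return c.tp / actual if actual else 0.0


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# (precision, recall) pairs as quoted in the abstract, expressed as fractions.
REPORTED = {
    "Gemini Pro (separate validation)": (0.95, 0.81),
    "GPT-4 (separate validation)": (0.87, 0.88),
    "GPT-4 (cross-validation)": (0.88, 0.90),
    "Gemini Pro (cross-validation)": (0.83, 0.95),
    "BiLSTM-RNN": (0.95, 0.90),
    "LSTM-RNN": (0.79, 0.25),
}

if __name__ == "__main__":
    # Made-up counts, only to show how the two metrics are derived.
    example = Counts(tp=18, fp=2, fn=6)
    print(f"example precision={precision(example):.2f} recall={recall(example):.2f}")

    # F1 derived from the quoted pairs; not a figure reported by the thesis.
    for name, (p, r) in REPORTED.items():
        print(f"{name}: F1 = {f1(p, r):.3f}")
```

Because F1 is a harmonic mean, a model with high precision but very low recall, such as the LSTM-RNN figures quoted above, receives a low combined score, which is one way to read the gap between the two RNN variants.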