Effect of prompt strategy on the results of Code Generation by LLMs

dc.contributor.author: Wang, Yiyi
dc.contributor.department: University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science
dc.contributor.department: Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori
dc.date.accessioned: 2025-06-19T19:14:14Z
dc.date.available: 2025-06-19T19:14:14Z
dc.date.issued: 2025-06-19
dc.description.abstract: Large Language Models (LLMs) have made significant strides in automated code generation. For example, GitHub Copilot, based on the Codex model, was the first tool to generate complete functions directly from natural language descriptions. However, output quality remains highly dependent on prompt design. This study systematically investigates how different prompt strategies affect the code generated by LLMs and explores optimization strategies for prompt engineering. We conducted experiments using Google Gemini on a single task with four prompt strategies: zero-shot, few-shot with examples, Chain-of-Thought (CoT), and persona-enhanced prompts. Our findings reveal that progressively enriching the prompt from zero-shot to few-shot, and then integrating CoT and persona elements, can significantly improve the syntactic correctness of the generated code. Additionally, we use the MBPP code generation benchmark to evaluate the Gemini and DeepSeek-R1 models with the pass@3 metric, yielding overall pass@3 scores of approximately 70.6% and 79.4%, respectively. Moreover, we compare the accuracy of DeepSeek-R1 with existing work using other LLMs such as ChatGPT: DeepSeek-R1 reaches 86.8% accuracy, close to the 87.5% of ChatGPT Plus. We therefore conclude that DeepSeek-R1 is among the leading existing LLMs in code generation ability. In conclusion, our results show improvements in the syntactic correctness of the model generations and underscore the critical role of prompt strategy and structure in enhancing LLM code generation performance, providing a solid theoretical and experimental foundation for future research on more complex programming tasks, multi-model comparisons, and large-scale evaluations.
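The abstract reports pass@3 scores on the MBPP benchmark. For context, here is a minimal sketch of the standard unbiased pass@k estimator (introduced in the Codex evaluation work by Chen et al., 2021) that such evaluations typically use; the function name and the sample values are illustrative assumptions, not taken from the thesis itself:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of k
    samples drawn without replacement from n generations is correct,
    given that c of the n generations passed the unit tests."""
    if n - c < k:
        # Fewer incorrect generations than draws: a correct one is guaranteed.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Illustrative: 3 generations per task, 1 correct -> pass@3 = 1.0,
# since drawing all 3 samples necessarily includes the correct one.
print(pass_at_k(3, 1, 3))
```

With n = k (as in a pass@3 evaluation that samples exactly three generations per task), the metric reduces to "at least one of the three generations passed", which is then averaged over all benchmark tasks.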
dc.identifier.uri: https://hdl.handle.net/2077/88122
dc.language.iso: eng
dc.setspec.uppsok: HumanitiesTheology
dc.subject: Prompt Engineering, Large Language Model, Code Generation
dc.title: Effect of prompt strategy on the results of Code Generation by LLMs
dc.title.alternative: Effect of prompt strategy on the results of Code Generation by LLMs
dc.type: Text
dc.type.degree: Student essay
dc.type.uppsok: H2

Files

Original bundle

Name: MLT_Master_Thesis_Yiyi_Wang-finalversion[14].pdf
Size: 827.86 KB
Format: Adobe Portable Document Format
Description: Master thesis

License bundle

Name: license.txt
Size: 4.68 KB
Format: Item-specific license agreed upon to submission