Effect of prompt strategy on the results of Code Generation by LLMs
Date
2025-06-19
Abstract
Large Language Models (LLMs) have made significant strides in automated code generation.
For example, GitHub Copilot, built on the Codex model, was among the first tools to generate
complete functions directly from natural language descriptions. However, their output
quality remains highly dependent on prompt design. This study systematically investigates
how different prompt strategies impact generated code from LLMs and explores
optimization strategies for prompt engineering. We conducted experiments using Google
Gemini with a single task employing four prompt strategies: zero-shot, few-shot with
examples, Chain-of-Thought (CoT), and Persona-enhanced prompts. Our findings reveal
that progressively enriching the prompt from zero-shot to few-shot, then integrating
CoT and Persona can significantly improve the syntactic correctness of the generated
code. Additionally, we utilize a code generation benchmark (MBPP) to evaluate the
Gemini and DeepSeek-R1 models using the pass@3 metric; this experiment yielded overall
pass@3 scores of approximately 70.6% and 79.4%, respectively. Moreover, we compare
the accuracy of DeepSeek-R1 against existing work on other LLMs such as ChatGPT:
DeepSeek-R1 reaches 86.8% accuracy, close to the 87.5% of ChatGPT Plus. We therefore
conclude that DeepSeek-R1 is among the leading LLMs for code generation. Overall, our
results show improvements in the syntactic correctness of model generations. These
results underscore the critical role of prompt strategy and structure in enhancing LLMs'
code generation performance, providing a solid theoretical and experimental foundation
for future research on more complex programming tasks, multi-model comparisons, and
large-scale evaluations.
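The pass@3 figures above can be reproduced with the standard unbiased pass@k estimator (Chen et al., 2021). The sketch below illustrates the computation in Python; the per-task counts are purely hypothetical and do not come from this study:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of
    k samples drawn from n generations (c of which pass) is correct."""
    if n - c < k:
        return 1.0  # too few failing samples to draw k failures
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical (n, c) pairs per benchmark task, for illustration only:
# n = generations sampled per task, c = generations that pass the tests.
results = [(3, 3), (3, 1), (3, 0), (3, 2)]
score = sum(pass_at_k(n, c, k=3) for n, c in results) / len(results)
print(score)  # → 0.75
```

The benchmark-level score is the mean of the per-task pass@k values; with n = k = 3 the estimator reduces to "at least one of the three samples passed."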
Keywords
Prompt Engineering, Large Language Model, Code Generation