Improving the Performance of Machine Learning-based Methods for Continuous Integration by Handling Noise

dc.citation.doi: ITF
dc.contributor.author: Al-Sabbagh, Khaled
dc.date.accessioned: 2023-08-22T06:09:28Z
dc.date.available: 2023-08-22T06:09:28Z
dc.date.issued: 2023-08-22
dc.description.abstract: Background: Modern software development companies are increasingly implementing continuous integration (CI) practices to meet market demands for delivering high-quality features. The availability of data from CI systems presents an opportunity for these companies to leverage machine learning to create methods for optimizing the CI process. Problem: The predictive performance of these methods can be hindered by inaccurate and irrelevant information – noise. Objective: The goal of this thesis is to improve the effectiveness of machine learning-based methods for CI by handling noise in data extracted from source code. Methods: This thesis employs design science research and controlled experiments to study the impact of noise-handling techniques in the context of CI. It involves developing ML-based methods for optimizing regression testing (MeBoTS and HiTTs), creating a taxonomy to reduce class noise, and implementing a class noise-handling technique (DB). Controlled experiments are carried out to examine the impact of class noise-handling on MeBoTS’ performance for CI. Results: The findings show that handling class noise using the DB technique improves the performance of MeBoTS in test case selection and code change request prediction. After applying DB, the F1-score increases from 25% to 84% in test case selection, and recall improves from 15% to 25% in code change request prediction. However, handling attribute noise through a removal-based technique does not affect MeBoTS’ performance, as the F1-score remains at 66%. Code changes involving memory management and complexity should be tested with performance, load, soak, stress, volume, and capacity tests. Additionally, using the “majority filter” algorithm improves the Matthews correlation coefficient (MCC) from 0.13 to 0.58 in build outcome prediction and from -0.03 to 0.57 in code change request prediction.
Conclusions: This thesis highlights the effectiveness of applying different class noise-handling techniques to improve test case selection, build outcome prediction, and code change request prediction. Using small code commits to train MeBoTS proves beneficial in filtering out test cases that do not reveal faults. Additionally, the taxonomy of dependencies offers an efficient and effective way to perform regression testing. Notably, handling attribute noise does not improve the prediction of test execution outcomes. [en]
dc.gup.defencedate: 2023-09-18
dc.gup.defenceplace: Lindholmen Science Park, Room Tesla, Monday, September 18th 2023, at 13:00 [en]
dc.gup.department: Department of Computer Science and Engineering ; Institutionen för data- och informationstekniken
dc.gup.mail: khaled.al-sabbagh@gu.se [en]
dc.gup.origin: University of Gothenburg, IT Faculty [en]
dc.identifier.isbn: 978-91-8069-362-2
dc.identifier.uri: https://hdl.handle.net/2077/77272
dc.language.iso: eng [en]
dc.relation.haspart: Al Sabbagh, K., Staron, M., Hebig, R., & Meding, W. (2019). Predicting Test Case Verdicts Using TextualAnalysis of Commited Code Churns. In CEUR Workshop Proceedings (Vol. 2476, pp. 138-153). [en]
dc.relation.haspart: Al-Sabbagh, K. W., Hebig, R., & Staron, M. (2020, November). The effect of class noise on continuous test case selection: A controlled experiment on industrial data. In International Conference on Product-Focused Software Process Improvement (pp. 287-303). Cham: Springer International Publishing. [en]
dc.relation.haspart: Al-Sabbagh, K. W., Staron, M., & Hebig, R. (2022). Improving test case selection by handling class and attribute noise. Journal of Systems and Software, 183, 111093. [en]
dc.relation.haspart: Al-Sabbagh, K., Staron, M., Hebig, R., & Gomes, F. (2021, August). A classification of code changes and test types dependencies for improving machine learning based test selection. In Proceedings of the 17th International Conference on Predictive Models and Data Analytics in Software Engineering (pp. 40-49). [en]
dc.relation.haspart: Al-Sabbagh, K. W., Staron, M., & Hebig, R. (2022, November). Improving Software Regression Testing Using a Machine Learning-Based Method for Test Type Selection. In International Conference on Product-Focused Software Process Improvement (pp. 480-496). Cham: Springer International Publishing. [en]
dc.relation.haspart: Al-Sabbagh, K., Staron, M., & Hebig, R. (2022, November). Predicting build outcomes in continuous integration using textual analysis of source code commits. In Proceedings of the 18th International Conference on Predictive Models and Data Analytics in Software Engineering (pp. 42-51). [en]
dc.relation.haspart: Al-Sabbagh, K., Staron, M., & Hebig, R. (2023, June). The Impact of Class Noise-handling on the Effectiveness of Machine Learning-based Methods for Build Outcome and Code Change Request Predictions. Submitted to ACM Transactions on Software Engineering and Methodology. [en]
dc.subject: Continuous Integration [en]
dc.subject: Noise in software programs [en]
dc.subject: Noise-handling [en]
dc.subject: Software regression testing [en]
dc.subject: Code change requests [en]
dc.subject: Build prediction [en]
dc.title: Improving the Performance of Machine Learning-based Methods for Continuous Integration by Handling Noise [en]
dc.type: Text
dc.type.degree: Doctor of Philosophy [en]
dc.type.svep: Doctoral thesis

Files

Original bundle

Name: Khaled_PhD_thesis.pdf
Size: 14.6 MB
Format: Adobe Portable Document Format

Name: Spikblad_abstract.pdf
Size: 126.63 KB
Format: Adobe Portable Document Format
Description: Abstract

Name: PhD_thesis_without_papers.pdf
Size: 1.11 MB
Format: Adobe Portable Document Format

Name: cover-phd-khaled-al-sabbagh.pdf
Size: 2.02 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 4.68 KB
Format: Item-specific license agreed upon to submission