dc.contributor.author | Al-Sabbagh, Khaled | |
dc.date.accessioned | 2023-08-22T06:09:28Z | |
dc.date.available | 2023-08-22T06:09:28Z | |
dc.date.issued | 2023-08-22 | |
dc.identifier.isbn | 978-91-8069-362-2 | |
dc.identifier.uri | https://hdl.handle.net/2077/77272 | |
dc.description.abstract | Background: Modern software development companies are increasingly implementing continuous integration (CI) practices to meet market demands for delivering high-quality features. The availability of data from CI systems presents an opportunity for these companies to leverage machine learning to create methods for optimizing the CI process. Problem: The predictive performance of these methods can be hindered by inaccurate and irrelevant information – noise. Objective: The goal of this thesis is to improve the effectiveness of machine learning-based methods for CI by handling noise in data extracted from source code. Methods: This thesis employs design science research and controlled experiments to study the impact of noise-handling techniques in the context of CI. It involves developing ML-based methods for optimizing regression testing (MeBoTS and HiTTs), creating a taxonomy to reduce class noise, and implementing a class noise-handling technique (DB). Controlled experiments are carried out to examine the impact of class noise-handling on MeBoTS’ performance for CI. Results: The thesis findings show that handling class noise using the DB technique improves the performance of MeBoTS in test case selection and code change request predictions. After applying DB, the F1-score increases from 25% to 84% in test case selection, and the Recall improves from 15% to 25% in code change request prediction. However, handling attribute noise through a removal-based technique does not impact MeBoTS’ performance, as the F1-score remains at 66%. Memory management and complexity code changes should be tested with performance, load, soak, stress, volume, and capacity tests. Additionally, using the “majority filter” algorithm improves MCC from 0.13 to 0.58 in build outcome prediction and from -0.03 to 0.57 in code change request prediction. 
Conclusions: In conclusion, this thesis highlights the effectiveness of applying different class noise-handling techniques to improve test case selection, build outcome, and code change request predictions. Utilizing small code commits for training MeBoTS proves beneficial in filtering out test cases that do not reveal faults. Additionally, the taxonomy of dependencies offers an efficient and effective way of performing regression testing. Notably, handling attribute noise does not improve the predictions of test execution outcomes. | en_US |
dc.language.iso | eng | en_US |
dc.relation.haspart | Al Sabbagh, K., Staron, M., Hebig, R., & Meding, W. (2019). Predicting Test Case Verdicts Using Textual Analysis of Committed Code Churns. In CEUR Workshop Proceedings (Vol. 2476, pp. 138-153). | en_US |
dc.relation.haspart | Al-Sabbagh, K. W., Hebig, R., & Staron, M. (2020, November). The effect of class noise on continuous test case selection: A controlled experiment on industrial data. In International Conference on Product-Focused Software Process Improvement (pp. 287-303). Cham: Springer International Publishing. | en_US |
dc.relation.haspart | Al-Sabbagh, K. W., Staron, M., & Hebig, R. (2022). Improving test case selection by handling class and attribute noise. Journal of Systems and Software, 183, 111093. | en_US |
dc.relation.haspart | Al-Sabbagh, K., Staron, M., Hebig, R., & Gomes, F. (2021, August). A classification of code changes and test types dependencies for improving machine learning based test selection. In Proceedings of the 17th International Conference on Predictive Models and Data Analytics in Software Engineering (pp. 40-49). | en_US |
dc.relation.haspart | Al-Sabbagh, K. W., Staron, M., & Hebig, R. (2022, November). Improving Software Regression Testing Using a Machine Learning-Based Method for Test Type Selection. In International Conference on Product-Focused Software Process Improvement (pp. 480-496). Cham: Springer International Publishing. | en_US |
dc.relation.haspart | Al-Sabbagh, K., Staron, M., & Hebig, R. (2022, November). Predicting build outcomes in continuous integration using textual analysis of source code commits. In Proceedings of the 18th International Conference on Predictive Models and Data Analytics in Software Engineering (pp. 42-51). | en_US |
dc.relation.haspart | Al-Sabbagh, K., Staron, M., & Hebig, R. (2023, June). The Impact of Class Noise-handling on the Effectiveness of Machine Learning-based Methods for Build Outcome and Code Change Request Predictions. Submitted to ACM Transactions on Software Engineering and Methodology. | en_US |
dc.subject | Continuous Integration | en_US |
dc.subject | Noise in software programs | en_US |
dc.subject | Noise-handling | en_US |
dc.subject | Software regression testing | en_US |
dc.subject | Code change requests | en_US |
dc.subject | Build prediction | en_US |
dc.title | Improving the Performance of Machine Learning-based Methods for Continuous Integration by Handling Noise | en_US |
dc.type | Text | |
dc.type.svep | Doctoral thesis | |
dc.gup.mail | khaled.al-sabbagh@gu.se | en_US |
dc.type.degree | Doctor of Philosophy | en_US |
dc.gup.origin | University of Gothenburg, IT Faculty | en_US |
dc.gup.department | Department of Computer Science and Engineering ; Institutionen för data- och informationsteknik | en_US |
dc.citation.doi | ITF | |
dc.gup.defenceplace | Lindholmen Science Park, Room Tesla, Monday September 18th 2023, at 13:00 | en_US |
dc.gup.defencedate | 2023-09-18 | |