Q-Learning-Based Feature Selection for Software Defect Prediction

Authors

  • Mohammed Suham Ibrahim, Ministry of Higher Education and Scientific Research - Ministry Center, Iraq
  • Yasmin Makki Mohialden, Computer Science Department, College of Science, Mustansiriyah University, Baghdad, Iraq
  • Doaa Mohsin Abd Ali Afraji, Department of Computer Engineering, Universitat Politècnica de València, Valencia, Spain

DOI:

https://doi.org/10.58564/IJSER.4.3.2025.320

Keywords:

software defect prediction, Q-learning, reinforcement learning, feature selection, Random Forest, software quality assurance

Abstract

Software defect prediction (SDP) is essential for improving software reliability and reducing maintenance costs. In dynamic development environments, traditional static feature selection methods often fail to adapt to evolving data patterns. This study introduces a Q-learning–based adaptive feature selection approach, integrated with a Random Forest classifier, to enhance SDP performance. The method applies a reward-driven selection process during training, dynamically identifying the most relevant features.
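As a rough illustration of the idea described above, the sketch below pairs a tabular Q-learning loop with a Random Forest reward signal. The synthetic stand-in data, state encoding, hyperparameters, and reward definition are assumptions for illustration only and do not reproduce the authors' exact implementation.

    # Minimal sketch: tabular Q-learning over feature subsets, rewarded by the
    # cross-validated F1 of a Random Forest. Data, episode count, and learning
    # rates are illustrative assumptions, not the paper's exact configuration.
    import random
    from collections import defaultdict
    from functools import lru_cache

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    # Stand-in for the 136-instance, 6-feature bug-report dataset (~71% positive).
    X, y = make_classification(n_samples=136, n_features=6, n_informative=4,
                               weights=[0.29, 0.71], random_state=0)
    n_features = X.shape[1]
    alpha, gamma, epsilon, episodes = 0.1, 0.9, 0.2, 20

    @lru_cache(maxsize=None)
    def score(subset):
        """Reward signal: mean cross-validated F1 of a Random Forest on the subset."""
        cols = [i for i, keep in enumerate(subset) if keep]
        if not cols:
            return 0.0
        rf = RandomForestClassifier(n_estimators=100, random_state=0)
        return cross_val_score(rf, X[:, cols], y, cv=5, scoring="f1").mean()

    Q = defaultdict(float)  # Q[(state, action)] -> estimated value
    for _ in range(episodes):
        state = (0,) * n_features            # start with no features selected
        for _ in range(n_features):
            # epsilon-greedy: explore a random toggle or exploit the best known one
            if random.random() < epsilon:
                action = random.randrange(n_features)
            else:
                action = max(range(n_features), key=lambda a: Q[(state, a)])
            nxt = list(state)
            nxt[action] ^= 1                 # toggle one feature in or out
            nxt = tuple(nxt)
            r = score(nxt) - score(state)    # reward = change in RF performance
            best_next = max(Q[(nxt, a)] for a in range(n_features))
            Q[(state, action)] += alpha * (r + gamma * best_next - Q[(state, action)])
            state = nxt

    best_subset = max({s for s, _ in Q}, key=score)
    print("selected features:", [i for i, keep in enumerate(best_subset) if keep])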

Experiments were conducted on a real-world bug report dataset from Kaggle (136 instances, 6 features, ≈71% positive defect cases). Model performance was evaluated using accuracy, precision, recall, F1-score, and ROC–AUC. The proposed configuration achieved an accuracy of 10.71% and exhibited very low recall for the minority class, highlighting the strong impact of class imbalance. Comparative tests against conventional feature selection methods (e.g., ReliefF, mutual information) and alternative classifiers (e.g., SVM, Gradient Boosting) showed that the current approach underperforms state-of-the-art SDP models.
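The metrics named above can be computed as in the snippet below, which continues from the earlier sketch (it reuses X, y, best_subset, and the imported RandomForestClassifier); the 70/30 stratified split and classifier settings are assumptions, not the reported evaluation protocol.

    # Evaluation sketch: accuracy, precision, recall, F1, and ROC-AUC on a held-out split.
    from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                                 f1_score, roc_auc_score)
    from sklearn.model_selection import train_test_split

    cols = [i for i, keep in enumerate(best_subset) if keep]
    X_tr, X_te, y_tr, y_te = train_test_split(X[:, cols], y, test_size=0.3,
                                              stratify=y, random_state=0)
    rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    pred = rf.predict(X_te)
    print("accuracy :", accuracy_score(y_te, pred))
    print("precision:", precision_score(y_te, pred))
    print("recall   :", recall_score(y_te, pred))
    print("F1       :", f1_score(y_te, pred))
    print("ROC-AUC  :", roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1]))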

Despite this, the study demonstrates a reproducible framework for integrating reinforcement learning into feature selection for SDP and identifies key improvement areas, particularly in reward function design, imbalance handling, and dataset expansion. These findings provide a foundation for developing more adaptive, imbalance-resilient defect prediction systems in future research.
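One concrete direction for the imbalance-handling improvement noted above would be to weight the classes inside the Random Forest and to score the reward with macro-F1 rather than plain F1 or accuracy. The following fragment is a hedged suggestion reusing the names from the earlier sketches, not the method evaluated in the paper.

    # Possible imbalance-aware reward (an assumption, not the paper's design):
    # class-weighted Random Forest scored with macro-F1, reusing X and y above.
    def balanced_score(subset):
        cols = [i for i, keep in enumerate(subset) if keep]
        if not cols:
            return 0.0
        rf = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                                    random_state=0)
        return cross_val_score(rf, X[:, cols], y, cv=5, scoring="f1_macro").mean()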

Published

2025-09-01

How to Cite

Mohammed Suham Ibrahim, Yasmin Makki Mohialden, & Doaa Mohsin Abd Ali Afraji. (2025). Q-Learning-Based Feature Selection for Software Defect Prediction. Al-Iraqia Journal for Scientific Engineering Research, 4(3), 12–20. https://doi.org/10.58564/IJSER.4.3.2025.320

Section

Articles
