Optimizing Phishing Threat Detection: A Comprehensive Study of Advanced Bagging Techniques and Optimization Algorithms in Machine Learning
DOI:
https://doi.org/10.58564/IJSER.3.1.2024.146
Keywords:
Bagging Techniques, Ensemble learning, Particle swarm optimization algorithm, Phishing, Random Forests
Abstract
Bagging is a prominent ensemble learning technique in contemporary machine learning. In bagging, multiple instances of a base model are trained on different subsets of the training data drawn by bootstrapping. The resulting models are then aggregated, typically by voting in a classification problem, to improve performance and predictive power. Variants of bagging include Random Forests, which introduce additional randomness by selecting a random subset of features at each split. Related ensemble methods include boosting algorithms, which iteratively shift the model's focus toward misclassified instances. These strategies have proved effective at improving the generalization and adaptability of machine learning models. Many studies confirm that ensemble learning models can detect phishing attacks; however, the techniques that give these models their improved detection capabilities have not been highlighted. This study achieved an accuracy of up to 97% using the random forest model and the particle swarm optimization (PSO) algorithm. It seeks to advance the field of cybersecurity by providing a robust and interpretable machine learning-based classifier that can be integrated into a framework for detecting phishing attacks by distinguishing legitimate URLs from phishing URLs.
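
The bagging procedure described in the abstract (bootstrap resampling, one base model per resample, majority voting) can be illustrated with a minimal sketch. This is not the study's implementation: the base learner (a one-feature decision stump), the two URL features (URL length and number of dots), and the ten toy samples are all hypothetical and chosen only to keep the example self-contained.

```python
import random
from collections import Counter


def bootstrap_sample(X, y, rng):
    """Draw len(X) rows with replacement (one bootstrap resample)."""
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_stump(X, y):
    """Fit a decision stump: the single feature/threshold rule with fewest errors."""
    best = None  # (errors, feature index, threshold, label predicted above threshold)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for hi_label in (0, 1):
                errors = sum(
                    (hi_label if row[f] > t else 1 - hi_label) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, hi_label)
    return best[1:]


def predict_stump(stump, row):
    f, t, hi_label = stump
    return hi_label if row[f] > t else 1 - hi_label


def fit_bagging(X, y, n_models=25, seed=0):
    """Train one stump per bootstrap resample of the training data."""
    rng = random.Random(seed)
    return [fit_stump(*bootstrap_sample(X, y, rng)) for _ in range(n_models)]


def predict_bagging(models, row):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (URL length, number of dots); 1 = phishing, 0 = legitimate.
X = [(20, 1), (25, 1), (30, 2), (22, 1), (28, 2),
     (75, 4), (90, 5), (80, 6), (85, 4), (95, 5)]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

models = fit_bagging(X, y)
long_url_label = predict_bagging(models, (100, 6))
short_url_label = predict_bagging(models, (18, 1))
```

A Random Forest follows the same pattern with decision trees as base models and an extra step that restricts each split to a random subset of the features.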

License
Copyright (c) 2024 Samer Kadhim Jawad, Satea H. Alnajjar

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.