Supervised Machine Learning a Brief Survey of Approaches

Authors

  • Esraa Najjar, Computer Science Department, General Directorate of Education in Najaf Governorate, Al-Najaf, Iraq
  • Aqeel Majeed Breesam, Institute of Medical Technology / Baghdad, Middle Technical University, Baghdad, Iraq

DOI:

https://doi.org/10.58564/IJSER.2.4.2023.121

Keywords:

Machine learning, Supervised learning, Accuracy, Classification Algorithms, Classifiers

Abstract

Machine learning has become popular across many disciplines. It enables machines to learn automatically from data and make predictions without explicit programming or human intervention. Supervised machine learning, in which an algorithm is trained on input data labeled with the desired output, is one of the two major branches of machine learning and has seen a great deal of successful research. The model is trained until it can identify the underlying correlations and patterns between the inputs and the output labels, enabling it to produce accurate labels when confronted with unseen data. Supervised learning is well suited to classification and regression problems: regression arises when the outputs are continuous, whereas classification arises when the outputs are categorical. The goal of supervised learning is to build an accurate model of the distribution of class labels in terms of predictor features. This review examines the most popular supervised machine learning methods, including Naive Bayes, Decision Trees, Support Vector Machines, Logistic Regression, K-Nearest Neighbors, and Deep Learning. We highlight each algorithm's advantages and disadvantages and, finally, discuss the challenges involved in developing supervised machine learning algorithms.
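The workflow described above — learning a mapping from labeled inputs to categorical outputs — can be sketched with K-Nearest Neighbors, one of the classifiers surveyed. This is a minimal illustrative sketch using only the Python standard library; the toy two-cluster dataset and the helper name `knn_predict` are assumptions for the example, not taken from the paper.

```python
# Supervised classification sketch: k-nearest neighbors (k-NN).
# A query point is labeled by majority vote among the k closest
# labeled training examples in feature space.
from collections import Counter
from math import dist

def knn_predict(train_X, train_y, x, k=3):
    """Return the majority label among the k nearest neighbors of x."""
    neighbors = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy labeled training set: two clusters of 2-D feature vectors.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_X, train_y, (1.1, 0.9)))  # query near cluster A
print(knn_predict(train_X, train_y, (5.1, 5.0)))  # query near cluster B
```

Because the outputs here are categorical labels ("A"/"B"), this is a classification problem in the sense defined above; predicting a continuous value from the neighbors' averages would instead be regression.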

Published

2023-12-01

How to Cite

Najjar, E., & Majeed Breesam, A. (2023). Supervised Machine Learning a Brief Survey of Approaches. Al-Iraqia Journal for Scientific Engineering Research, 2(4), 71–82. https://doi.org/10.58564/IJSER.2.4.2023.121

Issue

Section

Articles