Prioritise Five Tafseer Translators Using Clustering Technique for Surah Al-Baqarah

Authors

  • Mohammed A. Ahmed, Network Engineering Department, College of Engineering, Al-Iraqia University, 10053, Baghdad, Iraq
  • Shahad Mahgoob Nafl, College of Medicine, University of Baghdad, Iraq
  • Hanif Baharin, Institute of Visual Informatics, Universiti Kebangsaan Malaysia, Malaysia
  • Puteri Nor Ellyza Nohuddin, Institute of Visual Informatics, Universiti Kebangsaan Malaysia, Malaysia

DOI:

https://doi.org/10.58564/IJSER.3.1.2024.147

Keywords:

Dictionary, Image compression, Lossless Compression, LZW algorithm, photographs

Abstract

English Tafseer translations of the Holy Quran are essential for non-Arabic-speaking Muslims to comprehend and interpret Allah’s words. This research takes five different English translators (TR1-TR5) of the chapter (Surah) Al-Baqarah and uses text clustering to rank (prioritise) these five input datasets. Because the datasets have no ground truth (they are not standard datasets), unsupervised learning (clustering) is required rather than supervised techniques such as classification. The assessment covers both partitioning-based and hierarchical clustering algorithms: k-means is used for the partitioning-based approach, and the Agglomerative algorithm for the hierarchical one. The research aim is achieved through a three-stage procedure. The first stage cleans the text (tokenisation, POS tagging, normalisation, stemming, and stop-word removal) and then applies feature selection with the VSM (Vector Space Model) and TF-IDF (Term Frequency-Inverse Document Frequency) to build the five corpora. The second stage performs the clustering. The third stage validates the clustering using the SC (Silhouette Coefficient) and DBI (Davies-Bouldin Index) metrics plus the execution time (ET). Principal Component Analysis (PCA) is used to visualise the clustering outputs. For the k-means algorithm, the results show that ET, SC, and DBI give the same ranking of the five translators only at ranks (1) and (3). In contrast, for the Agglomerative algorithm, each of ET, SC, and DBI yields a distinct ranking of the same five translators. To obtain an optimal unified rank, future work should therefore use a modern approach such as MCDM (Multi-Criteria Decision-Making).
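The two internal validation metrics named in the abstract can be illustrated with a short sketch. This is not the authors' implementation: it assumes Euclidean distance and at least two clusters, it uses six made-up 2-D points in place of TF-IDF vectors, and the function names (`group_by_label`, `silhouette_coefficient`, `davies_bouldin_index`) are hypothetical.

```python
from math import dist
from statistics import fmean


def group_by_label(points, labels):
    """Collect the points of each cluster under its label."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def silhouette_coefficient(points, labels):
    """Mean silhouette width over all points; values near 1 are better."""
    clusters = group_by_label(points, labels)
    widths = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for a singleton cluster
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: lowest mean distance to the members of any other cluster
        b = min(
            fmean([dist(point, q) for q in members])
            for other, members in clusters.items()
            if other != label
        )
        widths.append((b - a) / max(a, b))
    return fmean(widths)


def davies_bouldin_index(points, labels):
    """Mean worst-case cluster similarity; values near 0 are better."""
    clusters = group_by_label(points, labels)
    centroids = {
        k: tuple(fmean(axis) for axis in zip(*members))
        for k, members in clusters.items()
    }
    # scatter: mean distance from a cluster's members to its centroid
    scatter = {
        k: fmean([dist(p, centroids[k]) for p in members])
        for k, members in clusters.items()
    }
    worst = [
        max(
            (scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return fmean(worst)


# Made-up, well-separated toy data (not taken from the paper).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = [0, 0, 0, 1, 1, 1]

print("SC :", round(silhouette_coefficient(points, labels), 3))
print("DBI:", round(davies_bouldin_index(points, labels), 3))
```

A higher SC and a lower DBI both indicate tighter, better-separated clusters, so a per-criterion ranking of the kind described above would sort SC in descending order and DBI and ET in ascending order.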

Published

2024-03-01

How to Cite

Ahmed, M. A., Mahgoob Nafl, S., Baharin, H., & Nohuddin, P. N. E. (2024). Prioritise Five Tafseer Translators Using Clustering Technique for Surah Al-Baqarah. Al-Iraqia Journal for Scientific Engineering Research, 3(1), 75–86. https://doi.org/10.58564/IJSER.3.1.2024.147

Section

Articles