Full optimisation of imbalance techniques for Qur'anic data using genetic algorithm

Arkok, Bassam S.

Files

t11100485681Bassamsarkok_24.pdf (3.49 MB)

t11100485681Bassamsarkok_SEC.pdf (32.37 MB)

Date

2022

Authors

Arkok, Bassam S.

Publisher

Kuala Lumpur : Kulliyyah of Information and Communication Technology, International Islamic University Malaysia, 2022

Subject LCSH

Genetic programming (Computer science)

Subject ICSI

Qur'an -- Tafsir -- Classification -- Data processing

Call Number

t QA 76.623 A721F 2022

Abstract

The holy Qur'an is the first fundamental resource of legislation and law in the Muslim community. Islamic scholars have worked on the Qur'anic text to offer Qur'anic knowledge quickly and systematically, for example through the digital Qur'an and Qur'anic computing. This work uses text mining techniques to automate the processing of the Qur'anic text. The automatic classification of Qur'anic verses is a focal point of much of this research. The purpose of Qur'anic classification is to assign the most appropriate predefined topics to a given Qur'anic verse according to its content. However, some properties of the Qur'anic topics, such as imbalanced classes, can weaken classification performance when these classes are classified using traditional classification. Imbalanced classes occur when the classes in a dataset do not contain equal numbers of samples. In the dictionary used in this research, many Qur'anic topics differ in their number of verses, so the problem of imbalanced classes arises when these topics are classified together using traditional classification. The main problem that this study tries to solve is obtaining equal accuracies for all classes of Qur'anic topics during the classification process. Therefore, this study explores a new approach to categorising the Qur'anic topics, called optimisation learning, which is based on imbalanced learning and a genetic algorithm. The technique of imbalanced classification was applied to solve the problem of imbalanced classes in the Qur'anic topics. The genetic algorithm was used for optimisation before classification was carried out. This optimisation was performed on the samples of Qur'anic text to adjust the convergence and spacing between the samples, both within the same class and among the classes. This adjustment improves the performance of Qur'anic topic classification.
Three cases of optimisation were tested in this study using the proposed techniques: partial optimisation with oversampling, full optimisation without oversampling, and full optimisation with oversampling. These cases were implemented using three new oversampling methods: Genetic Oversampling (GOS) and two Harmonised Oversampling methods based on Genetic Algorithm (HOGA-1 and HOGA-2). The third case of optimisation achieved the best results. All proposed methods significantly outperformed the well-known methods that are widely used to classify imbalanced datasets, namely Synthetic Minority Oversampling Technique (SMOTE), Random Undersampling (RUS), and Random Oversampling (ROS). According to the experimental results, the GOS method outperformed the SMOTE and ROS methods, which were the second-best methods among the previous methods, in the Specificity metric by 12% using 10-fold cross-validation. The HOGA-1 method outperformed the closest method in the Matthews Correlation Coefficient (MCC) metric by 7% using training-testing validation. The HOGA-2 method, which was the best among all proposed oversampling methods, outperformed the closest methods in the Sensitivity/Recall, Balanced Accuracy, and Geometric Mean (G-Mean) metrics by 10% using 10-fold cross-validation.
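
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions for a binary (one topic versus the rest) confusion matrix. The function name `imbalance_metrics` and the example counts are hypothetical and are not taken from the thesis, and the sketch assumes both classes contain at least one sample.

```python
import math


def imbalance_metrics(tp, fn, fp, tn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fn count minority (positive) samples, fp/tn count majority
    (negative) samples. Assumes both classes are non-empty.
    """
    sensitivity = tp / (tp + fn)          # recall on the positive class
    specificity = tn / (tn + fp)          # recall on the negative class
    balanced_accuracy = (sensitivity + specificity) / 2
    g_mean = math.sqrt(sensitivity * specificity)
    accuracy = (tp + tn) / (tp + fn + fp + tn)

    # Matthews Correlation Coefficient; defined as 0 when the
    # denominator vanishes (a common convention).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "g_mean": g_mean,
        "mcc": mcc,
    }


# Hypothetical counts: 20 verses in the minority topic, 180 in the rest.
scores = imbalance_metrics(tp=10, fn=10, fp=5, tn=175)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, plain accuracy is high even though only half of the minority-class verses are recognised, which is why metrics such as Sensitivity, Balanced Accuracy, G-Mean, and MCC are preferred when classes are imbalanced.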

URI

https://studentrepo.iium.edu.my/handle/123456789/32498

Collections

KICT - Doctoral Theses
