Document Type

Thesis

Degree Name

Master of Applied Computing

Department

Physics and Computer Science

Program Name/Specialization

Applied Computing

Faculty/School

Faculty of Science

First Advisor

Dr. Abdul-Rahman Mawlood-Yunis

Advisor Role

Main

Abstract

This thesis addresses the feature selection (FS) problem, a primary stage in data mining. FS is an important pre-processing step: by removing unnecessary and irrelevant features from the original dataset, it reduces computation cost, improves accuracy, and makes the stored data easier to understand. However, because of the size of the search space, FS is very challenging and has been classified as an NP-hard problem. Traditional methods are practical only for small problems, so metaheuristic algorithms (MAs) have become a powerful alternative for FS. The recently introduced Black Widow Optimization (BWO) algorithm has produced strong results on a range of difficult engineering design problems, but it has not yet been applied to FS. In this thesis, we propose a modified Binary Black Widow Optimization (BBWO) algorithm to solve FS problems. The FS evaluation method used in this study is the wrapper method, designed to balance two objectives: (i) minimizing the number of selected features and (ii) maintaining a high level of accuracy. To achieve this, we use the k-nearest-neighbor (KNN) machine learning algorithm in the learning stage to evaluate the accuracy of the solutions generated by BBWO. The proposed method is applied to twenty-eight public datasets provided by UCI, and the results are compared with recent FS algorithms. Our results show that BBWO performs as well as those algorithms, and in some cases better. However, the results also show that BBWO converges slowly because it works on a population of solutions and lacks local exploitation.
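As an illustration of the wrapper idea described above, the following is a minimal sketch of a fitness function that scores a binary feature mask by combining KNN error with the fraction of features kept. It is not the thesis's implementation: the weight `alpha`, the leave-one-out evaluation, the helper names `knn_predict` and `wrapper_fitness`, and the plain-Python KNN are all assumptions made for this example.

```python
def knn_predict(train_X, train_y, x, k=5):
    """Predict the label of x by majority vote among its k nearest neighbours."""
    # Squared Euclidean distance is enough for ranking neighbours.
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
    )
    votes = {}
    for _, label in ranked[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def wrapper_fitness(mask, X, y, k=5, alpha=0.99):
    """Lower is better: weighted sum of KNN error and fraction of features kept.

    mask  : list of 0/1 flags, one per feature (1 = feature selected)
    alpha : assumed trade-off weight between error and subset size
    """
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 1.0  # an empty subset cannot classify; give it the worst score
    Xs = [[row[i] for i in selected] for row in X]

    # Leave-one-out evaluation (an assumption; any validation split would do).
    errors = 0
    for j in range(len(Xs)):
        train_X = Xs[:j] + Xs[j + 1:]
        train_y = y[:j] + y[j + 1:]
        if knn_predict(train_X, train_y, Xs[j], k) != y[j]:
            errors += 1

    error_rate = errors / len(Xs)
    return alpha * error_rate + (1 - alpha) * len(selected) / len(mask)
```

A search algorithm such as BBWO would call a function of this kind once per candidate solution, so a mask that keeps only informative features receives a lower score than one that keeps noisy features.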
To improve exploitation and enhance BBWO's performance, we propose combining BBWO with a local search metaheuristic based on the hill-climbing algorithm (HCA). This improved method (IBBWO) is also tested on the twenty-eight UCI datasets, and the results are compared with the basic BBWO and with recent FS algorithms. The results show that IBBWO outperforms the basic BBWO in most cases and outperforms well-known FS algorithms in many cases.
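To make the local exploitation step concrete, here is a minimal sketch of a generic bit-flip hill climber over a binary feature mask. It is not the HCA variant used in the thesis: the single-bit neighbourhood, the accept-on-improvement rule, the function name `hill_climb`, and the `max_sweeps` parameter are assumptions for illustration, and `fitness` is any function where lower scores are better.

```python
def hill_climb(mask, fitness, max_sweeps=100):
    """Refine a binary mask by flipping one bit at a time.

    A flip is kept only if it strictly lowers the fitness score.
    The search stops when a full sweep over all bits yields no improvement.
    """
    best = list(mask)
    best_score = fitness(best)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(best)):
            neighbour = best.copy()
            neighbour[i] = 1 - neighbour[i]  # flip feature i in or out
            score = fitness(neighbour)
            if score < best_score:
                best, best_score = neighbour, score
                improved = True
        if not improved:
            break  # local optimum reached
    return best, best_score


# Toy usage with a hypothetical fitness: number of bits that differ from a target mask.
target = [1, 0, 0, 1]
toy_fitness = lambda m: sum(a != b for a, b in zip(m, target))
refined, score = hill_climb([0, 1, 1, 0], toy_fitness)
```

In a hybrid of the kind the abstract describes, a routine like this would typically be applied to promising solutions produced by the population-based search, trading extra fitness evaluations for faster convergence near good solutions.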

Convocation Year

2021

Convocation Season

Spring
