Master of Applied Computing
Physics and Computer Science
Faculty of Science
Feature selection is an integral part of machine learning and data mining: it removes redundant or irrelevant features and thereby improves the performance of predictive models. In the era of big data, as the dimensionality of datasets grows, traditional feature selection methods become less effective because of their computational complexity and their inability to cope with the curse of dimensionality. This thesis proposes a feature selection framework that combines machine learning with advanced optimization algorithms. The methodology uses a Support Vector Machine (SVM) together with a recent metaheuristic, the Black Widow Optimization (BWO) algorithm, to address feature selection (FS) problems. The SVM, known for its robustness and its ability to handle complex classification problems, is combined with the binary form of BWO. The study also examines the integration of a recently proposed K-Nearest Neighbor (KNN) algorithm with BWO, using a new set of classification metrics. The proposed methods were evaluated empirically in two sets of experiments. The first set compared the binary BWO with SVM (BBWO-SVM) against its counterpart, the binary BWO with KNN (BBWO-KNN). The second set compared the performance of BBWO-SVM and BBWO-KNN against six well-known metaheuristic algorithms. Both sets used the same metrics as the basis for comparison: the number of features selected, accuracy, recall, precision, and F1-score. The experiments used 28 public datasets of varying sizes from the UCI repository.
The experimental results show that BBWO-SVM outperformed the traditional algorithms and improved classification performance across a range of benchmark datasets. The evidence further supports the potential of BBWO-SVM as a versatile tool for diverse domains, including healthcare, finance, and cybersecurity.
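The wrapper idea behind BBWO-KNN can be illustrated with a minimal sketch: a binary mask marks which features are kept, and a classifier scores each mask. This is not the thesis's implementation. The fitness formula (weighted error rate plus fraction of features kept), the weight `alpha = 0.99`, the leave-one-out evaluation, the toy data, and all function names are assumptions chosen for illustration, and an exhaustive search over masks stands in for the BWO search itself.

```python
from collections import Counter
from itertools import product
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training rows."""
    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def apply_mask(X, mask):
    """Keep only the columns whose mask bit is 1."""
    return [[v for v, keep in zip(row, mask) if keep] for row in X]


def loo_accuracy(X, y, k=3):
    """Leave-one-out accuracy of KNN (assumed evaluation protocol)."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, X[i], k) == y[i]:
            correct += 1
    return correct / len(X)


def fitness(mask, X, y, alpha=0.99, k=3):
    """Lower is better: weighted error rate plus fraction of features kept.

    This weighting is a common wrapper formulation, assumed here; the
    thesis may define its objective differently.
    """
    if not any(mask):
        return 1.0  # an empty feature subset gets the worst score
    error = 1.0 - loo_accuracy(apply_mask(X, mask), y, k)
    return alpha * error + (1.0 - alpha) * sum(mask) / len(mask)


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 1.2], [1.1, 8.8], [0.9, 5.1]]
y = [0, 0, 0, 1, 1, 1]

# Exhaustive search replaces the metaheuristic on this tiny example.
best_mask = min(product([0, 1], repeat=2), key=lambda m: fitness(m, X, y))
print(best_mask, fitness(best_mask, X, y))
```

On real datasets the number of masks grows as 2^n in the number of features, which is why a metaheuristic such as binary BWO is used to search the mask space instead of enumerating it.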
Yu, Shizhao, "Applying Machine Learning and Optimization Algorithms to Perform Feature Selection" (2025). Theses and Dissertations (Comprehensive). 2620.
Available for download on Tuesday, June 10, 2025