Document Type

Thesis

Degree Name

Master of Applied Computing

Department

Physics and Computer Science

Program Name/Specialization

Applied Computing

Faculty/School

Faculty of Science

First Advisor

Dr. Emad Mohammed

Advisor Role

Supervision, Formal analysis, Investigation, Project administration, Validation, Writing – review and editing

Second Advisor

Dr. Sukhjit Singh Sehra

Advisor Role

Visualization, Validation, Writing – review and editing

Abstract

Effective and interpretable classification of medical images remains a critical challenge in computer-aided diagnosis, particularly in data-scarce, resource-constrained clinical settings where conventional deep learning models are impractical. This study addresses the principal barrier to Vision Transformer (ViT) adoption in medical imaging, namely large parameter counts and heavy data requirements, through a systematic two-phase methodology.

Phase 1 evaluates three spline-based Kolmogorov–Arnold Network (KAN) variants to identify the optimal nonlinear approximation function for parameter-efficient medical image classification: SBTAYLOR-KAN (B-splines with Taylor series), SBRBF-KAN (B-splines with radial basis functions), and SBWAVELET-KAN (B-splines with Morlet wavelets). Comprehensive experiments on brain MRI, chest X-ray, and tuberculosis datasets, using the images exactly as originally published with no additional preprocessing, establish SBTAYLOR-KAN as the superior architecture: it achieves up to 98.93% accuracy with only 2,872 parameters (a reduction of more than 99% relative to ResNet50's ~24.18M) and retains 86% accuracy when trained on just 30% of the data. Statistical validation with kappa coefficients, the Matthews correlation coefficient, and cross-dataset generalization confirms its robustness and data efficiency.

Building directly on these findings, Phase 2 integrates the Taylor-KAN formulation into a Vision Transformer architecture, yielding TaylorKAN-ViT, an ultra-compact model designed KAN-first. Whereas conventional ViTs retrofit parameter-reduction techniques onto MLP-heavy architectures, TaylorKAN-ViT replaces all nonlinear feed-forward mappings with the Taylor-series-approximated KAN modules identified in Phase 1. This enables dual-scale feature learning: KAN transformations capture fine-grained local patterns within image patches, while self-attention models long-range global dependencies across the entire image. Despite comprising only 88.9K parameters and 4.9G FLOPs, an approximately 99.3% parameter reduction relative to recent medical ViTs (MedKAFormer-T: 12.47M; MedViTV2-T: 12.3M), TaylorKAN-ViT achieves competitive performance across four diverse benchmarks: 94.36% accuracy on PneumoniaMNIST, 95.90% on CPNX-ray, 61.00% on PAD-UFES-20, and 70.50% on Kvasir. The model also generalizes stably under limited-data and class-imbalanced conditions.

These results demonstrate that clinical-grade medical image classification is achievable without large-scale models, establishing TaylorKAN-ViT as a practical, deployable solution for edge devices, mobile platforms, and resource-limited healthcare environments.
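
The abstract describes TaylorKAN-ViT as replacing the transformer's MLP feed-forward sub-layers with KAN modules whose edge functions are approximated by Taylor series. The thesis's implementation is not reproduced on this page, so the following PyTorch sketch is purely illustrative: it assumes each KAN edge function is a truncated Taylor polynomial with learnable coefficients, and it stacks two such layers as a stand-in for a ViT block's feed-forward network. The class name TaylorKANLayer, the order parameter, the dimensions, and the initialization are all hypothetical, not the thesis's actual API.

```python
import torch
import torch.nn as nn


class TaylorKANLayer(nn.Module):
    """Hypothetical KAN layer: each input-output edge carries its own
    truncated Taylor polynomial phi(x) = sum_k c_k * x^k with learnable c_k."""

    def __init__(self, in_features: int, out_features: int, order: int = 3):
        super().__init__()
        self.order = order
        # One coefficient per (output, input, power) triple; the small init
        # keeps higher-order terms from dominating early in training.
        self.coeffs = nn.Parameter(
            torch.randn(out_features, in_features, order + 1) * 0.01
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., in_features). Build the powers x^0 .. x^order.
        powers = torch.stack([x ** k for k in range(self.order + 1)], dim=-1)
        # Evaluate every edge polynomial and sum over inputs, mirroring the
        # Kolmogorov-Arnold structure of summed univariate functions.
        # Indices: ... = batch/token dims, i = input, k = power, o = output.
        return torch.einsum("...ik,oik->...o", powers, self.coeffs)


if __name__ == "__main__":
    patch_tokens = torch.randn(4, 64)  # e.g., four patch embeddings of width 64
    # A KAN-based stand-in for the usual two-layer MLP feed-forward block.
    ffn = nn.Sequential(TaylorKANLayer(64, 32), TaylorKANLayer(32, 64))
    print(ffn(patch_tokens).shape)     # torch.Size([4, 64])
```

A layer of this form has in_features × out_features × (order + 1) parameters, which hints at how a low-order expansion can keep an entire model in the tens of thousands of parameters; the thesis's actual widths and expansion order may differ.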

Convocation Year

2026

Convocation Season

Spring
