Document Type

Thesis

Degree Name

Master of Applied Computing

Department

Physics and Computer Science

Program Name/Specialization

Applied Computing

Faculty/School

Faculty of Science

First Advisor

Emad Mohammed

Advisor Role

Supervisor

Second Advisor

Saiqa Aleem

Advisor Role

Co-Supervisor

Abstract

Deploying deep learning models for medical image analysis on mobile devices requires balancing inference latency, memory footprint, and accurate delineation of anatomical boundaries. While Convolutional Neural Networks (CNNs) and mobile Vision Transformers (ViTs) offer efficiency, they often struggle to model the irregular, non-local geometric structures inherent in biological tissues without incurring prohibitive computational costs. In this thesis, we introduce GeoViG (Geometric Vision Graph), an architecture that bridges the gap between efficient grid-based processing and explicit Geometric Deep Learning. GeoViG transitions from high-resolution pixel grids to low-resolution dynamic graphs via SpreadEdgePool, a geometry-aware downsampling operator. This operator aggregates features based on diffusion distance rather than fixed spatial strides, preserving fine-grained structural diversity while reducing dimensionality. Experimental results show that GeoViG achieves a Top-1 accuracy of up to 82.38% on ImageNet-1K and competitive mean average precision (mAP) on MS COCO, using 30% fewer parameters with up to a 2.35× speedup on a mobile GPU (iPhone 13). Crucially, for medical segmentation tasks on Kvasir-SEG and DSB 2018, GeoViG outperforms significantly larger models in boundary adherence: it achieves a Dice score of 0.945 (vs. 0.875 for ResNet50) while reducing the Hausdorff distance by over 5× (from 70.37 to 12.94). GeoViG eliminates outlier artifacts and captures fine-grained, irregular anatomical structures, making it suitable for portable medical diagnostics. This work also explores the foundation for PureViG architectures, aiming for fully graph-based visual reasoning.
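
The two segmentation metrics quoted in the abstract respond differently to errors, which may help in reading the reported numbers. The following is a minimal sketch of the standard definitions of the Dice score and the symmetric Hausdorff distance on binary masks. It is not the thesis's evaluation code: the function names, the smoothing constant eps, and the toy 8×8 masks are assumptions made only for illustration, and the Hausdorff distance here is computed over all foreground pixels rather than over extracted boundaries.

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice overlap between two binary masks (1.0 means identical)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps (an assumed smoothing term) avoids division by zero on empty masks.
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def hausdorff_distance(pred, target):
    """Symmetric Hausdorff distance, in pixels, between foreground sets."""
    p = np.argwhere(np.asarray(pred, dtype=bool))
    t = np.argwhere(np.asarray(target, dtype=bool))
    if len(p) == 0 or len(t) == 0:
        return float("inf")  # undefined for an empty mask; reported as inf here
    # Pairwise Euclidean distances between every foreground pixel pair.
    d = np.linalg.norm(p[:, None, :] - t[None, :, :], axis=-1)
    # Largest nearest-neighbour distance, taken in both directions.
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))

# Toy example (hypothetical data): a 4x4 square plus one stray predicted pixel.
target = np.zeros((8, 8), dtype=bool)
target[2:6, 2:6] = True
pred = target.copy()
pred[0, 7] = True  # a single outlier artifact

print(dice_score(pred, target))
print(hausdorff_distance(pred, target))
```

In this toy case the single stray pixel barely lowers the Dice score, yet it alone determines the Hausdorff distance. This is why a model can post a respectable Dice score and still show a large Hausdorff distance when it produces outlier artifacts.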

Convocation Year

2026

Convocation Season

Spring
