Parking Vision System

See Project with Full Code on GitHub

🎯 The Objective

This project implements a deep learning vision pipeline that classifies each parking space in a camera feed as Occupied, Free, or Unavailable. It is designed to operate under varying lighting and weather conditions, with the goals of improving space utilization and supplying the occupancy information needed for autonomous parking maneuvers.

🏗️ The Architecture & Methodology

The system was built through a progressive deep learning workflow, moving from custom architectures trained from scratch to transfer learning with ImageNet-pretrained backbones:

1. Data Preprocessing: Engineered a rigorous pipeline that cleans corrupted images, removes duplicates, corrects EXIF orientation issues, and applies standard ImageNet normalization.
2. Baseline CNNs & Augmentation: Built and trained lightweight Convolutional Neural Networks (CNNs) from scratch. Implemented advanced data augmentation strategies (RandAugment, ColorJitter, and Outdoor-specific transforms via Albumentations) to simulate changing environmental conditions and improve generalization.
3. Transfer Learning (FE vs. FT): Leveraged architectures pre-trained on ImageNet, whose learned features already capture common shapes and metallic textures. Tested both Feature Extraction (freezing the backbone to limit overfitting) and Fine-Tuning (unfreezing the backbone with a very small learning rate of 1e-5) across modern network architectures.

📊 Key Performance Metrics

The transfer learning models gave the strongest results, with EfficientNet-B0 combining the highest test accuracy with a compact parameter budget.

Top Model: EfficientNet-B0 (Feature Extraction)
Test Accuracy (EfficientNet-B0 FE): 88.33%
Test Accuracy (MobileNetV3 FE): 85.42%
Test Accuracy (ResNet18 FT): 80.75%
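
For reference, a test-accuracy figure of this kind is typically computed as the percentage of held-out images whose top-scoring class matches the label. The sketch below assumes a standard PyTorch loader yielding `(images, labels)` batches; the function name `evaluate_accuracy` is hypothetical and the project's own evaluation code may differ.

```python
import torch
from torch import nn


@torch.no_grad()
def evaluate_accuracy(model: nn.Module, loader, device: str = "cpu") -> float:
    """Return top-1 accuracy in percent over all batches in `loader`."""
    model.eval()       # disable dropout, use BatchNorm running statistics
    model.to(device)
    correct, total = 0, 0
    for images, labels in loader:
        logits = model(images.to(device))
        preds = logits.argmax(dim=1)  # index of the highest-scoring class
        correct += (preds == labels.to(device)).sum().item()
        total += labels.numel()
    if total == 0:
        raise ValueError("loader yielded no samples")
    return 100.0 * correct / total
```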

💡 Core Insights & Business Impact

Architectural Efficiency: EfficientNet-B0 outperformed ResNet18 (88.33% vs. 80.75% test accuracy) while using fewer than half as many parameters (about 5.3M vs. 11.7M for the stock torchvision ImageNet models), making it well suited to deployment on edge-computing devices.
Feature Extraction Superiority: Freezing the pre-trained backbone (Feature Extraction) yielded higher test accuracy and less overfitting than full network fine-tuning. This suggests that general ImageNet features transfer well to vehicle and parking-space classification without further adaptation.
Robustness to Distortion: Applying randomized geometric and color distortion policies (RandAugment) approximated the variation seen in real camera feeds, which matters for a system intended for live outdoor parking lot feeds.

⚙️ Technical Stack

Languages & Frameworks: Python, PyTorch, Torchvision.
Libraries: pandas, Matplotlib, Albumentations (Outdoor transforms), PIL (ImageOps).
Architectures: Custom CNNs, ResNet18, EfficientNet-B0, MobileNetV3-Large.
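
As an illustration of where PIL's ImageOps fits into the preprocessing step, the sketch below drops unreadable files, skips exact duplicates, and applies the EXIF orientation tag. The function names are hypothetical. Detecting duplicates by hashing file bytes is an assumption, since the write-up does not say how duplicates were identified.

```python
import hashlib
from pathlib import Path

from PIL import Image, ImageOps


def load_clean_image(path):
    """Return an upright RGB image, or None if the file cannot be decoded."""
    try:
        with Image.open(path) as img:
            img.verify()  # integrity check; the handle is unusable afterwards
        with Image.open(path) as img:
            img = ImageOps.exif_transpose(img)  # rotate/flip per EXIF orientation
            return img.convert("RGB")
    except (OSError, SyntaxError, ValueError):
        # Covers unidentified formats and truncated or broken files.
        return None


def clean_dataset(image_dir):
    """Return paths of readable, non-duplicate files in `image_dir`."""
    seen_hashes = set()
    kept = []
    for path in sorted(Path(image_dir).iterdir()):
        if not path.is_file():
            continue
        # Assumption: a duplicate is a byte-for-byte identical file.
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen_hashes:
            continue
        if load_clean_image(path) is None:
            continue
        seen_hashes.add(digest)
        kept.append(path)
    return kept
```

Byte-level hashing will not catch re-encoded or resized copies of the same photo; a perceptual hash would be one way to extend this.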