This project demonstrates the implementation of a custom vehicle recognition and segmentation system using YOLOv8 trained on a specially curated vehicle dataset. The system has been designed to identify and segment three distinct vehicle types (cars, pickups, and trucks) using instance segmentation techniques that provide pixel-level precision in vehicle identification.
The project addresses the growing demand for automated vehicle classification in applications such as traffic monitoring, parking management, toll collection systems, and intelligent transportation systems. By training a custom YOLOv8 segmentation model on a specialized dataset, the system reaches 99.5% box and mask mAP50 after 50 epochs while distinguishing between similar vehicle categories that typically pose challenges for general-purpose models.
The implementation showcases a complete machine learning workflow, from data collection and preprocessing through model training, evaluation, and deployment. Using Roboflow for dataset management and annotation, and Ultralytics YOLOv8 for model training, the project offers a practical example of custom object segmentation for domain-specific applications.
Core Features:
The project follows a structured deep learning workflow for custom vehicle segmentation built on YOLOv8's instance segmentation capabilities. The methodology combines manual dataset preparation, data augmentation to offset the small dataset, and fine-tuning of a pre-trained model.
Dataset Preparation: The custom dataset consists of 60 manually selected and annotated vehicle images distributed equally across three classes (cars, pickups, trucks). Images were annotated using Roboflow's segmentation tools, creating precise polygon masks for each vehicle instance.
Data Augmentation: To enhance model generalization and prevent overfitting on the small dataset, multiple augmentation techniques were applied, including horizontal flips, random crops (0-20% zoom), rotation (±10°), grayscale conversion, and blur effects.
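As a minimal sketch of what a horizontal flip does to a segmentation label, the snippet below mirrors a polygon stored as normalized (x, y) pairs. It assumes YOLO-style normalized coordinates. The name `hflip_polygon` and the sample polygon are illustrative only; in this project the augmentations themselves are applied by Roboflow.

```python
def hflip_polygon(points):
    """Mirror a normalized polygon around the vertical image axis (x -> 1 - x)."""
    return [(1.0 - x, y) for x, y in points]


# Hypothetical triangle in normalized image coordinates
polygon = [(0.25, 0.5), (0.75, 0.5), (0.5, 0.25)]
flipped = hflip_polygon(polygon)
```

The y coordinates are unchanged, so the mask stays aligned with the flipped image.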
Model Training: The YOLOv8x-seg model (extra-large variant) was initialized with pre-trained weights and fine-tuned for 50 epochs on the custom dataset. The training utilized automatic mixed precision (AMP) for faster computation and included mosaic augmentation for the first 40 epochs.
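The training settings described above could be written out explicitly as in the sketch below. It assumes that the "first 40 epochs" of mosaic correspond to the Ultralytics `close_mosaic` argument set to 10. The `TRAIN_ARGS` dictionary and the `train` helper are hypothetical names, not part of the original notebook.

```python
# Hypothetical settings mirroring the description above
TRAIN_ARGS = {
    "epochs": 50,
    "imgsz": 640,
    "amp": True,         # automatic mixed precision
    "close_mosaic": 10,  # assumed: mosaic is switched off for the final 10 epochs
}

# Number of epochs that still use mosaic augmentation
mosaic_epochs = TRAIN_ARGS["epochs"] - TRAIN_ARGS["close_mosaic"]


def train(data_yaml):
    """Fine-tune YOLOv8x-seg on the dataset described by data_yaml."""
    from ultralytics import YOLO  # imported lazily; requires the ultralytics package

    model = YOLO("yolov8x-seg.pt")
    return model.train(data=data_yaml, **TRAIN_ARGS)
```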
The system comprises four main components:
The 60-image dataset was divided into training (70%), validation (20%), and test (10%) subsets.
Additionally, 15 completely new images (5 per class) were used for external validation to assess real-world performance.
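For a rough sense of subset sizes, the sketch below applies the 70/20/10 ratio shown in the data pipeline to the 60 source images. It assumes the ratio is applied before augmentation, so the exported dataset may contain more training images. `split_counts` is a hypothetical helper.

```python
def split_counts(total, ratios):
    """Split `total` items by integer percentages; any remainder goes to the first split."""
    counts = [total * pct // 100 for pct in ratios]
    counts[0] += total - sum(counts)
    return counts


train_n, val_n, test_n = split_counts(60, (70, 20, 10))
```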
Precision: Measures the accuracy of positive predictions
$$\text{Precision} = \frac{TP}{TP + FP}$$
where $TP$ denotes true positives and $FP$ denotes false positives.
Recall: Measures the ability to find all positive instances
$$\text{Recall} = \frac{TP}{TP + FN}$$
where $FN$ denotes false negatives.
F1 Score: Harmonic mean of precision and recall
$$F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
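The three formulas above translate directly into code. The sketch below uses made-up counts purely for illustration; they are not taken from this project's confusion matrix.

```python
def precision(tp, fp):
    """TP / (TP + FP), or 0.0 when there are no positive predictions."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """TP / (TP + FN), or 0.0 when there are no positive instances."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts: 8 true positives, 2 false positives, 0 false negatives
tp, fp, fn = 8, 2, 0
p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(p, r)
```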
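mAP (mean Average Precision): Average precision averaged across all classes at specified IoU thresholds. mAP50 uses a single IoU threshold of 0.5, while mAP50-95 averages over thresholds from 0.50 to 0.95 in steps of 0.05.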
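Since mAP depends on IoU thresholds, a small sketch of the IoU computation may help. It assumes axis-aligned boxes given as `(x1, y1, x2, y2)`. For masks the same ratio is taken over pixel areas instead of box areas. The function name and sample boxes are illustrative.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


# Two 10x10 boxes overlapping by half: intersection 50, union 150
iou = box_iou((0, 0, 10, 10), (5, 0, 15, 10))
counts_as_match_at_50 = iou >= 0.5
```

A prediction with this overlap would not count as a match at the mAP50 threshold.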
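The YOLOv8 segmentation loss function combines four components: box regression loss, segmentation (mask) loss, classification loss, and distribution focal loss (DFL).
The total loss function can be expressed as: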
$$\mathcal{L}_{\text{total}} = \lambda_{\text{box}} \mathcal{L}_{\text{box}} + \lambda_{\text{seg}} \mathcal{L}_{\text{seg}} + \lambda_{\text{cls}} \mathcal{L}_{\text{cls}} + \lambda_{\text{dfl}} \mathcal{L}_{\text{dfl}}$$
where $\lambda_i$ represents the weight coefficient for each loss component.
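The weighted sum can be sketched in a few lines. The loss values and weight coefficients below are placeholders chosen for easy arithmetic; they are not the values used by Ultralytics or measured in this project.

```python
def total_loss(losses, weights):
    """Weighted sum over the loss components named in `weights`."""
    return sum(weights[name] * losses[name] for name in weights)


losses = {"box": 1.0, "seg": 2.0, "cls": 0.5, "dfl": 0.25}  # made-up component losses
weights = {"box": 2.0, "seg": 2.0, "cls": 0.5, "dfl": 1.0}  # placeholder lambda values
loss = total_loss(losses, weights)  # 2.0 + 4.0 + 0.25 + 0.25
```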
requirements.txt
ultralytics>=8.0.0
roboflow>=1.0.0
pandas>=1.3.0
# Clone the repository
git clone https://github.com/kemalkilicaslan/Vehicle-Recognition-with-Segmentation-Training-on-a-Custom-Dataset.git
cd Vehicle-Recognition-with-Segmentation-Training-on-a-Custom-Dataset
# Install required packages
pip install -r requirements.txt
Note: This project was developed using Google Colab with GPU acceleration (Tesla T4). For local execution, ensure you have:
Vehicle-Recognition-with-Segmentation-Training-on-a-Custom-Dataset
├── input/
│ ├── car1.jpg - car5.jpg
│ ├── pickup1.jpg - pickup5.jpg
│ └── truck1.jpg - truck5.jpg
├── output/
│ └── (segmented prediction images)
├── Vehicle-Recognition-with-Segmentation-Training-on-a-Custom-Dataset.ipynb
├── confusion_matrix.png
├── results.png
├── README.md
├── requirements.txt
└── LICENSE
Dataset Access:
Project: vehicle-segmentation-yvbo4 (version 5)
Workspace: kemalkilicaslan-bgq6q
Pre-trained Model:
yolov8x-seg.pt (automatically downloaded during training)
from roboflow import Roboflow
rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("kemalkilicaslan-bgq6q").project("vehicle-segmentation-yvbo4")
dataset = project.version(5).download("yolov8")  # download() returns the dataset, not a version
CLI:
yolo task=segment mode=train model=yolov8x-seg.pt \
data="/path/to/vehicle-segmentation-5/data.yaml" \
epochs=50 imgsz=640
Python:
from ultralytics import YOLO
# Initialize model
model = YOLO('yolov8x-seg.pt')
# Train model
model.train(
    data='/path/to/vehicle-segmentation-5/data.yaml',
    epochs=50,
    imgsz=640,
    task='segment'
)
yolo task=segment mode=val \
model=/path/to/runs/segment/train/weights/best.pt \
data="/path/to/vehicle-segmentation-5/data.yaml"
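A Python counterpart to the validation command might look like the sketch below. The helper name `validate_segmentation_model` and the returned dictionary keys are hypothetical, and the paths are the same placeholders used in the CLI example.

```python
def validate_segmentation_model(weights_path, data_yaml):
    """Run validation and return headline box and mask mAP values."""
    from ultralytics import YOLO  # imported lazily; requires the ultralytics package

    model = YOLO(weights_path)
    metrics = model.val(data=data_yaml)
    return {
        "box_map50": metrics.box.map50,
        "box_map50_95": metrics.box.map,
        "mask_map50": metrics.seg.map50,
        "mask_map50_95": metrics.seg.map,
    }


# Example call (placeholder paths):
# validate_segmentation_model(
#     "/path/to/runs/segment/train/weights/best.pt",
#     "/path/to/vehicle-segmentation-5/data.yaml",
# )
```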
CLI:
yolo task=segment mode=predict \
model=/path/to/runs/segment/train/weights/best.pt \
conf=0.85 \
source="/path/to/test/images/*.jpg"
Python:
from ultralytics import YOLO
# Load trained model
model = YOLO('/path/to/best.pt')
# Run prediction
results = model.predict(
    source='/path/to/test/images',
    conf=0.85,
    save=True
)
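To inspect predictions programmatically rather than only saving images, something like the following sketch could be used. `summarize_results` is a hypothetical helper. It relies on the `boxes.cls`, `boxes.conf`, and `names` attributes of the returned result objects, and the extra `min_conf` check is optional because `predict` already filters by `conf`.

```python
def summarize_results(results, min_conf=0.85):
    """Return a list of (class_name, confidence) pairs for each image."""
    summary = []
    for result in results:
        detections = []
        if result.boxes is not None:
            class_ids = result.boxes.cls.tolist()
            confidences = result.boxes.conf.tolist()
            for class_id, conf in zip(class_ids, confidences):
                if conf >= min_conf:
                    detections.append((result.names[int(class_id)], conf))
        summary.append(detections)
    return summary
```

Calling `summarize_results(results)` on the output of `model.predict(...)` gives one list per input image.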
The model was trained for 50 epochs with the following final metrics:
| Metric | Value |
|---|---|
| Box mAP50 | 99.5% |
| Box mAP50-95 | 90.3% |
| Mask mAP50 | 99.5% |
| Mask mAP50-95 | 87.2% |
| Training Time | 0.308 hours (~18.5 minutes) |
| Class | Precision | Recall | F1 Score | Box mAP50-95 | Mask mAP50-95 |
|---|---|---|---|---|---|
| Car | 99.1% | 100% | 0.995 | 89.7% | 89.9% |
| Pickup | 90.5% | 100% | 0.950 | 91.9% | 81.5% |
| Truck | 100% | 79.6% | 0.886 | 89.4% | 90.3% |
| Overall | 96.5% | 93.2% | 0.948 | 90.3% | 87.2% |
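The Overall row appears to be the unweighted mean of the three classes, and each F1 score follows from the precision and recall in the same row. The sketch below re-derives both from the table values, assuming equal class weighting. The helper names are illustrative.

```python
# Per-class values copied from the table above (percentages)
rows = {
    "Car": {"precision": 99.1, "recall": 100.0, "box": 89.7, "mask": 89.9},
    "Pickup": {"precision": 90.5, "recall": 100.0, "box": 91.9, "mask": 81.5},
    "Truck": {"precision": 100.0, "recall": 79.6, "box": 89.4, "mask": 90.3},
}


def macro_average(key):
    """Unweighted mean of one column across classes, rounded to one decimal."""
    return round(sum(row[key] for row in rows.values()) / len(rows), 1)


def f1_from_percent(p, r):
    """F1 score from precision and recall given in percent, rounded to three decimals."""
    p, r = p / 100, r / 100
    return round(2 * p * r / (p + r), 3)


overall = {key: macro_average(key) for key in ("precision", "recall", "box", "mask")}
overall_f1 = f1_from_percent(overall["precision"], overall["recall"])
```

Note that the overall F1 is computed from the overall precision and recall, not as the mean of the per-class F1 scores.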
Confusion Matrix:
Training Metrics:
The model was tested on 15 completely new vehicle images (5 per class) to evaluate its real-world performance. All 15 test images were segmented with confidence scores above 85%, which suggests the model generalizes beyond the images it was trained on.
Model Capabilities Demonstrated:
Strengths:
Areas for Improvement:
[Raw Vehicle Images]
↓
[Roboflow Annotation] → [Polygon Segmentation Masks]
↓
[Dataset Split] → [70% Train | 20% Val | 10% Test]
↓
[Preprocessing] → [Auto-orient, Resize to 640×640]
↓
[Augmentation] → [Flip, Crop, Rotate, Grayscale, Blur]
↓
[YOLO Format Dataset]
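In the YOLO segmentation format, each label line holds a class index followed by normalized polygon coordinates. The sketch below parses one such line and scales it to the 640×640 image size used here. The sample line and the `parse_segmentation_label` helper are hypothetical.

```python
def parse_segmentation_label(line, width=640, height=640):
    """Parse 'class x1 y1 x2 y2 ...' (normalized) into a class id and pixel points."""
    parts = line.split()
    class_id = int(parts[0])
    coords = [float(value) for value in parts[1:]]
    if len(coords) < 6 or len(coords) % 2:
        raise ValueError("expected at least three (x, y) pairs")
    points = [(x * width, y * height) for x, y in zip(coords[0::2], coords[1::2])]
    return class_id, points


# Hypothetical label line: class 0 with a triangular mask
class_id, points = parse_segmentation_label("0 0.25 0.5 0.75 0.5 0.5 0.25")
```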
[Pre-trained YOLOv8x-seg]
↓
[Transfer Learning] → [Fine-tune on Custom Dataset]
↓
[50 Epochs Training]
├── Box Loss Optimization
├── Segmentation Loss Minimization
├── Classification Accuracy
└── DFL Loss Reduction
↓
[Model Validation] → [Performance Metrics]
↓
[Best Model Checkpoint] (.pt file)
[Input Vehicle Image]
↓
[Image Preprocessing] → [Resize, Normalize]
↓
[YOLOv8x-seg Model]
├── Backbone Feature Extraction
├── Neck Feature Fusion
└── Head Detection & Segmentation
↓
[Post-processing]
├── Confidence Filtering (>85%)
├── Non-Maximum Suppression
└── Mask Generation
↓
[Output: Segmented Vehicle + Class Label + Confidence]
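The confidence filtering and non-maximum suppression steps in the post-processing stage can be illustrated with a simplified, pure-Python sketch. It is not the Ultralytics implementation: it assumes a greedy, per-class NMS, an IoU threshold of 0.7, and detections given as hypothetical `(box, confidence, class_name)` tuples.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def filter_and_nms(detections, conf_threshold=0.85, iou_threshold=0.7):
    """Drop low-confidence detections, then greedily suppress same-class overlaps."""
    candidates = sorted(
        (d for d in detections if d[1] > conf_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, conf, name in candidates:
        # Keep a detection only if no higher-scoring box of the same class overlaps it
        if all(name != k[2] or box_iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, conf, name))
    return kept


# Hypothetical detections: two overlapping cars, one truck, one weak car
detections = [
    ((0, 0, 10, 10), 0.95, "car"),
    ((0, 0, 10, 9), 0.90, "car"),
    ((20, 20, 30, 30), 0.99, "truck"),
    ((0, 0, 10, 10), 0.50, "car"),
]
kept = filter_and_nms(detections)
```

Here the weak car is removed by the confidence filter, and the second car is suppressed because it overlaps the higher-scoring one.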
| Library | Version | Purpose |
|---|---|---|
| ultralytics | 8.3.205 | YOLOv8 implementation and training |
| roboflow | 1.0+ | Dataset management and download |
| torch | 2.8.0 | Deep learning computations |
| opencv-python | 4.12.0 | Image processing |
| pandas | Latest | Data analysis and metrics |
| numpy | 2.0.2 | Numerical computations |
YOLOv8x-seg Architecture:
Training Configuration:
Source Images:
Annotation Method:
All images underwent standardized preprocessing:
| Technique | Parameters | Purpose |
|---|---|---|
| Horizontal Flip | 50% probability | Increase orientation diversity |
| Random Crop | 0-20% zoom | Simulate varying distances |
| Rotation | ±10 degrees | Handle tilted vehicles |
| Grayscale | 100% conversion | Reduce color dependency |
| Blur | 2px | Simulate motion/focus variations |
This project is open source and available under the Apache License 2.0.
Special thanks to:
Note: This project is intended for educational and research purposes. When deploying vehicle recognition systems in production environments, ensure compliance with relevant regulations regarding automated surveillance and data privacy.