1. Introduction

This project implements a real-time industrial safety gear detection and compliance monitoring system using a custom-trained YOLO model on the Ultralytics Platform. The system detects workers in industrial environments, identifies their personal protective equipment (PPE), and classifies each individual's compliance status — providing visual warnings for missing safety gear.

Workplace safety enforcement is critical in high-risk industrial environments such as factories, construction sites, and manufacturing facilities. Manual inspection is labor-intensive and error-prone; this system automates PPE compliance monitoring using computer vision, enabling real-time safety audits through existing camera infrastructure.

The implementation demonstrates practical applications of object detection for occupational health and safety, processing video streams to detect and track multiple workers simultaneously while assessing the presence or absence of four essential safety equipment items.

Core Features:

  • Real-time worker detection and centroid-based multi-person tracking
  • 4-class PPE detection: helmet, safety vest, gloves, face mask
  • Per-person equipment assignment via IoU and bounding box center analysis
  • Compliance status classification: Compliant (green), Partial (orange), Non-compliant (red)
  • Visual warning banner for workers wearing no equipment
  • Annotated video recording with per-person equipment labels
  • Custom-trained model: 97.1% mAP50, 94.6% Precision, 94.6% Recall

2. Methodology / Approach

The system employs a custom-trained YOLO model to simultaneously detect persons and safety equipment items within each video frame. A dedicated assignment algorithm then associates detected equipment with the nearest worker using spatial overlap metrics, and a centroid tracker maintains consistent person identities across frames.

2.1 System Architecture

The industrial safety gear detection pipeline consists of:

  1. YOLO Inference: Detect all instances of person, helmet, safety-vest, gloves, and face-mask in each frame
  2. Equipment Assignment: Map detected PPE items to their corresponding worker using IoU and center-in-box tests
  3. Person Tracking: Centroid-based tracker assigns persistent IDs across frames with configurable lost-track tolerance
  4. Compliance Assessment: Compare each worker's detected equipment set against the full required set
  5. Visualization: Color-coded bounding boxes and per-item labels; warning banner for zero-equipment workers
  6. Video Output: Annotated frames written to output video file
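As a data-flow sketch, stages 2 and 4 above reduce to a small per-frame function. The following is a hypothetical skeleton (names are illustrative, not the script's actual identifiers) that takes pre-computed detections as `(class_name, box)` tuples instead of calling YOLO, and applies only the center-in-box test and the compliance check:

```python
EQUIPMENT_CLASSES = {"helmet", "safety-vest", "gloves", "face-mask"}

def process_detections(detections):
    """Stages 2 and 4 on stubbed detections: assign PPE to persons by the
    center-in-box test, then classify each person's compliance status."""
    persons = [box for name, box in detections if name == "person"]
    equipment = [(name, box) for name, box in detections
                 if name in EQUIPMENT_CLASSES]
    report = []
    for px1, py1, px2, py2 in persons:
        worn = set()
        for name, (x1, y1, x2, y2) in equipment:
            cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
            if px1 <= cx <= px2 and py1 <= cy <= py2:
                worn.add(name)  # item center falls inside this person's box
        if worn == EQUIPMENT_CLASSES:
            status = "Compliant"
        elif worn:
            status = "Partial"
        else:
            status = "Non-compliant"
        report.append(((px1, py1, px2, py2), status, worn))
    return report
```

In the real pipeline the detections come from YOLO inference on each frame, and the per-person results feed the tracker and the OpenCV rendering stages.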

2.2 Processing Pipeline

[Video Input]
    ↓
[YOLO Detection] → [Persons + Equipment Detections]
    ↓
[Equipment-to-Person Assignment] (IoU + Center-in-Box)
    ↓
[Centroid Tracker] → [Persistent Track IDs]
    ↓
[Compliance Status Classification]
    ↓
[Bounding Box Rendering + Label Overlay]
    ↓
[Video Output]

2.3 Implementation Strategy

The implementation uses the Ultralytics YOLO framework for inference and OpenCV for video processing and annotation. Equipment assignment uses a two-stage scoring system: each equipment item's IoU with every person box is computed, and a +1.0 bonus is added when the item's center lies inside a person's bounding box, so tight spatial association is prioritized. The centroid tracker performs greedy nearest-neighbour matching on a cost matrix weighted by both Euclidean distance and IoU overlap, tolerating up to 30 frames of missed detection before retiring a track.

3. Mathematical Framework

3.1 IoU & Equipment Assignment

Spatial overlap between equipment and person bounding boxes:

$$\text{IoU}(A, B) = \frac{\text{Area}(A \cap B)}{\text{Area}(A \cup B)}$$

Equipment is assigned to the person with the highest IoU score. If the equipment center lies inside the person's bounding box, a +1.0 bonus is added to prioritize close spatial association.
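A sketch of this assignment rule (function and variable names are illustrative, not the script's actual identifiers; `min_score` mirrors the 0.15 assignment threshold from Section 8.2):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def assign_equipment(equip_box, person_boxes, min_score=0.15):
    """Return the index of the best-matching person, or None.

    Score = IoU, plus a +1.0 bonus when the equipment center lies
    inside the person's box (the two-stage scoring described above).
    """
    cx = (equip_box[0] + equip_box[2]) / 2.0
    cy = (equip_box[1] + equip_box[3]) / 2.0
    best, best_score = None, min_score
    for i, p in enumerate(person_boxes):
        score = iou(equip_box, p)
        if p[0] <= cx <= p[2] and p[1] <= cy <= p[3]:
            score += 1.0  # center-in-box bonus dominates raw IoU
        if score > best_score:
            best, best_score = i, score
    return best
```

Because the bonus (+1.0) exceeds any possible IoU value, a person box that contains the equipment center always wins over one that merely overlaps it.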

3.2 Centroid Tracking

Bounding box centers are used to match detections across frames:

$$c_x = \frac{x_1 + x_2}{2}, \quad c_y = \frac{y_1 + y_2}{2}$$

$$\delta = \sqrt{(c_x^{d} - c_x^{t})^2 + (c_y^{d} - c_y^{t})^2}$$

Matches with $\delta > 80$ pixels are rejected.
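A minimal, self-contained sketch of the centroid tracker under these definitions. It is simplified: the cost here is plain Euclidean distance, whereas the actual implementation also weights by IoU overlap; the thresholds follow the values stated in this document:

```python
DIST_THRESH = 80.0  # max centroid distance (px) for a match
MAX_LOST = 30       # frames a track may go undetected before retirement

def centroid(box):
    """Center (c_x, c_y) of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

class CentroidTracker:
    """Greedy nearest-neighbour matching on box centroids."""

    def __init__(self):
        self.next_id = 0
        self.tracks = {}  # id -> (last_box, lost_frame_count)

    def update(self, boxes):
        unmatched = dict(self.tracks)  # tracks not yet claimed this frame
        assigned = {}
        for box in boxes:
            cx, cy = centroid(box)
            best_id, best_d = None, DIST_THRESH
            for tid, (tbox, _) in unmatched.items():
                tx, ty = centroid(tbox)
                d = ((cx - tx) ** 2 + (cy - ty) ** 2) ** 0.5
                if d < best_d:  # rejects matches with delta > 80 px
                    best_id, best_d = tid, d
            if best_id is None:
                best_id = self.next_id  # no track close enough: new ID
                self.next_id += 1
            else:
                del unmatched[best_id]
            self.tracks[best_id] = (box, 0)
            assigned[best_id] = box
        for tid, (tbox, lost) in unmatched.items():
            if lost + 1 > MAX_LOST:
                del self.tracks[tid]  # retire after 30 missed frames
            else:
                self.tracks[tid] = (tbox, lost + 1)
        return assigned
```

A worker who moves less than 80 px between frames keeps the same ID; a detection farther than that from every live track starts a new one.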

3.3 Performance Metrics

$$\text{Precision} = \frac{TP}{TP + FP}, \quad \text{Recall} = \frac{TP}{TP + FN}, \quad \text{mAP}_{50} = \frac{1}{N} \sum_{c=1}^{N} AP_c^{50}$$
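For concreteness, these metrics computed directly from counts (the per-class AP values in the mAP example are illustrative, not the model's actual scores):

```python
def precision(tp, fp):
    """Fraction of predicted boxes that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Fraction of ground-truth boxes that are found."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def map50(per_class_ap):
    """mAP50: mean of per-class average precision at IoU 0.50."""
    return sum(per_class_ap) / len(per_class_ap)
```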

4. Dataset

Dataset Name: Industrial Safety Gear Detection
Platform: Ultralytics Platform (Public)
License: CC BY-NC-ND 4.0
Total Images: 80
Total Annotations: 1,812
Image Format: JPG
Mean Image Size: 1,750 × 1,050 px (Mean AR: 1.78)
Mean File Size: 461.4 KB

Split Distribution:

| Split      | Images | Percentage |
|------------|--------|------------|
| Train      | 64     | 80.0%      |
| Validation | 16     | 20.0%      |

Class Distribution:

| Index | Class       | Annotations | Images |
|-------|-------------|-------------|--------|
| 3     | gloves      | 547 (30.2%) | 72     |
| 0     | person      | 419 (23.1%) | 80     |
| 1     | helmet      | 363 (20.0%) | 80     |
| 2     | safety-vest | 325 (17.9%) | 72     |
| 4     | face-mask   | 158 (8.7%)  | 41     |
| Total |             | 1,812       | 345    |

Bounding Box Statistics:

  • Mean bounding box width: 114.0 px
  • Mean bounding box height: 169.0 px
  • Annotation locations: distributed across frame center and upper regions

5. Model

Model Name: industrial-safety-gear-detection.pt
Platform: Ultralytics Platform
License: AGPL-3.0
Architecture: YOLO (Ultralytics)
Training Epochs: ~300
Classes: 5 (person, helmet, safety-vest, gloves, face-mask)

Model Metrics:

| Metric    | Value |
|-----------|-------|
| mAP50     | 97.1% |
| mAP50-95  | 83.7% |
| Precision | 94.6% |
| Recall    | 94.6% |

Training Convergence:

  • precision(B): stabilizes near 1.0 after ~50 epochs
  • recall(B): stabilizes near 1.0 after ~50 epochs
  • mAP50-95(B): steady convergence to ~0.837 over 300 epochs
  • box_loss, cls_loss, dfl_loss: all converge smoothly with no signs of overfitting

6. Requirements

opencv-python>=4.8.0
numpy>=1.24.0
ultralytics>=8.0.0

7. Installation & Configuration

7.1 Environment Setup

# Clone the repository
git clone https://github.com/kemalkilicaslan/Industrial-Safety-Gear-Detection-System.git
cd Industrial-Safety-Gear-Detection-System

# Install required packages
pip install -r requirements.txt

7.2 Project Structure

Industrial-Safety-Gear-Detection-System/
├── Industrial-Safety-Gear-Detection-System.py
├── README.md
├── requirements.txt
└── LICENSE

7.3 Required Files

  • Custom YOLO Model: industrial-safety-gear-detection.pt (place in project directory)
  • Input Video: Industrial environment video file (MP4, AVI, MOV)

8. Usage / How to Run

8.1 Basic Execution

python Industrial-Safety-Gear-Detection-System.py

8.2 Configuration Parameters

# Model and class configuration
MODEL_PATH = "industrial-safety-gear-detection.pt"
EQUIPMENT_CLASSES = {"helmet", "safety-vest", "gloves", "face-mask"}
PERSON_CLASS = "person"

# Detection thresholds
CONF_THRESH = 0.15    # Confidence threshold for YOLO inference
IOU_THRESH  = 0.15    # Minimum score for equipment-to-person assignment
MAX_LOST    = 30      # Frames before retiring a lost track
DIST_THRESH = 80.0    # Max centroid distance (px) for track matching

8.3 Input / Output

# Update these lines in the script for your video
video_capture = cv2.VideoCapture("Industrial-Safety-Gear.mp4")
output_file   = "Industrial-Safety-Gear-Detection.mp4"

8.4 Controls

  • Press q to quit the application during playback

8.5 Compliance Color Coding

| Status        | Color (BGR)              | Condition                                        |
|---------------|--------------------------|--------------------------------------------------|
| Compliant     | 🟢 Green (50, 220, 50)   | All 4 equipment items detected                   |
| Partial       | 🟠 Orange (30, 165, 255) | At least 1 item detected, at least 1 missing     |
| Non-compliant | 🔴 Red (40, 40, 220)     | No equipment detected; "EQUIPMENT: NONE" banner  |
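The table's logic can be expressed as a small pure function (a sketch; `classify` is an illustrative name, and the color tuples are in BGR channel order as OpenCV expects):

```python
REQUIRED = {"helmet", "safety-vest", "gloves", "face-mask"}

# Colors from the compliance table, in BGR channel order (OpenCV convention)
COLORS = {
    "Compliant": (50, 220, 50),      # green
    "Partial": (30, 165, 255),       # orange
    "Non-compliant": (40, 40, 220),  # red
}

def classify(detected):
    """Map a worker's detected-equipment set to (status, BGR color)."""
    worn = set(detected) & REQUIRED
    if worn == REQUIRED:
        status = "Compliant"
    elif worn:
        status = "Partial"
    else:
        status = "Non-compliant"
    return status, COLORS[status]
```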

9. Application / Results

9.1 Input Video

[Video: Industrial Safety Gear (raw input footage)]

9.2 Output Video

[Video: Industrial Safety Gear Detection (annotated output)]

9.3 Dataset Overview

[Image: Industrial Safety Gear Detection Dataset (dataset overview)]
[Image: Industrial Safety Gear Detection Classes (class distribution)]
[Image: Industrial Safety Gear Detection Charts 1]
[Image: Industrial Safety Gear Detection Charts 2]

9.4 Model Metrics

[Image: Industrial Safety Gear Detection Model Metrics (training metrics and loss curves)]

mAP50: 97.1% | mAP50-95: 83.7% | Precision: 94.6% | Recall: 94.6%

10. Tech Stack

10.1 Core Technologies

  • Programming Language: Python 3.8+
  • Computer Vision: OpenCV 4.8+
  • Deep Learning Framework: Ultralytics YOLO 8.0+
  • Numerical Computing: NumPy 1.24+
  • Training Platform: Ultralytics Platform

10.2 Libraries & Dependencies

| Library       | Version | Purpose                                     |
|---------------|---------|---------------------------------------------|
| opencv-python | 4.8+    | Video I/O, bounding box rendering, blending |
| ultralytics   | 8.0+    | YOLO model inference                        |
| numpy         | 1.24+   | Array operations, cost matrix computation   |

10.3 Algorithm Components

| Component            | Method                       | Purpose                          |
|----------------------|------------------------------|----------------------------------|
| Object Detection     | Custom YOLO                  | Detect persons and 4 PPE classes |
| Equipment Assignment | IoU + Center-in-Box scoring  | Map PPE to correct worker        |
| Person Tracking      | Centroid tracker (greedy NN) | Maintain persistent worker IDs   |
| Compliance Check     | Set difference               | Determine missing equipment      |
| Visualization        | OpenCV overlay               | Color-coded boxes and labels     |

10.4 Detection Parameters

| Parameter            | Value   | Description                    |
|----------------------|---------|--------------------------------|
| Confidence Threshold | 0.15    | YOLO inference confidence      |
| IoU Threshold        | 0.15    | Min assignment score           |
| Max Lost Frames      | 30      | Track retirement tolerance     |
| Distance Threshold   | 80.0 px | Max centroid matching distance |

11. License

This project is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0).

12. References

  1. Ultralytics Platform Documentation: model training, inference, and deployment.
  2. Ultralytics Platform, Industrial Safety Gear Detection Dataset: dataset annotation, training, and export.
  3. OpenCV Documentation: Video I/O and drawing functions.

Acknowledgments

Special thanks to the Ultralytics team for the YOLO framework and Ultralytics Platform, which was used for dataset annotation, model training, and export. Thanks to the OpenCV community for providing excellent real-time video processing tools.


Note: This system is designed for research, educational, and authorized industrial safety monitoring purposes. When deploying in production environments, ensure compliance with local privacy regulations regarding workplace video surveillance. Detection accuracy may vary depending on camera angle, occlusion, and lighting conditions. Regular model retraining with site-specific data is recommended for optimal performance.