Face Detection and Person Recognition System

1. Introduction

This project implements a comprehensive computer vision system for face detection and person recognition using OpenCV and YOLO (You Only Look Once) deep learning models. The system offers a wide range of capabilities, from basic face detection in static images to real-time person identification using webcam streams.

There's a growing need for automated face and person recognition across various domains, including security monitoring, attendance automation, photo editing, and access control systems. By combining traditional computer vision methods (like Haar Cascade classifiers) with modern deep learning techniques (such as YOLO), this system delivers both lightweight face detection and advanced person identification capabilities.

The implementation is modular and can process both static images and live video streams. Users can select the module that fits their needs, from simple face detection to model-based person recognition.

Core Features:

2. Methodology / Approach

The project employs two distinct approaches for computer vision tasks:

Face Detection: Utilizes OpenCV's Haar Cascade classifier (haarcascade_frontalface_default.xml), a machine learning-based approach where a cascade function is trained with positive and negative images. This traditional method is computationally efficient and suitable for basic face detection tasks.

Person Recognition: Implements YOLO (You Only Look Once) deep learning models, specifically custom-trained versions for identifying specific individuals. The YOLO architecture processes the entire image in a single forward pass, making it suitable for real-time applications while maintaining high accuracy.

2.1 System Architecture

The system is organized into five independent modules, each designed for specific use cases:

  1. Static Image Processing: Face detection in individual photos
  2. Video File Processing: Face detection in pre-recorded videos
  3. Real-time Face Detection: Live face detection using webcam
  4. Trained Model Recognition: Person identification in photos and videos using custom YOLO models
  5. Real-time Person Recognition: Live person identification using webcam

2.2 Implementation Strategy

Each module is implemented as a standalone Python script, so modules can be deployed individually as needed. The face detection modules use OpenCV's pre-trained Haar Cascade classifier for fast detection, while the person recognition modules use the Ultralytics YOLO framework with custom-trained models to identify specific individuals. All real-time modules can be exited cleanly by pressing 'q', after which they release their resources.

3. Mathematical Framework

3.1 Haar Cascade Detection Algorithm

The Haar Cascade classifier slides a window across the image and evaluates a cascade of stages on each window. Each stage is a weighted combination of weak classifiers:

$$F(x) = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} \alpha_i h_i(x) \geq \theta \\ 0 & \text{otherwise} \end{cases}$$

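where $x$ is the image window being evaluated, $h_i(x) \in \{0, 1\}$ is the output of the $i$-th weak classifier, $\alpha_i$ is its weight, $n$ is the number of weak classifiers, and $\theta$ is the decision threshold.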
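As a minimal sketch of the decision rule above, the snippet below evaluates one stage as a weighted vote and chains stages into a cascade that rejects a window as soon as any stage outputs 0. The weak classifiers, weights, and threshold are hypothetical values chosen for illustration, not parameters taken from the OpenCV cascade file.

```python
def strong_classifier(x, weak_classifiers, alphas, theta):
    """Return 1 if the weighted vote of the weak classifiers reaches theta, else 0."""
    score = sum(alpha * h(x) for alpha, h in zip(alphas, weak_classifiers))
    return 1 if score >= theta else 0


def cascade(x, stages):
    """Evaluate stages in order and reject as soon as one stage outputs 0."""
    for weak_classifiers, alphas, theta in stages:
        if strong_classifier(x, weak_classifiers, alphas, theta) == 0:
            return 0
    return 1


# Hypothetical weak classifiers: each one thresholds a single feature value.
def h1(x):
    return 1 if x[0] > 0.5 else 0


def h2(x):
    return 1 if x[1] > 0.2 else 0


def h3(x):
    return 1 if x[2] < 0.9 else 0


# One stage: (weak classifiers, weights alpha_i, threshold theta).
stage = ([h1, h2, h3], [0.6, 0.3, 0.4], 0.7)

window_a = (0.8, 0.1, 0.5)   # h1=1, h2=0, h3=1, weighted vote 1.0
window_b = (0.3, 0.4, 0.95)  # h1=0, h2=1, h3=0, weighted vote 0.3

print(strong_classifier(window_a, *stage))  # accepted by this stage
print(strong_classifier(window_b, *stage))  # rejected by this stage
```

Early rejection is what keeps the cascade cheap: most windows contain no face and are discarded after the first few stages.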

3.2 Haar-like Features

A Haar-like feature is a rectangular feature, calculated as the difference between the pixel sums of adjacent regions:

$$f_{\text{haar}} = \sum_{\text{white region}} I(x,y) - \sum_{\text{black region}} I(x,y)$$

where $I(x,y)$ represents pixel intensity at position $(x,y)$.
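To make the formula concrete, here is a small sketch that computes a two-rectangle (edge) feature by summing pixels directly. The layout, with the white region on the left half and the black region on the right half, and the sample image are assumptions made for illustration.

```python
def region_sum(image, top, left, height, width):
    """Sum pixel intensities inside a rectangle by visiting every pixel."""
    return sum(
        image[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    )


def two_rect_edge_feature(image, top, left, height, width):
    """White (left half) minus black (right half); width must be even."""
    half = width // 2
    white = region_sum(image, top, left, height, half)
    black = region_sum(image, top, left + half, height, half)
    return white - black


# Hypothetical 4x4 grayscale patch with a bright left half and a dark right half.
image = [
    [200, 200, 50, 50],
    [200, 200, 50, 50],
    [200, 200, 50, 50],
    [200, 200, 50, 50],
]

feature = two_rect_edge_feature(image, top=0, left=0, height=4, width=4)
print(feature)  # 8 * 200 - 8 * 50 = 1200
```

A large magnitude indicates a strong vertical edge inside the window, while a uniform patch gives 0.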

3.3 Integral Image for Fast Computation

The integral image allows fast feature calculation:

$$II(x,y) = \sum_{x' \leq x, y' \leq y} I(x',y')$$

Any rectangular sum can be computed in constant time:

$$\text{Sum} = II(D) + II(A) - II(B) - II(C)$$

where $A$, $B$, $C$, and $D$ are the top-left, top-right, bottom-left, and bottom-right corners of the rectangle, respectively.
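The following sketch builds an integral image in pure Python and uses the four-corner formula to sum a rectangle with four lookups. It assumes a table padded with a zero top row and left column, so the corner lookups need no boundary checks; the function names are illustrative.

```python
def integral_image(image):
    """Return an (H+1) x (W+1) table; row 0 and column 0 are zero padding."""
    height, width = len(image), len(image[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        row_sum = 0
        for c in range(width):
            row_sum += image[r][c]
            # Sum of everything above this row plus the running sum of this row.
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the rectangle using II(D) + II(A) - II(B) - II(C)."""
    bottom, right = top + height, left + width
    a = ii[top][left]      # top-left corner
    b = ii[top][right]     # top-right corner
    c = ii[bottom][left]   # bottom-left corner
    d = ii[bottom][right]  # bottom-right corner
    return d + a - b - c


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(image)

print(rect_sum(ii, top=1, left=1, height=2, width=2))  # 5 + 6 + 8 + 9 = 28
print(rect_sum(ii, top=0, left=0, height=3, width=3))  # whole image = 45
```

Building the table costs one pass over the image. After that, every rectangle sum costs four lookups regardless of the rectangle's size.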

3.4 YOLO Detection Framework

YOLO divides the image into an $S \times S$ grid. Each grid cell predicts bounding boxes, each with a confidence score:

$$P(\text{object}) \times IOU_{\text{pred}}^{\text{truth}} = \text{Confidence Score}$$

Bounding Box Prediction:

$$\text{bbox} = (x, y, w, h, \text{confidence}, c_1, c_2, ..., c_n)$$

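where $(x, y)$ is the box center, $w$ and $h$ are the box width and height, confidence is the score defined above, and $c_1, \dots, c_n$ are the class probabilities for the $n$ classes.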
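As an illustration of the prediction vector above, the sketch below unpacks one vector, converts the center-based box to corner coordinates, and multiplies the confidence by the best class probability to get a class-specific score. It assumes $(x, y)$ is the box center in pixels, and the class names are hypothetical. The layout follows the formula in this section; recent Ultralytics models may structure their raw output differently.

```python
def decode_prediction(pred, class_names):
    """Turn (x, y, w, h, confidence, c_1..c_n) into a labeled corner-format box."""
    x, y, w, h, confidence, *class_probs = pred
    # Index of the most probable class.
    best = max(range(len(class_probs)), key=class_probs.__getitem__)
    # Class-specific score: box confidence times class probability.
    score = confidence * class_probs[best]
    # Convert center/size to (x1, y1, x2, y2) corners.
    box = (x - w / 2, y - h / 2, x + w / 2, y + h / 2)
    return {"box": box, "label": class_names[best], "score": score}


# Hypothetical prediction for a two-class model.
pred = (100.0, 80.0, 40.0, 60.0, 0.9, 0.1, 0.8)
result = decode_prediction(pred, ["person_a", "person_b"])

print(result["label"])  # person_b
print(result["box"])    # (80.0, 50.0, 120.0, 110.0)
```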

3.5 Non-Maximum Suppression (NMS)

YOLO uses NMS to eliminate redundant detections:

$$\text{IoU}(box_i, box_j) = \frac{\text{Area}(box_i \cap box_j)}{\text{Area}(box_i \cup box_j)}$$

A box is suppressed when its IoU with a higher-confidence box exceeds the threshold, so only the highest-confidence box in each group of overlapping detections is kept.
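A minimal sketch of IoU and greedy NMS follows, assuming boxes in `(x1, y1, x2, y2)` corner format and an IoU threshold of 0.5. Both the threshold and the sample boxes are illustrative; in practice the framework performs this step internally.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep the box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]

kept = nms(boxes, scores)
print(kept)  # [0, 2]: the second box overlaps the first and is suppressed
```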

3.6 Loss Function for YOLO Training

The YOLO loss function combines localization, confidence, and classification losses:

$$\mathcal{L} = \lambda_{\text{coord}} \mathcal{L}_{\text{box}} + \lambda_{\text{obj}} \mathcal{L}_{\text{obj}} + \lambda_{\text{noobj}} \mathcal{L}_{\text{noobj}} + \lambda_{\text{class}} \mathcal{L}_{\text{class}}$$

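where $\mathcal{L}_{\text{box}}$ is the bounding-box localization loss, $\mathcal{L}_{\text{obj}}$ and $\mathcal{L}_{\text{noobj}}$ are the confidence losses for grid cells that do and do not contain an object, $\mathcal{L}_{\text{class}}$ is the classification loss, and each $\lambda$ is the weight of the corresponding term.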

3.7 Performance Metrics

Precision: Proportion of correct positive predictions

$$\text{Precision} = \frac{TP}{TP + FP}$$

Recall: Proportion of actual positives correctly identified

$$\text{Recall} = \frac{TP}{TP + FN}$$

F1 Score: Harmonic mean of precision and recall

$$F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

Mean Average Precision (mAP):

$$\text{mAP} = \frac{1}{N} \sum_{i=1}^{N} AP_i$$

where $AP_i$ is the average precision for class $i$.
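The metrics above translate directly into code. The sketch below uses hypothetical counts and per-class AP values purely to show the arithmetic; they are not measurements from this project.

```python
def precision(tp, fp):
    """TP / (TP + FP), or 0.0 when there are no positive predictions."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """TP / (TP + FN), or 0.0 when there are no actual positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def mean_average_precision(ap_per_class):
    """Average of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


# Hypothetical counts: 90 true positives, 10 false positives, 30 false negatives.
tp, fp, fn = 90, 10, 30
p = precision(tp, fp)  # 90 / 100 = 0.9
r = recall(tp, fn)     # 90 / 120 = 0.75
f1 = f1_score(p, r)    # 2 * 0.675 / 1.65, about 0.818

# Hypothetical AP values for three classes.
m_ap = mean_average_precision([0.9, 0.8, 0.7])

print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f} mAP={m_ap:.3f}")
```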

4. Requirements

requirements.txt

numpy>=1.19.0
opencv-python>=4.5.0
ultralytics>=8.0.0

5. Installation & Configuration

5.1 Environment Setup

# Clone the repository
git clone https://github.com/kemalkilicaslan/Face-Detection-and-Person-Recognition-System.git
cd Face-Detection-and-Person-Recognition-System

# Install required packages
pip install -r requirements.txt

5.2 Project Structure

Face-Detection-and-Person-Recognition-System
├── Detect-Faces-in-Photo.py
├── Detect-Faces-in-Video.py
├── Real-Time-Face-Detection.py
├── Real-Time-Person-Recognition.py
├── Person-Recognition-in-Photo-and-Video.py
├── haarcascade_frontalface_default.xml
├── README.md
├── requirements.txt
└── LICENSE

5.3 Required Files

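For Face Detection: haarcascade_frontalface_default.xml (included in the repository, as shown in the project structure above).

For Person Recognition: a custom-trained YOLO model file. No such file is listed in the project structure above, so it must be supplied separately.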

6. Usage / How to Run

6.1 Face Detection in Photo

python Detect-Faces-in-Photo.py

Requirements:

6.2 Face Detection in Video

python Detect-Faces-in-Video.py

Requirements:

6.3 Real-Time Face Detection

python Real-Time-Face-Detection.py

Controls: Press 'q' to quit.

6.4 Person Recognition in Photo/Video

python Person-Recognition-in-Photo-and-Video.py

Configuration:

6.5 Real-Time Person Recognition

python Real-Time-Person-Recognition.py

Requirements:

Controls: Press 'q' to quit.

7. Application / Results

7.1 Face Detection in Photo

Input Image: [Figure: Faces in Photo]

Output Image: [Figure: Detect Faces in Photo]

7.2 Person Recognition in Photo

Input Image: [Figure: Persons in Photo]

Output Image: [Figure: Persons Recognition in Photo]

7.3 Face Detection in Video

Input Video:

Output Video:

7.4 Person Recognition in Video

Input Video:

Output Video:


7.5 Real-Time Face Detection

Demo Video:

7.6 Real-Time Person Recognition

Demo Video:

7.7 Performance Metrics

Performance varies based on hardware and input resolution:

| Metric | Face Detection | Person Recognition (YOLO) |
|---|---|---|
| Processing Speed | 30+ FPS | 15-30 FPS (CPU), 60+ FPS (GPU) |
| Detection Accuracy | 85-95% | 90-98% (with proper training) |
| False Positive Rate | Low (5-10%) | Very Low (2-5%) |

8. How It Works (Pipeline Overview)

8.1 Face Detection Pipeline (Haar Cascade)

[Camera/Image Input]
          ↓
[Convert to Grayscale]
          ↓
[Apply Histogram Equalization] (optional)
          ↓
[Haar Cascade Detection]
    ├─ Integral Image Computation
    ├─ Sliding Window Search
    ├─ Multi-scale Detection
    └─ Cascade Classifier Evaluation
          ↓
[Filter False Positives]
    ├─ Minimum Size Filter
    ├─ Neighbor Grouping
    └─ Confidence Threshold
          ↓
[Draw Bounding Boxes]
          ↓
[Display/Save Output]

8.2 Person Recognition Pipeline (YOLO)

[Camera/Image Input]
          ↓
[Image Preprocessing]
    ├─ Resize to Model Input Size
    ├─ Normalize Pixel Values
    └─ Channel Conversion (RGB)
          ↓
[YOLO Model Inference]
    ├─ Backbone Feature Extraction
    ├─ Neck Feature Fusion
    └─ Detection Head Prediction
          ↓
[Post-Processing]
    ├─ Confidence Filtering
    ├─ Non-Maximum Suppression
    └─ Class-specific Thresholding
          ↓
[Annotate with Labels & Confidence]
          ↓
[Display/Save Output]
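The post-processing and annotation steps of the pipeline above can be illustrated with a short, framework-independent sketch. It assumes detections are already available as dictionaries with `label`, `score`, and `box` keys, which is a hypothetical structure and not the Ultralytics result format. The thresholds and person names are likewise placeholders.

```python
def filter_detections(detections, default_threshold=0.5, class_thresholds=None):
    """Keep detections whose score meets the threshold for their class."""
    class_thresholds = class_thresholds or {}
    kept = []
    for det in detections:
        # Fall back to the default when no class-specific threshold is set.
        threshold = class_thresholds.get(det["label"], default_threshold)
        if det["score"] >= threshold:
            kept.append(det)
    return kept


def format_label(det):
    """Build the text drawn next to a box, e.g. 'person_a 0.91'."""
    return f'{det["label"]} {det["score"]:.2f}'


# Hypothetical detections after model inference and NMS.
detections = [
    {"label": "person_a", "score": 0.91, "box": (40, 30, 120, 200)},
    {"label": "person_b", "score": 0.55, "box": (200, 35, 280, 210)},
    {"label": "person_a", "score": 0.30, "box": (300, 50, 340, 120)},
]

# person_b uses a stricter threshold than the default.
kept = filter_detections(detections, default_threshold=0.5,
                         class_thresholds={"person_b": 0.6})
labels = [format_label(det) for det in kept]
print(labels)  # ['person_a 0.91']
```

A stricter threshold for a specific class can help when one person is often confused with others, at the cost of missing some true detections of that person.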

8.3 Algorithm Complexity Analysis

Haar Cascade:

9. Tech Stack

9.1 Core Technologies

9.2 Libraries & Dependencies

| Library | Version | Purpose |
|---|---|---|
| opencv-python | 4.5+ | Image processing, video capture, and face detection |
| ultralytics | 8.0+ | YOLO model implementation and inference |
| numpy | 1.19+ | Array operations (dependency) |

9.3 Pre-trained Models

Haar Cascade Classifier:

Custom YOLO Models:

10. License

This project is open source and available under the Apache License 2.0.

11. References

  1. OpenCV Cascade Classifier Tutorial Documentation.
  2. OpenCV Haar Cascade Classifiers GitHub Repository.
  3. Ultralytics YOLOv8 Documentation.

Acknowledgments

Special thanks to the OpenCV and Ultralytics communities for providing excellent computer vision tools and documentation. Sample images and demonstrations use content from "How I Met Your Mother" for educational purposes only. The Haar Cascade classifier is based on the object detection framework introduced by Viola and Jones (2001), which made real-time face detection practical. The YOLO architecture continues to evolve, and YOLOv8 is one of the versions available through Ultralytics.


Note: Ensure you have proper permissions and comply with privacy regulations when using facial recognition technology in production environments. This system is intended for educational and research purposes in controlled settings. Always respect individual privacy rights and obtain appropriate consent before deploying face recognition systems. Consider ethical implications and potential biases in facial recognition technology.