This project presents an intelligent waste classification system that uses Convolutional Neural Networks (CNNs) to automatically sort waste into six predefined groups. It demonstrates how deep learning can be effectively applied in environmental management and automated recycling, improving classification accuracy while reducing human involvement in waste processing operations.
The model has been trained on the TrashNet dataset, which contains 2,527 labeled images across six waste categories: cardboard, glass, metal, paper, plastic, and general trash. The system processes individual images and delivers classification predictions along with their corresponding confidence levels.
Core Features:
Automatic classification of waste images into six categories (cardboard, glass, metal, paper, plastic, trash)
Custom CNN trained from scratch on the TrashNet dataset
Data augmentation (flips, rotation, zoom, shear) to improve generalization
Single-image prediction with per-class confidence scores
Evaluation with accuracy, precision, recall, and F1-score
The project employs a custom Convolutional Neural Network with the following design:
Network Structure:
Conv2D Block 1: 32 filters (3×3), ReLU + MaxPooling (2×2)
Conv2D Block 2: 64 filters (3×3), ReLU + MaxPooling (2×2)
Conv2D Block 3: 32 filters (3×3), ReLU + MaxPooling (2×2)
Flatten
Dense Layer 1: 64 units, ReLU + Dropout (0.2)
Dense Layer 2: 32 units, ReLU + Dropout (0.2)
Output Layer: 6 units, Softmax
Total Parameters: 1,645,830 trainable parameters
All images are resized to 224×224 pixels with RGB channels (3-channel input). The dataset employs a 90-10 train-validation split. Data augmentation techniques applied to the training data include:
Horizontal and vertical flips
Rotation
Zoom
Shear
Rescaling of pixel values to [0, 1]
Optimizer: Adam
Loss Function: Categorical Cross-Entropy
Evaluation Metrics: Accuracy, Precision, Recall
Callbacks: Early stopping (patience=50) and model checkpoint saving
Training Duration: 50 epochs with early stopping
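The configuration above maps onto Keras roughly as follows. This is a sketch, not the notebook's exact code: the monitored quantity, `restore_best_weights`, and the checkpoint options are assumptions, and `model`, `train_generator`, and `test_generator` are assumed to be defined as in the usage section further below.

```python
# Hedged sketch of the training setup described above.
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.metrics import Precision, Recall

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=[Precision(), Recall(), 'acc'])

callbacks = [
    # Stop when validation loss stops improving for 50 epochs (assumed monitor)
    EarlyStopping(monitor='val_loss', patience=50, restore_best_weights=True),
    # Save the best model seen so far (filename taken from the repo contents)
    ModelCheckpoint('mymodel.keras', monitor='val_loss', save_best_only=True),
]

history = model.fit(train_generator,
                    epochs=50,
                    validation_data=test_generator,
                    callbacks=callbacks)
```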
The convolutional layer applies a set of learnable filters to extract features from the input:
$$\mathbf{Y}_{i,j} = \sigma\left(\sum_{m=0}^{k-1} \sum_{n=0}^{k-1} \mathbf{W}_{m,n} \cdot \mathbf{X}_{i+m, j+n} + b\right)$$
where:
$\mathbf{X}$ is the input feature map
$\mathbf{W}$ is a learnable $k \times k$ filter
$b$ is the bias term
$\sigma$ is the activation function (ReLU in this network)
$\mathbf{Y}_{i,j}$ is the output feature map value at position $(i, j)$
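As a concrete illustration of this formula, here is a minimal NumPy sketch of a single-filter, single-channel convolution with ReLU (illustrative only, not the project's implementation, which uses Keras `Conv2D` layers):

```python
import numpy as np

def conv2d_single_channel(X, W, b):
    """Naive valid convolution of one k x k filter W over a 2-D input X."""
    k = W.shape[0]
    H, Wd = X.shape
    out = np.zeros((H - k + 1, Wd - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Sum of element-wise products over the k x k window, plus bias
            out[i, j] = np.sum(W * X[i:i + k, j:j + k]) + b
    return np.maximum(out, 0)   # sigma = ReLU
```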
Reduces spatial dimensions while retaining the most prominent features:
$$\mathbf{P}_{i,j} = \max_{m,n \in \text{pool}} \mathbf{Y}_{i \cdot s + m, j \cdot s + n}$$
where:
$\mathbf{Y}$ is the input feature map
$s$ is the pooling stride
$(m, n)$ range over the pooling window
$\mathbf{P}_{i,j}$ is the pooled output at position $(i, j)$
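A matching NumPy sketch of max pooling over non-overlapping windows (illustrative; the network itself uses Keras `MaxPooling2D`):

```python
import numpy as np

def max_pool2d(Y, pool=2, stride=2):
    """Max pooling of a 2-D feature map Y with a pool x pool window."""
    H, W = Y.shape
    out_h = (H - pool) // stride + 1
    out_w = (W - pool) // stride + 1
    P = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Keep only the strongest activation in each window
            P[i, j] = np.max(Y[i * stride:i * stride + pool,
                               j * stride:j * stride + pool])
    return P
```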
After flattening, dense layers perform classification:
$$\mathbf{z} = \mathbf{W} \cdot \mathbf{x} + \mathbf{b}$$
$$\mathbf{a} = \sigma(\mathbf{z})$$
where:
$\mathbf{x}$ is the flattened input vector
$\mathbf{W}$ and $\mathbf{b}$ are the layer's weight matrix and bias vector
$\mathbf{z}$ is the pre-activation output
$\sigma$ is the activation function (ReLU for the hidden layers)
$\mathbf{a}$ is the layer's activation output
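In NumPy this is a single matrix-vector product (illustrative sketch):

```python
import numpy as np

def dense(x, W, b):
    """Fully connected layer: a = sigma(W x + b), with ReLU activation."""
    z = W @ x + b
    return np.maximum(z, 0)
```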
Randomly drops neurons during training to prevent overfitting:
$$\mathbf{h}_{\text{dropout}} = \mathbf{h} \odot \mathbf{m}$$
where:
$\mathbf{h}$ is the vector of activations
$\mathbf{m}$ is a binary mask sampled from a Bernoulli distribution with keep probability $1 - p$ (here $p = 0.2$)
$\odot$ denotes element-wise multiplication
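A small sketch of how this mask could be applied. Note that the formula above shows the plain mask $\mathbf{h} \odot \mathbf{m}$; frameworks such as Keras use "inverted" dropout, which additionally rescales by $1/(1-p)$ so the expected activation is unchanged at test time:

```python
import numpy as np

def dropout(h, rate=0.2, training=True):
    """Inverted dropout: zero activations with probability `rate` while training."""
    if not training:
        return h                              # no-op at inference time
    mask = (np.random.rand(*h.shape) >= rate).astype(h.dtype)
    return h * mask / (1.0 - rate)            # rescale to preserve the mean
```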
Converts logits to probability distribution for multi-class classification:
$$p_i = \frac{e^{z_i}}{\sum_{j=1}^{C} e^{z_j}}$$
where:
$z_i$ is the logit for class $i$
$C$ is the number of classes ($C = 6$)
$p_i$ is the predicted probability of class $i$
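A numerically stable NumPy version (subtracting the maximum logit before exponentiating does not change the result but avoids overflow):

```python
import numpy as np

def softmax(z):
    """Convert a vector of logits z into a probability distribution."""
    z = z - np.max(z)        # stability shift
    e = np.exp(z)
    return e / np.sum(e)
```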
Categorical Cross-Entropy measures the difference between predicted and true distributions:
$$\mathcal{L} = -\sum_{i=1}^{N} \sum_{c=1}^{C} y_{ic} \cdot \log(p_{ic})$$
where:
$N$ is the number of samples
$C$ is the number of classes
$y_{ic}$ is 1 if sample $i$ belongs to class $c$ and 0 otherwise (one-hot label)
$p_{ic}$ is the predicted probability that sample $i$ belongs to class $c$
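For reference, a NumPy sketch of the loss for a batch of one-hot labels (the formula above sums over all $N$ samples; Keras reports the mean per sample, which is what this sketch returns):

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean categorical cross-entropy; y_true and y_pred have shape (N, C)."""
    y_pred = np.clip(y_pred, eps, 1.0)                # guard against log(0)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))
```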
Accuracy:
$$\text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}}$$
Precision (for class $c$):
$$\text{Precision}_c = \frac{TP_c}{TP_c + FP_c}$$
Recall (for class $c$):
$$\text{Recall}_c = \frac{TP_c}{TP_c + FN_c}$$
F1-Score (for class $c$):
$$F1_c = 2 \times \frac{\text{Precision}_c \times \text{Recall}_c}{\text{Precision}_c + \text{Recall}_c}$$
where:
$TP_c$, $FP_c$, and $FN_c$ are the true positives, false positives, and false negatives for class $c$
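In practice these per-class metrics can be produced with scikit-learn's `classification_report`, which yields a table like the one in the results section. Here `y_true` and `y_pred` are assumed to be integer class indices for the validation set:

```python
from sklearn.metrics import classification_report

waste_labels = ['cardboard', 'glass', 'metal', 'paper', 'plastic', 'trash']

# Prints precision, recall, F1-score, and support for each class
print(classification_report(y_true, y_pred, target_names=waste_labels))
```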
Horizontal/Vertical Flip:
$$\mathbf{X}_{\text{flip}} = \mathbf{F} \cdot \mathbf{X}$$
where $\mathbf{F}$ is a flipping transformation matrix.
Rotation:
$$\mathbf{X}_{\text{rot}} = \mathbf{R}(\theta) \cdot \mathbf{X}$$
where $\mathbf{R}(\theta)$ is a rotation matrix with angle $\theta$.
Zoom/Shear:
$$\mathbf{X}_{\text{transform}} = \mathbf{T} \cdot \mathbf{X}$$
where $\mathbf{T}$ represents zoom or shear transformation.
Normalization:
$$\mathbf{X}_{\text{norm}} = \frac{\mathbf{X}}{255}, \quad \mathbf{X} \in [0, 255] \Rightarrow \mathbf{X}_{\text{norm}} \in [0, 1]$$
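These transformations can be expressed with a Keras `ImageDataGenerator`. The sketch below is illustrative: the specific rotation, zoom, and shear ranges are assumptions rather than the exact values used in the notebook, while the rescaling, 90-10 split, and 224×224 target size follow the description above.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale=1.0 / 255,        # normalization: X / 255 -> [0, 1]
    horizontal_flip=True,     # random horizontal flips
    vertical_flip=True,       # random vertical flips
    rotation_range=20,        # rotate up to +/- 20 degrees (assumed range)
    zoom_range=0.1,           # zoom in/out up to 10% (assumed range)
    shear_range=0.1,          # shear transformation (assumed range)
    validation_split=0.1,     # 90-10 train/validation split
)

train_generator = datagen.flow_from_directory(
    '/path/to/dataset', target_size=(224, 224),
    class_mode='categorical', subset='training')
test_generator = datagen.flow_from_directory(
    '/path/to/dataset', target_size=(224, 224),
    class_mode='categorical', subset='validation')
```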
Source: TrashNet (Stanford University)
Total Images: 2,527
Classes: 6
Class Distribution:
Cardboard: 403
Glass: 501
Metal: 410
Paper: 594
Plastic: 482
Trash: 137
Image Specifications: 512×384 pixels, RGB channels, photographed on a white board with natural or room lighting
numpy>=1.19.0
pandas>=1.0.0
seaborn>=0.11.0
matplotlib>=3.3.0
plotly>=5.0.0
scikit-learn>=0.24.0
imutils>=0.5.0
tensorflow>=2.6.0
opencv-python>=4.5.0
# Clone the repository
git clone https://github.com/kemalkilicaslan/Garbage-Classification-with-Convolutional-Neural-Network-CNN.git
cd Garbage-Classification-with-Convolutional-Neural-Network-CNN
# Install dependencies
pip install -r requirements.txt
Garbage-Classification-with-CNN
├── Garbage-Classification-with-CNN.ipynb
├── README.md
├── requirements.txt
├── LICENSE
└── mymodel.keras
# Mount Google Drive (Colab) to access the dataset and saved model
from google.colab import drive
drive.mount('/content/drive')
# Load and preprocess data
x, labels = load_datasets('/path/to/dataset')
# Create and train model
from tensorflow.keras.models import Sequential
from tensorflow.keras.metrics import Precision, Recall

model = Sequential()
# [Layer definitions...]
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=[Precision(), Recall(), 'acc'])
history = model.fit(train_generator, epochs=50, validation_data=test_generator)
# Single image prediction
import numpy as np

img, predictions, predicted_class = model_testing('/path/to/image.jpg')
predicted_label = waste_labels[predicted_class]
confidence = np.max(predictions[0])
# Predict and visualize random samples from the dataset directory
predict_random_samples(model, dir_path, num_classes=6)
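For completeness, a minimal single-image prediction sketch using plain Keras calls. This is not the project's `model_testing` implementation; the preprocessing simply mirrors the training pipeline, and the label order assumes the alphabetical class indexing produced by `flow_from_directory`.

```python
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

waste_labels = ['cardboard', 'glass', 'metal', 'paper', 'plastic', 'trash']

model = load_model('mymodel.keras')
img = image.load_img('/path/to/image.jpg', target_size=(224, 224))
x = image.img_to_array(img) / 255.0      # normalize to [0, 1]
x = np.expand_dims(x, axis=0)            # add batch dimension

predictions = model.predict(x)
predicted_class = int(np.argmax(predictions[0]))
confidence = float(np.max(predictions[0]))
print(waste_labels[predicted_class], confidence)
```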
Example images from the TrashNet dataset showing various waste categories used for training:
Test Set Metrics:
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Cardboard | 0.95 | 0.50 | 0.66 | 40 |
| Glass | 0.56 | 0.70 | 0.62 | 50 |
| Metal | 0.50 | 0.66 | 0.57 | 41 |
| Paper | 0.74 | 0.93 | 0.83 | 59 |
| Plastic | 0.50 | 0.29 | 0.37 | 48 |
| Trash | 0.45 | 0.38 | 0.42 | 13 |
The following visualization displays the model's convergence behavior over 50 epochs, showing both training and validation loss, as well as accuracy metrics:
The confusion matrix reveals classification patterns and misclassification tendencies across waste categories:
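One way to produce such a confusion matrix from the validation predictions is sketched below; `y_true`, `y_pred` (integer class indices), and `waste_labels` are assumed to be defined as in the earlier snippets.

```python
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_true, y_pred)
sns.heatmap(cm, annot=True, fmt='d',
            xticklabels=waste_labels, yticklabels=waste_labels)
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.show()
```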
Real-world predictions on random test samples demonstrate model performance across all categories:
Prediction Examples:
┌───────────────────────┐
│   Input Image (jpg)   │
└───────────┬───────────┘
            │
            ▼
┌──────────────────────────────────────────────────────────┐
│              PREPROCESSING AND AUGMENTATION              │
│  • Resize to 224×224 RGB                                 │
│  • Normalize pixel values to [0, 1]                      │
│  • Apply augmentation (flip, rotate, shear, zoom)        │
└─────────────────────────────┬────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────┐
│                     CNN ARCHITECTURE                     │
│  Conv2D Block 1: 32 filters (3×3) + MaxPool (2×2)        │
│  Conv2D Block 2: 64 filters (3×3) + MaxPool (2×2)        │
│  Conv2D Block 3: 32 filters (3×3) + MaxPool (2×2)        │
│  Flatten Layer                                           │
│  Dense Layer 1: 64 units + ReLU + Dropout (0.2)          │
│  Dense Layer 2: 32 units + ReLU + Dropout (0.2)          │
│  Output Layer: 6 units + Softmax                         │
└─────────────────────────────┬────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────┐
│                   CLASSIFICATION OUTPUT                  │
│  • Probability distribution across 6 classes             │
│  • Predicted class: argmax(probabilities)                │
│  • Confidence score: max(probabilities)                  │
└─────────────────────────────┬────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────┐
│             Output: Class Label + Confidence             │
│     (Cardboard, Glass, Metal, Paper, Plastic, Trash)     │
└──────────────────────────────────────────────────────────┘
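For reference, a Keras `Sequential` definition consistent with the diagram above. This is a sketch: activations and `padding='same'` are inferred rather than quoted from the notebook, although with 'same' padding this layout yields exactly the 1,645,830 trainable parameters listed earlier.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    Conv2D(32, (3, 3), padding='same', activation='relu',
           input_shape=(224, 224, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), padding='same', activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(32, (3, 3), padding='same', activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dropout(0.2),
    Dense(32, activation='relu'),
    Dropout(0.2),
    Dense(6, activation='softmax'),
])
model.summary()   # prints the layer shapes and parameter counts
```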
Programming Language: Python 3.6+
Core Libraries: TensorFlow/Keras, OpenCV, NumPy, pandas, scikit-learn, Matplotlib, Seaborn, Plotly, imutils
Development Environment: Jupyter Notebook (Google Colab)
This project is open source and available under the Apache License 2.0.
This project uses the TrashNet dataset created by Stanford University students Mindy Yang and Gary Thung. Special thanks to the TensorFlow and OpenCV communities for providing excellent deep learning and computer vision tools. The dataset was prepared for recyclability classification research and is used here for educational purposes.
Note: This project is intended for educational and research purposes. The model's performance (62.2% accuracy) demonstrates the practical challenges of real-world waste classification and suggests opportunities for improvement through enhanced training data, data augmentation, and transfer learning approaches such as fine-tuning pre-trained models (ResNet, MobileNet). When deploying waste classification systems in production environments, consider using larger datasets, advanced architectures, and regular model retraining to maintain accuracy.
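As a hedged sketch of the transfer-learning direction mentioned above, a pre-trained backbone such as MobileNetV2 could be used as a frozen feature extractor with a small classification head; the choices below (pooling, dropout rate, optimizer) are illustrative assumptions, not part of this project.

```python
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense, Dropout
from tensorflow.keras.models import Model

# Pre-trained ImageNet backbone, without its original classification head
base = MobileNetV2(input_shape=(224, 224, 3), include_top=False,
                   weights='imagenet')
base.trainable = False                      # freeze pre-trained weights

x = GlobalAveragePooling2D()(base.output)
x = Dropout(0.2)(x)
outputs = Dense(6, activation='softmax')(x)  # six waste categories

model = Model(base.input, outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
```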