Commit bd7f9e0

Addressed feedback: [concatenating models.ipynb file and updated READ_ME with EDA]
1 parent 241759d commit bd7f9e0

12 files changed: +6469 −18419 lines

Rick and Morty Character Image Detection/Models/InceptionV3.ipynb
−6,155 lines: this file was deleted.

Rick and Morty Character Image Detection/Models/MobileNetV2.ipynb
−6,131 lines: this file was deleted.
README (updated):
# Rick and Morty Image Detection

## 🎯 Goal

The main purpose of this project is to **detect characters of the American sitcom *Rick and Morty*** in the dataset linked below, using several image detection/recognition models, and to compare their accuracy.

## 🧵 Dataset

**Link:** https://www.kaggle.com/datasets/mriffaud/rick-and-morty

## 🧾 Description

This project is a comparative analysis of three Keras image classification models, VGG16, InceptionV3, and MobileNetV2, applied to the Rick and Morty character dataset. The dataset consists of annotated images, and the objectives are to train and evaluate these models and to compare their accuracy scores and performance metrics. Exploratory data analysis (EDA) is used to understand the dataset's characteristics, explore class distributions, detect imbalances, and identify areas for improvement. The methodology covers data preparation, model training, evaluation, comparative analysis of accuracy and performance metrics, and visualization of EDA insights.

## 🧮 What I had done!
### 1. Data Loading and Preparation
Loaded the dataset, containing image paths and their corresponding labels, into a pandas DataFrame for easy manipulation and analysis.
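A minimal sketch of this step. The class-per-folder directory layout and the `path`/`label` column names are assumptions, since the notebook's loading code isn't shown in the README.

```python
# Build a DataFrame of (path, label) pairs from a class-per-folder
# layout such as dataset/<label>/<image>.jpg. The layout and the
# column names are assumptions, not the notebook's exact code.
from pathlib import Path

import pandas as pd


def load_image_index(root: str) -> pd.DataFrame:
    """Index every image under root/<label>/ into a DataFrame."""
    rows = [
        {"path": str(p), "label": p.parent.name}
        for p in sorted(Path(root).glob("*/*"))
        if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
    ]
    return pd.DataFrame(rows)
```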
### 2. Exploratory Data Analysis (EDA)
Bar chart for label distribution: created a bar chart to visualize the frequency of each label in the dataset.

Pie chart for label distribution: generated a pie chart to show the proportion of each label in the dataset.
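The two plots can be sketched as follows; the `label` column name is an assumption.

```python
# Render the label-distribution bar and pie charts described above,
# assuming a DataFrame with a "label" column (name is an assumption).
import matplotlib

matplotlib.use("Agg")  # off-screen rendering, no display needed
import matplotlib.pyplot as plt
import pandas as pd


def plot_label_distribution(df: pd.DataFrame, out_prefix: str = "labels") -> pd.Series:
    """Save <prefix>_bar.png and <prefix>_pie.png; return the counts."""
    counts = df["label"].value_counts()
    fig, ax = plt.subplots()
    counts.plot.bar(ax=ax, title="Label distribution")
    fig.savefig(f"{out_prefix}_bar.png")
    fig, ax = plt.subplots()
    counts.plot.pie(ax=ax, autopct="%1.1f%%", title="Label proportions")
    fig.savefig(f"{out_prefix}_pie.png")
    plt.close("all")
    return counts
```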
### 3. Data Analysis
Counted the number of unique image paths to verify data uniqueness and quality.
Analyzed the distribution of image paths by label for the 20 most frequent paths.
Displayed the number of unique values in each categorical column to understand data variety.
Visualized missing values with a heatmap to identify and address potential data-quality issues.
Summarized and printed the counts of each label.
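The uniqueness, label-count, and missing-value checks above can be gathered in one helper; the `path`/`label` column names are assumptions.

```python
# Compute the uniqueness, label-count, and missing-value summaries
# listed above. Column names ("path", "label") are assumptions.
import pandas as pd


def summarize_dataset(df: pd.DataFrame) -> dict:
    """Return the basic quality checks as a plain dict."""
    return {
        "unique_paths": int(df["path"].nunique()),
        "label_counts": df["label"].value_counts().to_dict(),
        "missing_per_column": df.isna().sum().to_dict(),
        "unique_per_column": df.nunique().to_dict(),
    }
```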
### 4. Cross-Tabulation
Created a cross-tabulation table to explore the relationship between image paths and labels.
Plotted a heatmap to visualize that relationship.
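A sketch of this step with `pd.crosstab` and a seaborn heatmap; the column names are assumptions, and the plotting imports are lazy so the table can be computed without them.

```python
# Cross-tabulate image paths against labels; optionally save a
# seaborn heatmap of the table. Column names are assumptions.
from typing import Optional

import pandas as pd


def path_label_crosstab(df: pd.DataFrame, heatmap_path: Optional[str] = None) -> pd.DataFrame:
    ct = pd.crosstab(df["path"], df["label"])
    if heatmap_path is not None:
        # Plotting dependencies imported lazily, so the tabulation
        # itself works without matplotlib/seaborn installed.
        import matplotlib
        matplotlib.use("Agg")
        import matplotlib.pyplot as plt
        import seaborn as sns
        sns.heatmap(ct, cmap="viridis")
        plt.savefig(heatmap_path)
        plt.close()
    return ct
```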
### 5. Image Preprocessing and Model Training
Loaded and preprocessed the test images, normalizing pixel values for consistency.
Iterated through the models (VGG16, InceptionV3, MobileNetV2) saved in a directory and made predictions on the test dataset.
Saved the predictions to CSV files for further analysis and comparison.
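The predict-and-save loop can be sketched like this. It works with anything exposing a `.predict` method (e.g. a model returned by `keras.models.load_model`); the 0-255 to [0, 1] normalization and the CSV layout are assumptions.

```python
# Normalize pixels, run a loaded model over the test images, and
# write the predicted labels to CSV. The normalization scheme and
# CSV column name are assumptions, not the notebook's exact code.
import numpy as np
import pandas as pd


def normalize(images: np.ndarray) -> np.ndarray:
    """Scale 0-255 pixel values into [0, 1] as float32."""
    return images.astype("float32") / 255.0


def predict_and_save(model, images: np.ndarray, class_names, out_csv: str):
    probs = model.predict(normalize(images))
    labels = [class_names[i] for i in np.argmax(probs, axis=1)]
    pd.DataFrame({"predicted_label": labels}).to_csv(out_csv, index=False)
    return labels
```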
### 6. Model Prediction Visualization
Loaded each model and visualized its predictions on a sample of test images to qualitatively assess performance.
Adjusted image preprocessing for models that require specific input sizes (e.g., 299x299 for InceptionV3).
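Per-model input sizing can be handled with a small lookup: InceptionV3 expects 299x299 inputs by default, while VGG16 and MobileNetV2 default to 224x224 (standard Keras defaults). The Pillow-based resize below is a sketch, not the notebook's code.

```python
# Resize and scale one image to the input size its model expects.
# The 299x299 / 224x224 defaults follow the standard Keras models;
# the Pillow resize here is an illustrative assumption.
import numpy as np
from PIL import Image

INPUT_SIZES = {
    "InceptionV3": (299, 299),
    "VGG16": (224, 224),
    "MobileNetV2": (224, 224),
}


def prepare_image(path: str, model_name: str) -> np.ndarray:
    """Load, resize, and scale one image for the given model."""
    size = INPUT_SIZES[model_name]
    img = Image.open(path).convert("RGB").resize(size)
    return np.asarray(img, dtype="float32") / 255.0
```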
## 🚀 Models Implemented

The dataset was trained on the following models; a summary of each is given below.
### InceptionV3
When implementing InceptionV3, we leverage its powerful architecture for image classification. By loading the pre-trained model with weights from the ImageNet dataset, we benefit from the representations it has already learned, and we freeze those layers to preserve them while training a new classification head.

**Reason for choosing:**
Lightweight (92 MB), good accuracy, relatively few parameters (23.9M), and low inference time (CPU: 42.2 ms, GPU: 6.9 ms per step).

Visualization of predicted labels on the test set:

![InceptionV3 predictions](../Images/InceptionV3_prediction.png)
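The frozen-base transfer-learning setup described above can be sketched as follows. The classification head (global pooling plus one Dense layer) and the compile settings are assumptions, not the notebook's exact code.

```python
# Transfer-learning sketch: load InceptionV3 with ImageNet weights,
# freeze the pre-trained base, and attach a small classification
# head. Head architecture and compile settings are assumptions.
from tensorflow import keras


def build_inception_classifier(num_classes: int, weights: str = "imagenet") -> keras.Model:
    base = keras.applications.InceptionV3(
        include_top=False, weights=weights, input_shape=(299, 299, 3)
    )
    base.trainable = False  # freeze the pre-trained layers
    x = keras.layers.GlobalAveragePooling2D()(base.output)
    outputs = keras.layers.Dense(num_classes, activation="softmax")(x)
    model = keras.Model(base.input, outputs)
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

The same pattern applies to the VGG16 and MobileNetV2 runs, swapping the base model and input size.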

### VGG16
I utilize the VGG16 (Visual Geometry Group) architecture, which has a deeper, more complex structure. These models are renowned for their performance on a range of image recognition tasks.

**Reason for choosing:**
High accuracy (90.1% top-5 on ImageNet), modest depth (16 layers), and low inference time on GPU (CPU: 69.5 ms, GPU: 4.2 ms per step).

Visualization of predicted labels on the test set:

![VGG16 predictions](../Images/VGG16_prediction.png)

### MobileNetV2
Utilizing transfer learning with the MobileNetV2 model allows us to leverage pre-trained weights, drastically reducing the training time needed for image classification tasks. This strategy is especially beneficial with limited training data, as we can capitalize on the representations the base model learned from a vast dataset such as ImageNet.

**Reason for choosing:**
Very lightweight (14 MB), good accuracy, very few parameters (3.5M), and low inference time on GPU (CPU: 25.9 ms, GPU: 3.8 ms per step).

Visualization of predicted labels on the test set:

![MobileNetV2 predictions](../Images/MobileNetV2_prediction.png)

## 📚 Libraries Needed

1. **NumPy:** fundamental package for numerical computing.
2. **pandas:** data analysis and manipulation library.
3. **scikit-learn:** machine learning library for classification, regression, and clustering.
4. **Matplotlib:** plotting library for creating visualizations.
5. **Keras:** high-level neural networks API, typically used with a TensorFlow backend.
6. **tqdm:** progress-bar utility for tracking iterations.
7. **seaborn:** statistical data visualization library built on Matplotlib.
## 📊 Exploratory Data Analysis Results

### Bar Chart
A bar chart showing the distribution of labels in the training dataset. It represents the frequency of each label category, giving an overview of how labels are distributed across the dataset.

![Label distribution bar chart](../Images/bar.png)

### Pie Chart
A pie chart illustrating the distribution of labels in the training dataset. The percentage on each segment indicates the relative frequency of that label category.

![Label distribution pie chart](../Images/pie.png)

### Image Paths Distribution
Visualizes the distribution of the top 20 image paths by label and displays the unique values in the categorical columns.

![Image path distribution](../Images/path_distribution.png)

### Cross-Tabulation
A cross-tabulation table showing the relationship between image paths and labels in the dataset. The heatmap highlights how labels are distributed across image paths, giving insight into labeling patterns.

![Cross-tabulation heatmap](../Images/cross_tabulation.png)
## 📈 Performance of the Models Based on Accuracy Scores

| Model       | Accuracy Score                    |
|-------------|-----------------------------------|
| InceptionV3 | 96% (validation accuracy: 0.9591) |
| MobileNetV2 | 94% (validation accuracy: 0.9411) |
| VGG16       | 96% (validation accuracy: 0.9602) |
## 📢 Conclusion

**According to the accuracy scores, InceptionV3 and VGG16 performed best on this dataset.**

This holds even though the data analysis showed that the class distribution of the dataset is not consistent across all classes.
## ✒️ Your Signature

Full name: Aaradhya Singh
GitHub ID: https://github.com/kyra-09
Email ID: [email protected]
Participant Role: Contributor, GSSoC (GirlScript Summer of Code) 2024
