
A complete and in-depth machine learning resource containing detailed notes, mathematical explanations, Python code, Jupyter notebooks, and lectures.


Rudra-G-23/100-Days-of-ML



Built with: NumPy · Pandas · Plotly · scikit-learn · Streamlit · Seaborn · SciPy · mlxtend · Matplotlib · dtreeviz


📘 Table of Contents


3. Feature Engineering

3.1 🔧 Feature Transformation

3.1.1 📌 Prerequisite

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| What is Feature Engineering | – | – | 🔥 |
| Column Transformer | How to transform columns | 👨‍💻 | 🔥 |
| Sklearn without Pipeline | Why avoiding pipelines can cause problems | 👨‍💻 | 🔥 |
| Sklearn with Pipeline | How to implement sklearn pipelines effectively | 👨‍💻 | 🔥 |
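
To make the flow above concrete, here is a minimal sketch (toy data with invented column names, not the course notebooks) of a `ColumnTransformer` wrapped in a `Pipeline`:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data: one numerical and one categorical column (names are illustrative)
df = pd.DataFrame({
    "age":    [22, 38, 26, 35, 28, 40],
    "city":   ["delhi", "mumbai", "delhi", "pune", "mumbai", "pune"],
    "bought": [0, 1, 0, 1, 0, 1],
})
X, y = df[["age", "city"]], df["bought"]

# ColumnTransformer routes each column group to its own preprocessor
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])

# The Pipeline guarantees the same fitted transforms run at training and at
# prediction time; applying them inconsistently by hand is exactly the
# failure mode the "without Pipeline" notebook demonstrates.
model = Pipeline([("preprocess", preprocess), ("clf", LogisticRegression())])
model.fit(X, y)
print(model.predict(X.head(2)))
```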

3.1.2 🔧 Encoding Categorical and Numerical Data

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Ordinal Encoding | Ordinal categorical data preprocessing using OrdinalEncoder() | 👨‍💻 | 🔥 |
| One Hot Encoding | Nominal categorical data preprocessing using OneHotEncoder() | 👨‍💻 | 🔥 |
| Function Transformer | Log and reciprocal transformations using FunctionTransformer() | 👨‍💻 | 🔥 |
| Power Transformer | Square and square-root transformations using PowerTransformer() | 👨‍💻 | 🔥 |
| Binarization | Preprocessing with Binarizer() | 👨‍💻 | 🔥 |
| Binning | Preprocessing with KBinsDiscretizer() | 👨‍💻 | 🔥 |
| Handling Mixed Variables | Processing datasets with both numerical & categorical features | 👨‍💻 | 🔥 |
| Handling Date & Time | How to work with time and date columns | 👨‍💻 | 🔥 |
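
A quick sketch of the encoders and transformers listed above, run on a made-up five-row frame (column names and values are invented for illustration):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import (Binarizer, FunctionTransformer, KBinsDiscretizer,
                                   OneHotEncoder, OrdinalEncoder, PowerTransformer)

df = pd.DataFrame({
    "size":   ["S", "M", "L", "M", "S"],                       # ordinal categories
    "color":  ["red", "blue", "green", "red", "blue"],         # nominal categories
    "income": [25000.0, 40000.0, 60000.0, 120000.0, 35000.0],  # skewed numeric
})

# OrdinalEncoder: the category order is meaningful, so state it explicitly
print(OrdinalEncoder(categories=[["S", "M", "L"]]).fit_transform(df[["size"]]).ravel())

# OneHotEncoder: nominal data gets one binary column per category, no order implied
print(OneHotEncoder().fit_transform(df[["color"]]).toarray())

# FunctionTransformer: apply an arbitrary function, e.g. log1p to reduce skew
print(FunctionTransformer(np.log1p).fit_transform(df[["income"]]).values.ravel())

# PowerTransformer: learned (Yeo-Johnson) power transform toward normality
print(PowerTransformer().fit_transform(df[["income"]]).ravel())

# Binarizer / KBinsDiscretizer: threshold or bin a numeric column
print(Binarizer(threshold=50000).fit_transform(df[["income"]]).ravel())
print(KBinsDiscretizer(n_bins=3, encode="ordinal",
                       strategy="uniform").fit_transform(df[["income"]]).ravel())
```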

3.1.3 📏 Feature Scaling

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Standardization | Preprocessing using StandardScaler() | 👨‍💻 | 🔥 |
| Normalization | Preprocessing using MinMaxScaler() | 👨‍💻 | 🔥 |
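
The difference at a glance, as a minimal sketch on a tiny made-up array:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0], [4.0], [10.0], [15.0]])

# Standardization: rescale to zero mean and unit variance
print(StandardScaler().fit_transform(X).ravel())

# Normalization: rescale to the [0, 1] range
print(MinMaxScaler().fit_transform(X).ravel())
```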

3.1.4 🧩 Handling Missing Data

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Complete Case Analysis | Remove rows with NaN values | 👨‍💻 | 🔥 |
| Arbitrary Value Imputation (Numerical) | Impute with an arbitrary value using SimpleImputer() | 👨‍💻 | 🔥 |
| Mean/Median Imputation (Numerical) | Impute with the mean/median using SimpleImputer() | 👨‍💻 | 🔥 |
| Missing Category Imputation (Categorical) | Fill missing values with a label using SimpleImputer() | 👨‍💻 | 🔥 |
| Frequent Value Imputation (Categorical) | Replace missing values with the most frequent value | 👨‍💻 | 🔥 |
| Missing Indicator | Add a binary flag for missing values (MissingIndicator()) | 👨‍💻 | 🔥 |
| Auto Imputer Parameter Tuning | Use GridSearchCV() to optimize imputer settings | 👨‍💻 | 🔥 |
| Random Sample Imputation | Fill missing values with random samples | 👨‍💻 | 🔥 |
| KNN Imputer | Use K-Nearest Neighbors to fill missing values | 👨‍💻 | 🔥 |
| Iterative Imputer | MICE-style multivariate imputation | 👨‍💻 | 🔥 |
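
A compact sketch of the main imputers above on a made-up frame with gaps (values are illustrative). Note that `IterativeImputer` is still experimental in scikit-learn and needs an explicit enable import:

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer, MissingIndicator, SimpleImputer
# IterativeImputer is experimental and must be enabled before import
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

X_num = pd.DataFrame({"age":  [22.0, np.nan, 26.0, 35.0],
                      "fare": [7.25, 71.28, np.nan, 53.10]})
X_cat = pd.DataFrame({"deck": ["C", np.nan, "E", np.nan]})

# Mean/median imputation for numerical columns
print(SimpleImputer(strategy="median").fit_transform(X_num))

# "Missing" as its own category for categorical columns
print(SimpleImputer(strategy="constant", fill_value="Missing").fit_transform(X_cat))

# Binary flags marking where values were missing
print(MissingIndicator().fit_transform(X_num))

# Multivariate imputers use the other columns to predict the gaps
print(KNNImputer(n_neighbors=2).fit_transform(X_num))
print(IterativeImputer(random_state=0).fit_transform(X_num))
```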

3.1.5 🚨 Handling Outliers

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| What are Outliers | Introduction to outliers and their impact | 👨‍💻 | 🔥 |
| Outlier Removal using Z-Score | Removing outliers using the Z-score | 👨‍💻 | 🔥 |
| Outlier Removal using IQR | Removing outliers using the Interquartile Range (IQR) | 👨‍💻 | 🔥 |
| Outlier Removal using Percentiles | Removing outliers using percentiles | 👨‍💻 | 🔥 |
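
All three strategies in one minimal sketch, on synthetic data with two planted outliers:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
s = pd.Series(np.append(rng.normal(50, 10, 200), [150.0, -40.0]))  # two planted outliers

# Z-score method: drop points more than 3 standard deviations from the mean
z = (s - s.mean()) / s.std()
kept_z = s[z.abs() <= 3]

# IQR method: drop points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
q1, q3 = s.quantile(0.25), s.quantile(0.75)
iqr = q3 - q1
kept_iqr = s[(s >= q1 - 1.5 * iqr) & (s <= q3 + 1.5 * iqr)]

# Percentile method: clip (winsorize) to the 1st and 99th percentiles
clipped = s.clip(s.quantile(0.01), s.quantile(0.99))

print(len(s), len(kept_z), len(kept_iqr))  # both removal methods drop the planted points
```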

3.2 🏗️ Feature Construction

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Feature Construction and Splitting | Extract useful data and split features | 👨‍💻 | 🔥 |
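
A small pandas sketch of both ideas, using invented Titanic-style rows purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "name":  ["Braund, Mr. Owen", "Cumings, Mrs. John", "Heikkinen, Miss. Laina"],
    "sibsp": [1, 1, 0],
    "parch": [0, 0, 0],
})

# Construction: combine existing columns into a more informative one
df["family_size"] = df["sibsp"] + df["parch"] + 1

# Splitting: break one string column into parts and keep the useful piece (the title)
df["title"] = df["name"].str.split(", ", expand=True)[1].str.split(".", expand=True)[0]
print(df[["family_size", "title"]])
```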

3.3 🔍 Feature Extraction

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Curse of Dimensionality | Introduction to the "curse" of high dimensions | 👨‍💻 | 🔥 |
| PCA Geometric Intuition | Geometric understanding of Principal Component Analysis | 👨‍💻 | 🔥 |
| PCA Problem Formulation & Solution | Formulating and solving PCA problems | 👨‍💻 | 🔥 |
| PCA Step by Step Implementation | Implementing PCA step by step | 👨‍💻 | 🔥 |
| PCA + KNN (MNIST Dataset) | Apply PCA and KNN on the MNIST dataset | 👨‍💻 | 🔥 |
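
A sketch of the PCA + KNN workflow; it uses scikit-learn's bundled `load_digits` (8×8 digit images) in place of the full MNIST dataset so the example stays self-contained:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# load_digits stands in for MNIST here: 64 pixel features per image
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Reduce 64 pixel features to a handful of principal components
pca = PCA(n_components=20)
X_train_p = pca.fit_transform(X_train)
X_test_p = pca.transform(X_test)   # reuse the components fitted on train
print("explained variance kept:", pca.explained_variance_ratio_.sum().round(3))

# KNN in the reduced space: far fewer dimensions, similar accuracy
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train_p, y_train)
print("accuracy:", round(accuracy_score(y_test, knn.predict(X_test_p)), 3))
```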

4. 📊 Regression

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Simple LR from Scratch | Code implementation from scratch | 👨‍💻 | 🔥 |
| Sklearn LR | Using LinearRegression() from sklearn | 👨‍💻 | 🔥 |
| Regression Metrics | Understanding the R² score, MSE, and RMSE | 👨‍💻 | 🔥 |
| Geometric Intuition | Understanding the geometric intuition of MLR | 👨‍💻 | 🔥 |
| Multiple LR from Scratch | Code implementation from scratch | 👨‍💻 | 🔥 |
| Mathematical Formulation (Sklearn LR) | Using LinearRegression() from sklearn | 👨‍💻 | 🔥 |
| Polynomial LR | Preprocessing and using PolynomialFeatures() | 👨‍💻 | 🔥 |
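
A from-scratch sketch of simple linear regression with the metrics computed by hand (closed-form OLS for one feature; the numbers are toy values, not from the notebooks):

```python
import numpy as np

class SimpleLR:
    """Ordinary least squares for one feature via the closed-form slope/intercept."""

    def fit(self, X, y):
        x = X.ravel()
        # m = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2);  b = y_bar - m * x_bar
        self.m_ = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
        self.b_ = y.mean() - self.m_ * x.mean()
        return self

    def predict(self, X):
        return self.m_ * X.ravel() + self.b_

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.1, 4.9, 7.2, 8.8])

model = SimpleLR().fit(X, y)
y_hat = model.predict(X)

# Regression metrics computed by hand: MSE, RMSE, and the R² score
mse = ((y - y_hat) ** 2).mean()
r2 = 1 - ((y - y_hat) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print("m, b :", model.m_, model.b_)
print("MSE  :", mse, " RMSE:", np.sqrt(mse), " R²:", r2)
```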

5. 🧑‍💻 Gradient Descent

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Gradient Descent Basics | Introduction to Gradient Descent | 👨‍💻 | 🔥 |
| Batch Simple GD | Implementing simple Batch GD from scratch | 👨‍💻 | 🔥 |
| Batch GD | Implementing Batch Gradient Descent from scratch | 👨‍💻 | 🔥 |
| Stochastic GD | Implementing Stochastic Gradient Descent from scratch | 👨‍💻 | 🔥 |
| Mini Batch GD | Implementing Mini-Batch Gradient Descent from scratch | 👨‍💻 | 🔥 |
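
A minimal batch gradient descent sketch for one-feature linear regression (toy data; the learning rate and epoch count are illustrative choices):

```python
import numpy as np

def batch_gd(X, y, lr=0.05, epochs=2000):
    """Fit y = m*x + b by batch gradient descent: every epoch uses the
    gradient of the MSE loss over the *whole* dataset (hence 'batch')."""
    m, b = 0.0, 0.0
    n = len(y)
    for _ in range(epochs):
        y_hat = m * X + b
        dm = (-2.0 / n) * np.sum(X * (y - y_hat))  # dMSE/dm
        db = (-2.0 / n) * np.sum(y - y_hat)        # dMSE/db
        m -= lr * dm
        b -= lr * db
    return m, b

X = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * X + 1.0              # noiseless line, so GD should recover it
print(batch_gd(X, y))          # approaches (2.0, 1.0)
```

The stochastic and mini-batch variants only change how many rows feed each gradient step (a single row, or a small random batch) before `m` and `b` are updated.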

6. 👮 Regularization

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Bias-Variance Trade-off | Understanding underfitting & overfitting | – | 🔥 |
| Ridge Regression Geometric Intuition (Part 1) | Introduction to regularized linear models | 👨‍💻 | 🔥 |
| Ridge Regression Mathematical Formulation (Part 2) | Scratch implementation for slope (m) and intercept (b) | 👨‍💻 | 🔥 |
| Ridge Regression Mathematical Formulation (Part 2) | Full scratch implementation | 👨‍💻 | 🔥 |
| Ridge Regression (Part 3) | Gradient descent implementation | 👨‍💻 | 🔥 |
| 5 Key Points about Ridge Regression | Q&A, effects, and insights | 👨‍💻 | 🔥 |
| Lasso Regression | Full implementation | 👨‍💻 | 🔥 |
| Why Lasso Regression Creates Sparsity | Understanding the sparsity effect | 👨‍💻 | 🔥 |
| ElasticNet Regression | Comparison and effects | 👨‍💻 | 🔥 |
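
A sketch contrasting the three penalties on synthetic data; the point is the sparsity pattern, not the exact numbers:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge

# Regression problem where only a few of the 20 features are informative
X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                       noise=10.0, random_state=42)

for model in (Ridge(alpha=1.0), Lasso(alpha=1.0), ElasticNet(alpha=1.0, l1_ratio=0.5)):
    model.fit(X, y)
    zeros = int(np.sum(model.coef_ == 0))
    print(f"{model.__class__.__name__:>10}: {zeros} of 20 coefficients exactly zero")

# Lasso's L1 penalty drives coefficients to exactly zero (the sparsity effect);
# Ridge's L2 penalty only shrinks them toward zero, and ElasticNet blends both.
```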

7. 📘 Logistic Regression

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| LR 1 - Perceptron Trick | Why to use it, transformations, the region concept | – | 🔥 |
| LR 2 - Perceptron Trick Code | Math-to-algorithm conversion | 👨‍💻 | 🔥 |
| LR 3 - Sigmoid Function | How the sigmoid function helps to find the error line | 👨‍💻 | 🔥 |
| LR 4 - Math Behind Optimal Line | Maximum likelihood, binary cross-entropy, gradient descent | – | 🔥 |
| Extra - Derivative of Sigmoid | Helps derive the matrix form from the loss function | – | 🔥 |
| LR 5 - Logistic Regression (Gradient Descent) | Scratch implementation | 👨‍💻 | 🔥 |
| LR 6 - Multinomial Logistic Regression | Softmax regression | 👨‍💻 | 🔥 |
| LR 7 - Non-Linear Regression | Polynomial features | 👨‍💻 | 🔥 |
| LR 8 - Hyperparameters | Sklearn documentation and hyperparameter tuning | – | 🔥 |
| P1 Classification Metrics | Accuracy, confusion matrix, Type I & II errors, binary vs. multi-class | 👨‍💻 | 🔥 |
| P2 Classification Metrics (Binary) | Precision, recall & F1 score | 👨‍💻 | 🔥 |
| P2 Classification Metrics (Multi-Class) | Precision, recall & F1 score | 👨‍💻 | 🔥 |
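
A sketch tying the estimator to the metrics rows above, on synthetic binary data (the sigmoid is what `LogisticRegression` applies to its linear score):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split

def sigmoid(z):
    """Squash the linear score into a (0, 1) probability — LR 3's key idea."""
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic binary classification data (sizes chosen arbitrarily)
X, y = make_classification(n_samples=300, n_features=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = LogisticRegression().fit(X_train, y_train)
y_pred = clf.predict(X_test)

# sigmoid(decision_function) reproduces predict_proba for the positive class
print(sigmoid(clf.decision_function(X_test[:3])))
print(clf.predict_proba(X_test[:3])[:, 1])

print("accuracy :", accuracy_score(y_test, y_pred))
print("confusion matrix (rows = actual, cols = predicted):")
print(confusion_matrix(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("f1       :", f1_score(y_test, y_pred))
```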

8. 🌴 Decision Tree

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| D1 - Decision Tree Geometric Intuition | Entropy, Gini impurity, information gain | – | 🔥 |
| D2 - Hyperparameters | Overfitting and underfitting | 👨‍💻 | 🔥 |
| D3 - Regression Trees | Numerical points | 👨‍💻 | 🔥 |
| D4 - Awesome Decision Tree | dtreeviz library | 👨‍💻 | 🔥 |
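
A sketch of a depth-limited tree on the built-in iris data; scikit-learn's `export_text` stands in here for the richer dtreeviz plots, just to keep the example dependency-free (the feature names are shortened for display):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# max_depth and min_samples_split are the main levers against overfitting:
# shallower trees underfit, unconstrained trees memorize the training data
tree = DecisionTreeClassifier(criterion="gini", max_depth=3,
                              min_samples_split=5, random_state=42)
tree.fit(X, y)

# Text rendering of the learned splits (dtreeviz draws the same structure graphically)
print(export_text(tree, feature_names=["sepal_len", "sepal_wid",
                                       "petal_len", "petal_wid"]))
```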