
A complete and in-depth machine learning resource containing detailed notes, mathematical explanations, Python code, Jupyter notebooks, and lectures.


Rudra-G-23/100-Days-of-ML



Built with: NumPy · Pandas · Plotly · scikit-learn · Streamlit · Seaborn · SciPy · mlxtend · Matplotlib · dtreeviz


📘 Table of Contents


3. Feature Engineering

3.1 🔧 Feature Transformation

3.1.1 📌 Prerequisite

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| What is Feature Engineering | – | – | 🔥 |
| Column Transformer | How to transform columns | 👨‍💻 | 🔥 |
| Sklearn without Pipeline | Why avoiding pipelines can cause problems | 👨‍💻 | 🔥 |
| Sklearn with Pipeline | How to implement sklearn pipelines effectively | 👨‍💻 | 🔥 |
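
To make the flow above concrete, here is a minimal sketch (toy data with invented column names, not the course notebooks) of a `ColumnTransformer` wrapped in a `Pipeline`:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data: one numerical and one categorical column (names are illustrative)
df = pd.DataFrame({
    "age":    [22, 38, 26, 35, 28, 40],
    "city":   ["delhi", "mumbai", "delhi", "pune", "mumbai", "pune"],
    "bought": [0, 1, 0, 1, 0, 1],
})
X, y = df[["age", "city"]], df["bought"]

# ColumnTransformer routes each column group to its own preprocessor
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])

# The Pipeline guarantees the same fitted transforms run at training and at
# prediction time; applying them inconsistently by hand is exactly the
# failure mode the "without Pipeline" notebook demonstrates.
model = Pipeline([("preprocess", preprocess), ("clf", LogisticRegression())])
model.fit(X, y)
print(model.predict(X.head(2)))
```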

3.1.2 🔧 Encoding Categorical and Numerical Data

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Ordinal Encoding | Ordinal categorical data preprocessing using OrdinalEncoder() | 👨‍💻 | 🔥 |
| One Hot Encoding | Nominal categorical data preprocessing using OneHotEncoder() | 👨‍💻 | 🔥 |
| Function Transformer | Log and reciprocal transformations using FunctionTransformer() | 👨‍💻 | 🔥 |
| Power Transformer | Square and square-root transformations using PowerTransformer() | 👨‍💻 | 🔥 |
| Binarization | Preprocessing with Binarizer() | 👨‍💻 | 🔥 |
| Binning | Preprocessing with KBinsDiscretizer() | 👨‍💻 | 🔥 |
| Handling Mixed Variables | Processing datasets with both numerical & categorical features | 👨‍💻 | 🔥 |
| Handling Date & Time | How to work with time and date columns | 👨‍💻 | 🔥 |
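
A quick sketch of the encoders and transformers listed above, run on a made-up five-row frame (column names and values are invented for illustration):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import (Binarizer, FunctionTransformer, KBinsDiscretizer,
                                   OneHotEncoder, OrdinalEncoder, PowerTransformer)

df = pd.DataFrame({
    "size":   ["S", "M", "L", "M", "S"],                       # ordinal categories
    "color":  ["red", "blue", "green", "red", "blue"],         # nominal categories
    "income": [25000.0, 40000.0, 60000.0, 120000.0, 35000.0],  # skewed numeric
})

# OrdinalEncoder: the category order is meaningful, so state it explicitly
print(OrdinalEncoder(categories=[["S", "M", "L"]]).fit_transform(df[["size"]]).ravel())

# OneHotEncoder: nominal data gets one binary column per category, no order implied
print(OneHotEncoder().fit_transform(df[["color"]]).toarray())

# FunctionTransformer: apply an arbitrary function, e.g. log1p to reduce skew
print(FunctionTransformer(np.log1p).fit_transform(df[["income"]]).values.ravel())

# PowerTransformer: learned (Yeo-Johnson) power transform toward normality
print(PowerTransformer().fit_transform(df[["income"]]).ravel())

# Binarizer / KBinsDiscretizer: threshold or bin a numeric column
print(Binarizer(threshold=50000).fit_transform(df[["income"]]).ravel())
print(KBinsDiscretizer(n_bins=3, encode="ordinal",
                       strategy="uniform").fit_transform(df[["income"]]).ravel())
```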

3.1.3 📏 Feature Scaling

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Standardization | Preprocessing using StandardScaler() | 👨‍💻 | 🔥 |
| Normalization | Preprocessing using MinMaxScaler() | 👨‍💻 | 🔥 |
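
The difference at a glance, as a minimal sketch on a tiny made-up array:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0], [4.0], [10.0], [15.0]])

# Standardization: rescale to zero mean and unit variance
print(StandardScaler().fit_transform(X).ravel())

# Normalization: rescale to the [0, 1] range
print(MinMaxScaler().fit_transform(X).ravel())
```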

3.1.4 🧩 Handling Missing Data

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Complete Case Analysis | Remove rows with NaN values | 👨‍💻 | 🔥 |
| Arbitrary Value Imputation (Numerical) | Impute with an arbitrary value using SimpleImputer() | 👨‍💻 | 🔥 |
| Mean/Median Imputation (Numerical) | Impute with the mean/median using SimpleImputer() | 👨‍💻 | 🔥 |
| Missing Category Imputation (Categorical) | Fill missing values with a label using SimpleImputer() | 👨‍💻 | 🔥 |
| Frequent Value Imputation (Categorical) | Replace missing values with the most frequent value | 👨‍💻 | 🔥 |
| Missing Indicator | Add a binary flag for missing values (MissingIndicator()) | 👨‍💻 | 🔥 |
| Auto Imputer Parameter Tuning | Use GridSearchCV() to optimize imputer settings | 👨‍💻 | 🔥 |
| Random Sample Imputation | Fill missing values with random samples | 👨‍💻 | 🔥 |
| KNN Imputer | Use K-Nearest Neighbors to fill missing values | 👨‍💻 | 🔥 |
| Iterative Imputer | MICE-style multivariate imputation | 👨‍💻 | 🔥 |
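
A compact sketch of the main imputers above on a made-up frame with gaps (values are illustrative). Note that `IterativeImputer` is still experimental in scikit-learn and needs an explicit enable import:

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer, MissingIndicator, SimpleImputer
# IterativeImputer is experimental and must be enabled before import
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

X_num = pd.DataFrame({"age":  [22.0, np.nan, 26.0, 35.0],
                      "fare": [7.25, 71.28, np.nan, 53.10]})
X_cat = pd.DataFrame({"deck": ["C", np.nan, "E", np.nan]})

# Mean/median imputation for numerical columns
print(SimpleImputer(strategy="median").fit_transform(X_num))

# "Missing" as its own category for categorical columns
print(SimpleImputer(strategy="constant", fill_value="Missing").fit_transform(X_cat))

# Binary flags marking where values were missing
print(MissingIndicator().fit_transform(X_num))

# Multivariate imputers use the other columns to predict the gaps
print(KNNImputer(n_neighbors=2).fit_transform(X_num))
print(IterativeImputer(random_state=0).fit_transform(X_num))
```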

3.1.5 🚨 Handling Outliers

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| What are Outliers | Introduction to outliers and their impact | 👨‍💻 | 🔥 |
| Outlier Removal using Z-Score | Removing outliers using the Z-score | 👨‍💻 | 🔥 |
| Outlier Removal using IQR | Removing outliers using the Interquartile Range (IQR) | 👨‍💻 | 🔥 |
| Outlier Removal using Percentiles | Removing outliers using percentiles | 👨‍💻 | 🔥 |
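
All three strategies in one minimal sketch, on synthetic data with two planted outliers:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
s = pd.Series(np.append(rng.normal(50, 10, 200), [150.0, -40.0]))  # two planted outliers

# Z-score method: drop points more than 3 standard deviations from the mean
z = (s - s.mean()) / s.std()
kept_z = s[z.abs() <= 3]

# IQR method: drop points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
q1, q3 = s.quantile(0.25), s.quantile(0.75)
iqr = q3 - q1
kept_iqr = s[(s >= q1 - 1.5 * iqr) & (s <= q3 + 1.5 * iqr)]

# Percentile method: clip (winsorize) to the 1st and 99th percentiles
clipped = s.clip(s.quantile(0.01), s.quantile(0.99))

print(len(s), len(kept_z), len(kept_iqr))  # both removal methods drop the planted points
```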

3.2 🏗️ Feature Construction

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Feature Construction and Splitting | Extract useful data and split features | 👨‍💻 | 🔥 |
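
A small pandas sketch of both ideas, using invented Titanic-style rows purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "name":  ["Braund, Mr. Owen", "Cumings, Mrs. John", "Heikkinen, Miss. Laina"],
    "sibsp": [1, 1, 0],
    "parch": [0, 0, 0],
})

# Construction: combine existing columns into a more informative one
df["family_size"] = df["sibsp"] + df["parch"] + 1

# Splitting: break one string column into parts and keep the useful piece (the title)
df["title"] = df["name"].str.split(", ", expand=True)[1].str.split(".", expand=True)[0]
print(df[["family_size", "title"]])
```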

3.3 🔍 Feature Extraction

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Curse of Dimensionality | Introduction to the "curse" of high dimensions | 👨‍💻 | 🔥 |
| PCA Geometric Intuition | Geometric understanding of Principal Component Analysis | 👨‍💻 | 🔥 |
| PCA Problem Formulation & Solution | Formulating and solving PCA problems | 👨‍💻 | 🔥 |
| PCA Step by Step Implementation | Implementing PCA step by step | 👨‍💻 | 🔥 |
| PCA + KNN (MNIST Dataset) | Apply PCA and KNN on the MNIST dataset | 👨‍💻 | 🔥 |
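
A sketch of the PCA + KNN workflow; it uses scikit-learn's bundled `load_digits` (8×8 digit images) in place of the full MNIST dataset so the example stays self-contained:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# load_digits stands in for MNIST here: 64 pixel features per image
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Reduce 64 pixel features to a handful of principal components
pca = PCA(n_components=20)
X_train_p = pca.fit_transform(X_train)
X_test_p = pca.transform(X_test)   # reuse the components fitted on train
print("explained variance kept:", pca.explained_variance_ratio_.sum().round(3))

# KNN in the reduced space: far fewer dimensions, similar accuracy
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train_p, y_train)
print("accuracy:", round(accuracy_score(y_test, knn.predict(X_test_p)), 3))
```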

4. 📊 Regression

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Simple LR from Scratch | Code implementation from scratch | 👨‍💻 | 🔥 |
| Sklearn LR | Using LinearRegression() from sklearn | 👨‍💻 | 🔥 |
| Regression Metrics | Understanding the R² score, MSE, and RMSE | 👨‍💻 | 🔥 |
| Geometric Intuition | Understanding the geometric intuition of MLR | 👨‍💻 | 🔥 |
| Multiple LR from Scratch | Code implementation from scratch | 👨‍💻 | 🔥 |
| Mathematical Formulation (Sklearn LR) | Using LinearRegression() from sklearn | 👨‍💻 | 🔥 |
| Polynomial LR | Preprocessing and using PolynomialFeatures() | 👨‍💻 | 🔥 |
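
A from-scratch sketch of simple linear regression with the metrics computed by hand (closed-form OLS for one feature; the numbers are toy values, not from the notebooks):

```python
import numpy as np

class SimpleLR:
    """Ordinary least squares for one feature via the closed-form slope/intercept."""

    def fit(self, X, y):
        x = X.ravel()
        # m = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2);  b = y_bar - m * x_bar
        self.m_ = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
        self.b_ = y.mean() - self.m_ * x.mean()
        return self

    def predict(self, X):
        return self.m_ * X.ravel() + self.b_

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.1, 4.9, 7.2, 8.8])

model = SimpleLR().fit(X, y)
y_hat = model.predict(X)

# Regression metrics computed by hand: MSE, RMSE, and the R² score
mse = ((y - y_hat) ** 2).mean()
r2 = 1 - ((y - y_hat) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print("m, b :", model.m_, model.b_)
print("MSE  :", mse, " RMSE:", np.sqrt(mse), " R²:", r2)
```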

5. 🧑‍💻 Gradient Descent

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Gradient Descent Basics | Introduction to Gradient Descent | 👨‍💻 | 🔥 |
| Batch Simple GD | Implementing simple Batch GD from scratch | 👨‍💻 | 🔥 |
| Batch GD | Implementing Batch Gradient Descent from scratch | 👨‍💻 | 🔥 |
| Stochastic GD | Implementing Stochastic Gradient Descent from scratch | 👨‍💻 | 🔥 |
| Mini Batch GD | Implementing Mini-Batch Gradient Descent from scratch | 👨‍💻 | 🔥 |
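
A minimal batch gradient descent sketch for one-feature linear regression (toy data; the learning rate and epoch count are illustrative choices):

```python
import numpy as np

def batch_gd(X, y, lr=0.05, epochs=2000):
    """Fit y = m*x + b by batch gradient descent: every epoch uses the
    gradient of the MSE loss over the *whole* dataset (hence 'batch')."""
    m, b = 0.0, 0.0
    n = len(y)
    for _ in range(epochs):
        y_hat = m * X + b
        dm = (-2.0 / n) * np.sum(X * (y - y_hat))  # dMSE/dm
        db = (-2.0 / n) * np.sum(y - y_hat)        # dMSE/db
        m -= lr * dm
        b -= lr * db
    return m, b

X = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * X + 1.0              # noiseless line, so GD should recover it
print(batch_gd(X, y))          # approaches (2.0, 1.0)
```

The stochastic and mini-batch variants only change how many rows feed each gradient step (a single row, or a small random batch) before `m` and `b` are updated.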

6. 👮 Regularization

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| Bias-Variance Trade-off | Understanding underfitting & overfitting | – | 🔥 |
| Ridge Regression Geometric Intuition (Part 1) | Introduction to regularized linear models | 👨‍💻 | 🔥 |
| Ridge Regression Mathematical Formulation (Part 2) | Scratch implementation for slope (m) and intercept (b) | 👨‍💻 | 🔥 |
| Ridge Regression Mathematical Formulation (Part 2) | Full scratch implementation | 👨‍💻 | 🔥 |
| Ridge Regression (Part 3) | Gradient descent implementation | 👨‍💻 | 🔥 |
| 5 Key Points about Ridge Regression | Q&A, effects, and insights | 👨‍💻 | 🔥 |
| Lasso Regression | Full implementation | 👨‍💻 | 🔥 |
| Why Lasso Regression Creates Sparsity | Understanding the sparsity effect | 👨‍💻 | 🔥 |
| ElasticNet Regression | Comparison and effects | 👨‍💻 | 🔥 |
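
A sketch contrasting the three penalties on synthetic data; the point is the sparsity pattern, not the exact numbers:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge

# Regression problem where only a few of the 20 features are informative
X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                       noise=10.0, random_state=42)

for model in (Ridge(alpha=1.0), Lasso(alpha=1.0), ElasticNet(alpha=1.0, l1_ratio=0.5)):
    model.fit(X, y)
    zeros = int(np.sum(model.coef_ == 0))
    print(f"{model.__class__.__name__:>10}: {zeros} of 20 coefficients exactly zero")

# Lasso's L1 penalty drives coefficients to exactly zero (the sparsity effect);
# Ridge's L2 penalty only shrinks them toward zero, and ElasticNet blends both.
```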

7. 📘 Logistic Regression

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| LR 1 - Perceptron Trick | Why to use it, transformations, the region concept | – | 🔥 |
| LR 2 - Perceptron Trick Code | Math-to-algorithm conversion | 👨‍💻 | 🔥 |
| LR 3 - Sigmoid Function | How the sigmoid function helps to find the error line | 👨‍💻 | 🔥 |
| LR 4 - Math Behind Optimal Line | Maximum likelihood, binary cross-entropy, gradient descent | – | 🔥 |
| Extra - Derivative of Sigmoid | Helps derive the matrix form from the loss function | – | 🔥 |
| LR 5 - Logistic Regression (Gradient Descent) | Scratch implementation | 👨‍💻 | 🔥 |
| LR 6 - Multinomial Logistic Regression | Softmax regression | 👨‍💻 | 🔥 |
| LR 7 - Non-Linear Regression | Polynomial features | 👨‍💻 | 🔥 |
| LR 8 - Hyperparameters | Sklearn documentation and hyperparameter tuning | – | 🔥 |
| P1 Classification Metrics | Accuracy, confusion matrix, Type I & II errors, binary vs. multi-class | 👨‍💻 | 🔥 |
| P2 Classification Metrics (Binary) | Precision, recall & F1 score | 👨‍💻 | 🔥 |
| P2 Classification Metrics (Multi-Class) | Precision, recall & F1 score | 👨‍💻 | 🔥 |
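
A sketch tying the estimator to the metrics rows above, on synthetic binary data (the sigmoid is what `LogisticRegression` applies to its linear score):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split

def sigmoid(z):
    """Squash the linear score into a (0, 1) probability — LR 3's key idea."""
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic binary classification data (sizes chosen arbitrarily)
X, y = make_classification(n_samples=300, n_features=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = LogisticRegression().fit(X_train, y_train)
y_pred = clf.predict(X_test)

# sigmoid(decision_function) reproduces predict_proba for the positive class
print(sigmoid(clf.decision_function(X_test[:3])))
print(clf.predict_proba(X_test[:3])[:, 1])

print("accuracy :", accuracy_score(y_test, y_pred))
print("confusion matrix (rows = actual, cols = predicted):")
print(confusion_matrix(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("f1       :", f1_score(y_test, y_pred))
```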

8. 🌴 Decision Tree

| Topic | What You'll Learn | Notebook | Lecture |
| --- | --- | --- | --- |
| D1 - Decision Tree Geometric Intuition | Entropy, Gini impurity, information gain | – | 🔥 |
| D2 - Hyperparameters | Overfitting and underfitting | 👨‍💻 | 🔥 |
| D3 - Regression Trees | Numerical points | 👨‍💻 | 🔥 |
| D4 - Awesome Decision Tree | dtreeviz library | 👨‍💻 | 🔥 |
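
A sketch of a depth-limited tree on the built-in iris data; scikit-learn's `export_text` stands in here for the richer dtreeviz plots, just to keep the example dependency-free (the feature names are shortened for display):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# max_depth and min_samples_split are the main levers against overfitting:
# shallower trees underfit, unconstrained trees memorize the training data
tree = DecisionTreeClassifier(criterion="gini", max_depth=3,
                              min_samples_split=5, random_state=42)
tree.fit(X, y)

# Text rendering of the learned splits (dtreeviz draws the same structure graphically)
print(export_text(tree, feature_names=["sepal_len", "sepal_wid",
                                       "petal_len", "petal_wid"]))
```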