💳This Python project detects fraudulent credit card transactions using machine learning models on a dataset of 1.85 million records. It applies data preprocessing, class imbalance handling, and model evaluation to identify fraud with high accuracy. Using libraries like scikit-learn, XGBoost, and imbalanced-learn, it offers visual and cost-benefit

Sarathnadiminti/Credit-Card-Fraud-Detection-Capstone

💳 Credit Card Fraud Detection System (Machine Learning Project)

🚀 Project Overview

This project addresses the rising challenge of credit card fraud by building a machine learning-based fraud detection system. Using a dataset of 1.85 million transactions, the system identifies fraudulent activities with high accuracy while minimizing customer inconvenience. The project includes data analysis, model building, cost-benefit analysis, and actionable insights for the banking sector.


🛠️ Key Features

Data Exploration:

  • Load and inspect transactional data of ~1.85 million records.
  • Understand data features like transaction amounts, merchant details, and fraud labels.
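
A minimal sketch of the load-and-inspect step. It uses only the standard library and a small in-memory sample so that it runs without the real file. The column names `amt`, `merchant`, and `is_fraud` are assumptions for illustration; the actual headers in transactions_dataset.csv may differ.

```python
import csv
import io

# Hypothetical sample standing in for transactions_dataset.csv.
# Column names (amt, merchant, is_fraud) are assumed, not taken from the dataset.
SAMPLE_CSV = """amt,merchant,is_fraud
12.50,grocery_store,0
980.00,online_electronics,1
43.10,gas_station,0
7.25,coffee_shop,0
"""


def summarize(csv_file):
    """Return row count, fraud count, and fraud rate for a CSV of transactions."""
    rows = list(csv.DictReader(csv_file))
    n_rows = len(rows)
    n_fraud = sum(1 for row in rows if row["is_fraud"] == "1")
    fraud_rate = n_fraud / n_rows if n_rows else 0.0
    return {"rows": n_rows, "fraud": n_fraud, "fraud_rate": fraud_rate}


summary = summarize(io.StringIO(SAMPLE_CSV))
print(summary)
```

To run it on the full dataset, pass an open file handle instead of the `io.StringIO` sample. Checking the fraud rate early is useful because it shows how imbalanced the labels are before any modeling starts.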

Data Cleaning & Preparation:

  • Handle missing values and imbalanced data.
  • Transform skewed data for better model performance.
  • Create derived features to enhance predictions.
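
The transformation and feature steps above might look like the following sketch. It assumes that the skewed field is a non-negative transaction amount and that each record carries a timestamp; the field names `amt` and `trans_time`, and the 22:00 to 06:00 "night" window, are hypothetical choices and not taken from the notebook.

```python
import math
from datetime import datetime


def log_transform(amount):
    """Compress a right-skewed, non-negative amount with log(1 + x)."""
    return math.log1p(amount)


def hour_of_day(timestamp):
    """Derive the hour (0-23) from a 'YYYY-MM-DD HH:MM:SS' timestamp string."""
    return datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").hour


def prepare(record):
    """Return a copy of the record with transformed and derived fields added."""
    # 'amt' and 'trans_time' are assumed field names.
    out = dict(record)
    out["log_amt"] = log_transform(record["amt"])
    out["hour"] = hour_of_day(record["trans_time"])
    # Assumed definition of night: 22:00 up to, but not including, 06:00.
    out["is_night"] = int(out["hour"] >= 22 or out["hour"] < 6)
    return out


example = prepare({"amt": 980.0, "trans_time": "2020-06-21 23:14:05"})
print(example)
```

`log1p` is used rather than a plain logarithm so that a zero amount maps to zero instead of raising an error.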

Fraud Detection Model:

  • Experiment with various models, including Logistic Regression, Random Forest, and XGBoost.
  • Perform hyperparameter tuning for optimal results.
  • Address data imbalance using oversampling (SMOTE) and undersampling techniques.
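
As an illustration of the undersampling idea only, here is a sketch of random undersampling in plain Python. It is not the notebook's implementation and it is not SMOTE, which synthesizes new minority samples instead of discarding majority ones. The `is_fraud` label key and the 1:1 target ratio are assumptions.

```python
import random


def random_undersample(records, label_key="is_fraud", ratio=1.0, seed=42):
    """Keep every minority (fraud) row and a random subset of majority rows.

    ratio is the desired number of majority rows per minority row.
    """
    minority = [r for r in records if r[label_key] == 1]
    majority = [r for r in records if r[label_key] == 0]
    # Never try to sample more majority rows than exist.
    n_keep = min(len(majority), int(len(minority) * ratio))
    rng = random.Random(seed)
    balanced = minority + rng.sample(majority, n_keep)
    rng.shuffle(balanced)
    return balanced


# Toy data: 5 fraud rows out of 1000.
data = [{"id": i, "is_fraud": 1 if i < 5 else 0} for i in range(1000)]
balanced = random_undersample(data)
print(len(balanced), sum(r["is_fraud"] for r in balanced))
```

In imbalanced-learn, the corresponding tools are `RandomUnderSampler` and `SMOTE`, both used through `fit_resample`. Whichever is chosen, resampling should be applied to the training split only, so that the test set keeps the real class distribution.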

Evaluation & Cost-Benefit Analysis:

  • Evaluate model performance using metrics like precision, recall, and F1-score.
  • Calculate monthly savings for the bank by comparing costs incurred before and after model deployment.
  • Include a second-layer authentication mechanism for flagged transactions.
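
The sketch below shows how the metrics and a savings estimate could be computed from predictions. The cost model is a deliberate simplification and an assumption, not the one used in the project: every missed fraud costs the average fraud amount, every flagged transaction incurs a fixed second-layer authentication cost, and every detected fraud is fully prevented. All numbers are made up.

```python
def classification_metrics(y_true, y_pred):
    """Compute confusion counts, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


def monthly_savings(tp, fp, fn, avg_fraud_amount, second_factor_cost):
    """Estimate savings under the simplified (assumed) cost model."""
    # Without a model, every fraud goes through and is lost.
    cost_before = (tp + fn) * avg_fraud_amount
    # With the model, only missed frauds are lost, and every flagged
    # transaction (true or false alarm) pays for second-layer authentication.
    cost_after = fn * avg_fraud_amount + (tp + fp) * second_factor_cost
    return cost_before - cost_after


y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
savings = monthly_savings(metrics["tp"], metrics["fp"], metrics["fn"],
                          avg_fraud_amount=500.0, second_factor_cost=1.5)
print(metrics)
print(savings)
```

Precision and recall are reported instead of plain accuracy because, with very few fraud cases, a model that flags nothing would still score a high accuracy.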

Visualization:

  • Generate insightful plots to identify trends in fraudulent transactions.
  • Highlight model performance through ROC curves and other visual metrics.
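
For readers unfamiliar with how a ROC curve is built, the following sketch computes its points and the area under it by sweeping a threshold over the model scores. It is a plain-Python illustration on toy values, not the plotting code from the notebook, and it assumes that both classes are present in `y_true`.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) pairs from sweeping the threshold over distinct scores."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(y_true, scores)
area = auc(points)
print(points)
print(area)
```

This loop rescans the data for every threshold, so it is only suitable for small examples. On the full dataset, scikit-learn's `roc_curve` and `roc_auc_score` would be the practical choice, with the resulting points passed to matplotlib for plotting.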

📊 Tech Stack

Programming Language:

  • Python 🐍

Libraries Used:

  • pandas: Data manipulation and preprocessing.
  • numpy: Numerical computations.
  • matplotlib & seaborn: Data visualization.
  • scikit-learn: Model building and evaluation.
  • imbalanced-learn: Handling class imbalance.
  • xgboost: Gradient-boosted decision tree models.

📁 Repository Contents

  • fraud_detection.ipynb: Jupyter Notebook containing the Python code, visualizations, and results.
  • transactions_dataset.csv: The dataset used for analysis.
  • README.md: Documentation for the project.
  • presentation.pdf: Business presentation showcasing insights, savings, and recommendations.

💡 Insights from the Project

  • Detected fraudulent transactions with high accuracy and minimal false positives.
  • Identified patterns in fraudulent transactions, such as time of occurrence and transaction amounts.
  • Calculated substantial monthly cost savings by implementing the fraud detection model.
  • Highlighted the importance of a second-layer authentication mechanism for flagged transactions.

🎯 Conclusion

The Credit Card Fraud Detection System provides a robust solution to mitigate unauthorized transactions, saving financial institutions millions in losses. By leveraging machine learning and a cost-effective authentication mechanism, the system ensures secure transactions while maintaining a seamless customer experience.


🤝 Contributions

Contributions are welcome! Feel free to fork this repository, open issues, or submit pull requests to improve the project.
