GitHub - KashifMoin1410/Amazon-Food-Review-Sentiment-Analysis: A machine learning project for classifying Amazon Fine Food Reviews as positive or negative using text preprocessing, feature extraction, and multiple classification algorithms. Includes EDA, model evaluation, and visualizations. Achieved 89.5% accuracy with Logistic Regression on real-world review data. Dataset: Amazon Fine Food Reviews (Kaggle)

Amazon Food Review Sentiment Analysis

Overview:

This project performs sentiment analysis on the Amazon Fine Food Reviews dataset. The primary objective is to classify customer reviews as positive or negative using several machine learning techniques. The project covers data preprocessing, exploratory data analysis (EDA), feature extraction, model building, and evaluation.

Dataset:

Source: Kaggle - Amazon Fine Food Reviews
Size: 568,454 reviews
Attributes:
- Id: Unique identifier for the review
- ProductId: Unique identifier for the product
- UserId: Unique identifier for the user
- ProfileName: Name of the user
- HelpfulnessNumerator: Number of users who found the review helpful
- HelpfulnessDenominator: Number of users who indicated whether they found the review helpful
- Score: Rating between 1 and 5
- Time: Timestamp for the review
- Summary: Brief summary of the review
- Text: Full text of the review

Objective:

Transform the multiclass rating problem into a binary classification task:

Positive Reviews: Ratings of 4 or 5
Negative Reviews: Ratings of 1 or 2

Note: Reviews with a rating of 3 are considered neutral and are excluded from the analysis.
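
The rating-to-label mapping above can be sketched with pandas as follows. The `Score` and `Text` column names come from the dataset description; the tiny in-memory frame and the `Sentiment` column name are illustrative assumptions, not the notebook's actual code.

```python
import pandas as pd

# Toy stand-in for the real data; only the Score and Text columns matter here.
reviews = pd.DataFrame({
    "Score": [5, 4, 3, 2, 1, 5],
    "Text": ["Great taste", "Pretty good", "It is okay", "Too sweet", "Stale", "Love it"],
})

# Drop neutral reviews (Score == 3), then map 4-5 to 1 (positive) and 1-2 to 0 (negative).
labeled = reviews[reviews["Score"] != 3].copy()
labeled["Sentiment"] = (labeled["Score"] > 3).astype(int)

# Class counts give a first look at any imbalance between the two labels.
class_counts = labeled["Sentiment"].value_counts()
print(class_counts)
```

On the full dataset, the same counts would show how skewed the positive and negative classes are before any modeling.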

Methodology:

Data Preprocessing
1. Text Cleaning:
  1. Removal of HTML tags
  2. Conversion to lowercase
  3. Removal of punctuation and special characters
  4. Tokenization
  5. Removal of stop words
  6. Stemming using the Snowball Stemmer
2. Handling Class Imbalance:
  1. Analyzed the distribution of positive and negative reviews
  2. Implemented techniques to address any imbalance if necessary
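
The six text cleaning steps listed above could look roughly like this. It is a minimal sketch: the function name `clean_review`, the regular expressions, the whitespace tokenizer, and the short `STOP_WORDS` set are assumptions made for illustration, and the notebook may use a fuller stop word list such as NLTK's.

```python
import re
from nltk.stem.snowball import SnowballStemmer

# Small illustrative stop word list (assumption); a real run would use a fuller list.
STOP_WORDS = {"the", "a", "an", "and", "is", "it", "this", "i", "to", "of", "was"}

stemmer = SnowballStemmer("english")

def clean_review(text):
    """Apply the six cleaning steps to one review and return stemmed tokens."""
    text = re.sub(r"<[^>]+>", " ", text)    # 1. remove HTML tags
    text = text.lower()                     # 2. convert to lowercase
    text = re.sub(r"[^a-z\s]", " ", text)   # 3. drop punctuation, digits, and other non-letters
    tokens = text.split()                   # 4. tokenize on whitespace
    tokens = [t for t in tokens if t not in STOP_WORDS]  # 5. remove stop words
    return [stemmer.stem(t) for t in tokens]             # 6. Snowball stemming

tokens = clean_review("I LOVED this tea!<br />It is the best.")
print(tokens)
```

Removing HTML tags before stripping punctuation matters here: otherwise the angle brackets would be deleted first and tag names such as `br` would survive as tokens.
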
Exploratory Data Analysis (EDA)
1. Visualized the distribution of review scores
2. Generated word clouds for positive and negative reviews
3. Analyzed the length of reviews and their correlation with sentiment
4. Examined the most frequent words in each sentiment category
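
Two of the EDA steps above, review length per sentiment and most frequent words per sentiment, can be approximated with the standard library alone. The `samples` list below is made-up toy data standing in for cleaned reviews, and the variable names are hypothetical.

```python
from collections import Counter
from statistics import mean

# Toy (tokens, label) pairs standing in for cleaned reviews; 1 = positive, 0 = negative.
samples = [
    (["love", "tea", "great", "flavor"], 1),
    (["great", "snack", "love"], 1),
    (["stale", "bad", "tast", "return", "bad"], 0),
]

avg_length = {}
top_words = {}
for label in (1, 0):
    docs = [tokens for tokens, y in samples if y == label]
    # Average review length in tokens for this sentiment class.
    avg_length[label] = mean(len(d) for d in docs)
    # Two most frequent tokens across all reviews of this class.
    top_words[label] = Counter(t for d in docs for t in d).most_common(2)

print(avg_length)
print(top_words)
```

The same per-class counts are what a word cloud visualizes, with font size in place of the raw frequency.
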
Feature Extraction
1. Bag of Words (BoW): Converted text data into numerical vectors based on word frequency
2. Term Frequency-Inverse Document Frequency (TF-IDF): Weighted the importance of words in the corpus
Model Building
1. Implemented and evaluated multiple machine learning models:
  1. Logistic Regression
  2. Support Vector Machine (SVM)
  3. Random Forest Classifier
  4. Naive Bayes Classifier
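
The four model families listed above might be trained side by side as in the sketch below. The specific estimators (`LinearSVC` for the SVM, `MultinomialNB` for Naive Bayes), the hyperparameters, the TF-IDF pipeline, and the toy `texts` and `labels` are all assumptions chosen for illustration; the source does not state which variants or settings were used.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy training data (assumption); 1 = positive, 0 = negative.
texts = [
    "love this tea great flavor", "great snack will buy again",
    "best coffee ever", "tasty and fresh",
    "stale and bad taste", "terrible flavor waste of money",
    "awful product do not buy", "bad smell very disappointed",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM (linear)": LinearSVC(),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Naive Bayes": MultinomialNB(),
}

fitted = {}
predictions = {}
for name, clf in models.items():
    # Each model gets its own vectorizer so the pipelines stay independent.
    pipe = make_pipeline(TfidfVectorizer(), clf)
    pipe.fit(texts, labels)
    fitted[name] = pipe
    predictions[name] = int(pipe.predict(["great tea, love it"])[0])

print(predictions)
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary fitted only on training text, which matters once cross-validation is added.
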
Model Evaluation
1. Metrics Used:
  1. Accuracy
  2. Precision
  3. Recall
  4. F1-Score
  5. Confusion Matrix
2. Cross-Validation:
  1. Performed k-fold cross-validation to ensure model robustness
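
The metrics and the cross-validation step above can be sketched with scikit-learn as follows. The label vectors, the toy texts, and the choice of `cv=3` are assumptions for the example; the source does not state the value of k that was used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Hand-written true and predicted labels (assumption) to show each metric.
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]

accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred)   # rows: true class, columns: predicted class
print(accuracy, precision, recall, f1)
print(cm)

# k-fold cross-validation of a TF-IDF + Logistic Regression pipeline on toy text.
texts = [
    "love this tea great flavor", "great snack will buy again",
    "best coffee ever", "tasty and fresh",
    "stale and bad taste", "terrible flavor waste of money",
    "awful product do not buy", "bad smell very disappointed",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
pipe = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
scores = cross_val_score(pipe, texts, labels, cv=3, scoring="accuracy")
print(scores.mean())
```

For a classifier, an integer `cv` makes scikit-learn use stratified folds, so each fold keeps roughly the same positive-to-negative ratio as the full data.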

Results:

Best Performing Model: Logistic Regression
Accuracy Achieved: 89.5%
Precision: 0.90
Recall: 0.88
F1-Score: 0.89

These metrics indicate that the Logistic Regression model performed well in classifying the sentiment of Amazon food reviews, achieving a balanced trade-off between precision and recall.
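
As a quick consistency check, the reported F1-score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The snippet uses only the rounded figures listed above, so it confirms agreement to two decimals rather than reproducing the notebook's exact value.

```python
# Reported values from the Results section.
precision, recall = 0.90, 0.88

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # 0.89
```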

Dependencies:

Python 3
pandas
numpy
matplotlib
seaborn
scikit-learn
nltk

Future Work:

Implement deep learning models like LSTM and BERT for improved accuracy
Deploy the model using Flask or Streamlit for real-time sentiment analysis
Integrate the model into a web application for user-friendly interaction
