A machine learning project for satellite image classification using the EuroSAT dataset. Implements classical ML approaches with handcrafted features (HOG, LBP, edge detection) to classify 10 land-use types from Sentinel-2 imagery, demonstrating competitive performance without deep learning.

SaurabhJalendra/EuroSAT-LandUse-Classifier

EuroSAT Land Use Classifier

A comprehensive machine learning project for satellite image classification using the EuroSAT dataset. This implementation focuses on classical machine learning approaches with handcrafted features, demonstrating an alternative to deep learning methods for remote sensing applications.

Project Overview

This project implements a land-use classification system using traditional machine learning techniques with carefully engineered features extracted from satellite imagery. The approach combines multiple feature extraction methods to create a robust classifier capable of distinguishing between different land-use types.

Key Features

  • Feature Engineering: Combines HOG, LBP, and edge-based descriptors
  • Multiple Classifiers: Implementation and comparison of various ML algorithms
  • Comprehensive Evaluation: Detailed performance analysis with confusion matrices and classification reports
  • Visualization: Complete workflow with image samples and results visualization

Dataset

The project uses the EuroSAT dataset, which contains Sentinel-2 satellite images categorized into 10 distinct land-use classes:

  1. AnnualCrop - Agricultural areas with annual crops
  2. Forest - Forested areas
  3. HerbaceousVegetation - Areas covered with herbaceous vegetation
  4. Highway - Highway infrastructure
  5. Industrial - Industrial facilities and areas
  6. Pasture - Pastoral/grazing areas
  7. PermanentCrop - Areas with permanent crops
  8. Residential - Residential areas and buildings
  9. River - River systems
  10. SeaLake - Sea and lake water bodies

Each image is 64×64 pixels in RGB format, providing a standardized input for feature extraction and classification.

Note: The project includes both the full dataset (EuroSAT_RGB/) with all 10 classes and a subset (EuroSAT_subset/) containing 5 classes for testing and development purposes.
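
A minimal loading sketch, assuming the one-folder-per-class layout shown under Project Structure. The helper name `load_dataset` is hypothetical and the choice of `skimage.io.imread` is an assumption; the notebook may load images differently.

```python
from pathlib import Path

import numpy as np
from skimage.io import imread


def load_dataset(root):
    """Load every image under root/<class_name>/ and scale pixels to [0, 1].

    Returns (images, labels, class_names). Assumes all images share one shape
    (64x64x3 for EuroSAT RGB) and are stored as 8-bit files.
    """
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    images, labels = [], []
    for label, name in enumerate(class_names):
        for path in sorted((root / name).iterdir()):
            if path.is_file():
                images.append(imread(str(path)).astype(np.float32) / 255.0)
                labels.append(label)
    return np.stack(images), np.array(labels), class_names


# Only runs when the dataset folder is present next to this script.
if Path("EuroSAT_subset").is_dir():
    X_img, y, names = load_dataset("EuroSAT_subset")
    print(X_img.shape, names)
```

Pointing the same call at `EuroSAT_RGB/` would load all 10 classes; the 5-class subset is quicker for experiments.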

Technical Implementation

Feature Extraction Pipeline

The feature extraction pipeline combines three complementary descriptors:

1. HOG (Histogram of Oriented Gradients)

  • Purpose: Captures texture and shape information
  • Configuration:
    • Orientations: 9
    • Pixels per cell: (8, 8)
    • Cells per block: (2, 2)
  • Output: Block-normalized histograms of gradient orientations, weighted by gradient magnitude
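
The HOG configuration above could be applied with scikit-image roughly as follows. This is a sketch, not the notebook's code: the helper name `extract_hog`, the grayscale conversion, and the `L2-Hys` block normalization are assumptions.

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import hog


def extract_hog(image_rgb):
    """Return the HOG descriptor of an RGB image as a 1-D array."""
    gray = rgb2gray(image_rgb)  # assumption: HOG is computed on grayscale
    return hog(
        gray,
        orientations=9,
        pixels_per_cell=(8, 8),
        cells_per_block=(2, 2),
        block_norm="L2-Hys",  # assumption: scikit-image's default normalization
        feature_vector=True,
    )


# Random stand-in for a 64x64 EuroSAT tile (not real data).
rng = np.random.default_rng(0)
demo_image = rng.random((64, 64, 3))
hog_features = extract_hog(demo_image)
print(hog_features.shape)  # (1764,)
```

A 64×64 image has 8 × 8 cells, which gives 7 × 7 overlapping 2 × 2 blocks; each block holds 2 × 2 × 9 = 36 values, hence 1,764 features.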

2. LBP (Local Binary Patterns)

  • Purpose: Texture characterization and local pattern recognition
  • Configuration:
    • Radius: 3
    • Number of points: 24
    • Method: 'uniform'
  • Output: Histogram of uniform local binary pattern codes (P + 2 = 26 bins for 24 points)
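
A sketch of the LBP histogram with the settings above, using scikit-image. The helper name `extract_lbp_histogram` and the normalization of the histogram to sum to 1 are assumptions made for illustration.

```python
import numpy as np
from skimage.feature import local_binary_pattern

LBP_RADIUS = 3
LBP_POINTS = 24


def extract_lbp_histogram(gray_uint8):
    """Return a normalized histogram of uniform LBP codes for a grayscale image."""
    codes = local_binary_pattern(
        gray_uint8, P=LBP_POINTS, R=LBP_RADIUS, method="uniform"
    )
    # The 'uniform' method produces codes 0..P+1, so P + 2 bins cover them all.
    n_bins = LBP_POINTS + 2
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins))
    return hist / hist.sum()  # assumption: histogram normalized to sum to 1


# Random stand-in for a grayscale 64x64 tile (not real data).
rng = np.random.default_rng(0)
demo_gray = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
lbp_features = extract_lbp_histogram(demo_gray)
print(lbp_features.shape)  # (26,)
```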

3. Edge-Based Features

  • Method: Sobel edge detection
  • Components: Both horizontal and vertical edge responses
  • Processing: Mean edge magnitude calculation
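
One way to turn the Sobel responses into a few summary features is sketched below. The helper name `extract_edge_features` and the exact three statistics are assumptions; the notebook may aggregate the edge maps differently.

```python
import numpy as np
from skimage.filters import sobel_h, sobel_v


def extract_edge_features(gray):
    """Return [mean |horizontal response|, mean |vertical response|, mean magnitude]."""
    horizontal = sobel_h(gray)  # responds to horizontal edges
    vertical = sobel_v(gray)    # responds to vertical edges
    magnitude = np.hypot(horizontal, vertical)
    return np.array([
        np.abs(horizontal).mean(),
        np.abs(vertical).mean(),
        magnitude.mean(),
    ])


# Synthetic image with a single vertical edge down the middle.
step_image = np.zeros((64, 64))
step_image[:, 32:] = 1.0
edge_features = extract_edge_features(step_image)
print(edge_features)
```

On the synthetic step image only the vertical filter responds, so the first feature is zero and the mean magnitude equals the mean vertical response.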

Machine Learning Pipeline

Data Preprocessing

  • Loading: Efficient batch processing of image directories
  • Normalization: Pixel value scaling to [0, 1] range
  • Feature Concatenation: Integration of all feature vectors
  • Standardization: Z-score normalization using StandardScaler
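
The concatenation and standardization steps can be sketched as below. The feature blocks are random placeholders with assumed sizes (1,764 HOG values, 26 LBP bins, 3 edge statistics), and the 80/20 stratified split is an assumption. The scaler is fitted on the training split only so that test statistics do not leak into training.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Placeholder feature blocks standing in for real descriptors (not real data).
rng = np.random.default_rng(0)
n_images = 200
hog_block = rng.random((n_images, 1764))
lbp_block = rng.random((n_images, 26))
edge_block = rng.random((n_images, 3))
labels = np.arange(n_images) % 10  # 10 balanced dummy classes

# Feature concatenation: one row per image, all descriptors side by side.
features = np.hstack([hog_block, lbp_block, edge_block])

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, stratify=labels, random_state=42
)

# Z-score standardization, fitted on the training split only.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)
print(features.shape, X_train_std.shape, X_test_std.shape)
```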

Classification Algorithms

The project implements and compares multiple algorithms:

  • Random Forest: Ensemble method with multiple decision trees
  • Support Vector Machine (SVM): With RBF kernel for non-linear classification
  • Logistic Regression: Linear probabilistic classifier
  • Gradient Boosting: Sequential ensemble learning approach
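
A comparison loop over the four algorithms might look like the sketch below. It runs on synthetic data from `make_classification`, so the printed accuracies say nothing about EuroSAT; the hyperparameters shown are illustrative assumptions rather than the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the extracted feature matrix (not EuroSAT data).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# SVM and logistic regression are scale-sensitive, so they get a scaler.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM (RBF)": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```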

Model Selection & Evaluation

  • Cross-Validation: K-fold validation for robust performance estimation
  • Hyperparameter Tuning: Grid search optimization for best parameters
  • Performance Metrics: Accuracy, precision, recall, and F1-score analysis
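
Grid search with K-fold cross-validation can be combined in a single scikit-learn object, as in this sketch. The parameter grid, the 5-fold stratified split, and the macro-F1 scoring are assumptions chosen for illustration, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the extracted feature matrix (not EuroSAT data).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=8, n_classes=3, random_state=0
)

# Scaling lives inside the pipeline so it is re-fitted on each training fold.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", RandomForestClassifier(random_state=0)),
])

# Hypothetical grid: 2 x 2 = 4 candidate settings.
param_grid = {
    "clf__n_estimators": [50, 100],
    "clf__max_depth": [None, 10],
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="f1_macro")
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```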

Results & Performance

Classification Performance

The project achieves competitive results using classical machine learning approaches:

  • Best Performing Model: Random Forest Classifier
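  • Feature Vector Size: HOG, LBP, and edge descriptors are concatenated into one vector per image; with the HOG settings above and a one-cell block stride, a 64×64 image yields 7 × 7 blocks × 36 values = 1,764 HOG features alone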
  • Evaluation Methods:
    • Confusion matrices for detailed class-wise analysis
    • Classification reports with precision/recall metrics
    • Cross-validation scores for model stability assessment
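
The evaluation outputs listed above come from standard scikit-learn helpers. The labels in this sketch are a hand-made toy example with three of the class names, not results from the project.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

class_names = ["Forest", "Highway", "River"]

# Toy labels invented for illustration only (not project results).
y_true = ["Forest", "Forest", "Forest", "Highway", "Highway", "River", "River", "River"]
y_pred = ["Forest", "Forest", "Highway", "Highway", "River", "River", "River", "Highway"]

# Rows are true classes, columns are predicted classes, in class_names order.
cm = confusion_matrix(y_true, y_pred, labels=class_names)
accuracy = accuracy_score(y_true, y_pred)
report = classification_report(y_true, y_pred, labels=class_names, zero_division=0)

print(cm)
print(f"accuracy = {accuracy:.3f}")
print(report)
```

Off-diagonal entries show which classes are confused with each other; here one Forest sample is predicted as Highway, for example.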

Visualization Components

  • Sample Images: Representative examples from each land-use class
  • Confusion Matrix: Heatmap visualization of classification results
  • Model Predictions: Side-by-side comparison of true vs predicted labels
  • Feature Importance: Analysis of most discriminative features
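
For the feature-importance analysis, one option is to sum a Random Forest's per-feature importances over each descriptor block, which shows how much each descriptor family contributes. The block sizes and data below are hypothetical placeholders; this is a sketch of the idea, not the notebook's analysis.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical block sizes, kept small for the demo (not the real feature sizes).
block_sizes = {"HOG": 30, "LBP": 8, "Edge": 2}
n_features = sum(block_sizes.values())

# Synthetic stand-in for the concatenated feature matrix (not EuroSAT data).
X, y = make_classification(
    n_samples=300, n_features=n_features, n_informative=10, n_classes=3,
    random_state=0,
)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Sum impurity-based importances over each contiguous descriptor block.
importances = forest.feature_importances_
group_importance = {}
start = 0
for name, size in block_sizes.items():
    group_importance[name] = float(importances[start:start + size].sum())
    start += size

for name, value in group_importance.items():
    print(f"{name}: {value:.3f}")
```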

Getting Started

Prerequisites

python >= 3.7
scikit-learn
scikit-image
matplotlib
numpy
pandas
seaborn
opencv-python
jupyter

Installation

  1. Clone the repository

  2. Create a virtual environment (recommended):

    python -m venv venv
    # On Windows:
    venv\Scripts\activate
    # On macOS/Linux:
    source venv/bin/activate

  3. Install required dependencies:

    pip install -r requirements.txt

    Or install manually:

    pip install scikit-learn scikit-image matplotlib numpy pandas seaborn opencv-python jupyter

  4. Download the EuroSAT RGB dataset and extract it into the project directory as EuroSAT_RGB/

Running the Project

  1. Open the Jupyter notebook: EuroSAT.ipynb
  2. Execute cells sequentially to:
    • Load and explore the dataset
    • Extract features from images
    • Train multiple classifiers
    • Evaluate and compare model performance
    • Visualize results and predictions

Project Structure

EuroSAT-LandUse-Classifier/
├── EuroSAT_RGB/           # Full dataset directory (10 classes)
│   ├── AnnualCrop/        # Annual crop images
│   ├── Forest/            # Forest images
│   ├── HerbaceousVegetation/  # Herbaceous vegetation images
│   ├── Highway/           # Highway images
│   ├── Industrial/        # Industrial area images
│   ├── Pasture/           # Pasture images
│   ├── PermanentCrop/     # Permanent crop images
│   ├── Residential/       # Residential area images
│   ├── River/             # River images
│   └── SeaLake/           # Sea and lake images
├── EuroSAT_subset/        # Subset dataset (5 classes for testing)
│   ├── Forest/            # Forest images subset
│   ├── Highway/           # Highway images subset
│   ├── Industrial/        # Industrial area images subset
│   ├── Residential/       # Residential area images subset
│   └── River/             # River images subset
├── EuroSAT.ipynb          # Main implementation notebook
├── eurosat_paper.pdf      # Reference paper
├── README.md              # Project documentation
├── requirements.txt       # Python dependencies
├── .gitignore             # Git ignore file
└── venv/                  # Virtual environment (optional)

Methodology Highlights

Novel Aspects

  • Multi-Descriptor Feature Fusion: Combines complementary HOG, LBP, and edge descriptors for better class discrimination
  • Classical ML Focus: Demonstrates competitive performance without deep learning
  • Comprehensive Evaluation: Thorough analysis across multiple metrics and visualizations
  • Reproducible Workflow: Complete pipeline from raw images to final predictions

Technical Innovations

  • Feature Engineering Strategy: Carefully designed combination of texture, shape, and edge features
  • Scalable Architecture: Efficient processing pipeline suitable for larger datasets
  • Model Comparison Framework: Systematic evaluation of multiple algorithms

Applications

This classifier can be applied to:

  • Urban Planning: Automated land-use mapping and monitoring
  • Agricultural Monitoring: Mapping annual cropland, permanent cropland, and pasture
  • Environmental Assessment: Habitat mapping and conservation planning
  • Infrastructure Management: Highway and urban development tracking

Future Enhancements

Potential improvements and extensions:

  • Additional Feature Types: Integration of spectral indices and morphological features
  • Ensemble Methods: Advanced combination of multiple feature extractors
  • Temporal Analysis: Multi-temporal image sequence classification
  • Deep Learning Comparison: Benchmarking against CNN-based approaches

References

Based on the EuroSAT dataset and methodology described in the accompanying research paper (eurosat_paper.pdf), which provides the scientific foundation for satellite-based land cover classification.


Note: This implementation serves as an educational example and research baseline for satellite image classification using traditional machine learning approaches, demonstrating that carefully engineered features can achieve competitive results in remote sensing applications.
