A comprehensive machine learning project for satellite image classification using the EuroSAT dataset. This implementation focuses on classical machine learning approaches with handcrafted features, demonstrating an alternative to deep learning methods for remote sensing applications.
This project implements a land-use classification system using traditional machine learning techniques with carefully engineered features extracted from satellite imagery. The approach combines multiple feature extraction methods to create a robust classifier capable of distinguishing between different land-use types.
- Feature Engineering: Combines HOG, LBP, and edge-based descriptors
- Multiple Classifiers: Implementation and comparison of various ML algorithms
- Comprehensive Evaluation: Detailed performance analysis with confusion matrices and classification reports
- Visualization: Complete workflow with image samples and results visualization
The project uses the EuroSAT dataset, which contains Sentinel-2 satellite images categorized into 10 distinct land-use classes:
- AnnualCrop - Agricultural areas with annual crops
- Forest - Forested areas
- HerbaceousVegetation - Areas covered with herbaceous vegetation
- Highway - Highway infrastructure
- Industrial - Industrial facilities and areas
- Pasture - Pastoral/grazing areas
- PermanentCrop - Areas with permanent crops
- Residential - Residential areas and buildings
- River - River systems
- SeaLake - Sea and lake water bodies
Each image is 64×64 pixels in RGB format, providing a standardized input for feature extraction and classification.
Note: The project includes both the full dataset (EuroSAT_RGB/) with all 10 classes and a subset (EuroSAT_subset/) containing 5 classes for testing and development purposes.
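As a rough illustration of how a class-per-folder layout such as EuroSAT_RGB/ or EuroSAT_subset/ can be indexed, the sketch below walks the directory and assigns an integer label to each class folder. The function name `index_dataset` and the accepted file extensions are assumptions made for this example, not part of the notebook.

```python
from pathlib import Path


def index_dataset(root, extensions=(".jpg", ".jpeg", ".png", ".tif")):
    """Return (paths, labels, class_names) for a class-per-folder dataset.

    Hypothetical helper: each sub-directory of `root` is treated as one class,
    and classes are numbered in alphabetical order.
    """
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    paths, labels = [], []
    for label, name in enumerate(class_names):
        for path in sorted((root / name).iterdir()):
            # Skip anything that does not look like an image file.
            if path.suffix.lower() in extensions:
                paths.append(path)
                labels.append(label)
    return paths, labels, class_names


# Example call (assumes the subset folder exists in the working directory):
# paths, labels, class_names = index_dataset("EuroSAT_subset")
```

The returned paths can then be read with any image reader, for example `skimage.io.imread`, before feature extraction.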
The system employs a sophisticated feature extraction strategy combining three complementary approaches:
Histogram of Oriented Gradients (HOG):
- Purpose: Captures texture and shape information
- Configuration:
  - Orientations: 9
  - Pixels per cell: (8, 8)
  - Cells per block: (2, 2)
- Output: Block-normalized histograms of gradient orientations
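The HOG configuration above maps directly onto `skimage.feature.hog`. The following is a minimal sketch, assuming the descriptor is computed on a single-channel 64×64 image; the grayscale input, the `L2-Hys` block normalization, and the helper name `extract_hog` are assumptions, since the text does not state them.

```python
import numpy as np
from skimage.feature import hog


def extract_hog(gray):
    """HOG descriptor with 9 orientations, 8x8-pixel cells, 2x2-cell blocks."""
    return hog(
        gray,
        orientations=9,
        pixels_per_cell=(8, 8),
        cells_per_block=(2, 2),
        block_norm="L2-Hys",  # assumed normalization scheme
        feature_vector=True,
    )


# Random stand-in for a 64x64 grayscale satellite patch.
rng = np.random.default_rng(0)
gray = rng.random((64, 64))

features = extract_hog(gray)
# 8x8 cells per image -> 7x7 overlapping blocks, each 2*2*9 = 36 values.
print(features.shape)  # (1764,)
```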
Local Binary Patterns (LBP):
- Purpose: Texture characterization and local pattern recognition
- Configuration:
  - Radius: 3
  - Number of points: 24
  - Method: 'uniform'
- Output: Histogram of uniform local binary patterns
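A possible way to turn the LBP settings above into a fixed-length vector is sketched below with `skimage.feature.local_binary_pattern`. With the 'uniform' method and 24 points, pattern codes range from 0 to 25, so the histogram has 26 bins. The helper name `extract_lbp_hist` and the choice to normalize the histogram are assumptions for illustration.

```python
import numpy as np
from skimage.feature import local_binary_pattern


def extract_lbp_hist(gray_uint8, radius=3, n_points=24):
    """Normalized histogram of uniform LBP codes (n_points + 2 bins)."""
    lbp = local_binary_pattern(gray_uint8, P=n_points, R=radius, method="uniform")
    n_bins = n_points + 2  # uniform patterns 0..n_points, plus one "non-uniform" bin
    hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
    return hist


# Random stand-in for a 64x64 grayscale patch with integer intensities.
rng = np.random.default_rng(0)
gray_uint8 = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

hist = extract_lbp_hist(gray_uint8)
print(hist.shape)  # (26,)
```

Normalizing the histogram keeps the descriptor independent of image size, which matters little for fixed 64×64 inputs but makes the function reusable.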
Edge Features:
- Method: Sobel edge detection
- Components: Both horizontal and vertical edge responses
- Processing: Mean edge magnitude calculation
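The edge feature described above can be sketched as a single scalar per image. This version uses `skimage.filters.sobel_h` and `sobel_v`; the notebook may use OpenCV instead, and the helper name `mean_edge_magnitude` is hypothetical.

```python
import numpy as np
from skimage.filters import sobel_h, sobel_v


def mean_edge_magnitude(gray):
    """Mean Sobel gradient magnitude over the whole image."""
    gx = sobel_v(gray)  # responds to vertical edges
    gy = sobel_h(gray)  # responds to horizontal edges
    return float(np.mean(np.hypot(gx, gy)))


# A flat image has no edges; an image with one vertical step has some.
flat = np.full((64, 64), 0.5)
step = np.zeros((64, 64))
step[:, 32:] = 1.0

print(mean_edge_magnitude(flat))
print(mean_edge_magnitude(step))
```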
Preprocessing steps:
- Loading: Efficient batch processing of image directories
- Normalization: Pixel value scaling to [0, 1] range
- Feature Concatenation: Integration of all feature vectors
- Standardization: Z-score normalization using StandardScaler
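The normalization, concatenation, and standardization steps can be sketched as follows. The per-channel mean and standard deviation blocks are simple stand-ins for the real HOG, LBP, and edge vectors, and the random images and labels are placeholders; only the order of operations is the point. Fitting `StandardScaler` on the training split alone is a design choice assumed here to avoid leaking test statistics.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def scale_pixels(images_uint8):
    """Map uint8 pixel values to floats in [0, 1]."""
    return images_uint8.astype(np.float64) / 255.0


def concatenate_features(*blocks):
    """Join per-image feature blocks of shape (n, d_i) into one (n, sum d_i) matrix."""
    return np.hstack(blocks)


rng = np.random.default_rng(0)
n = 40
images = rng.integers(0, 256, size=(n, 64, 64, 3), dtype=np.uint8)  # placeholder data
scaled = scale_pixels(images)

# Stand-in feature blocks (NOT the real descriptors).
mean_block = scaled.mean(axis=(1, 2))  # shape (n, 3)
std_block = scaled.std(axis=(1, 2))    # shape (n, 3)
X = concatenate_features(mean_block, std_block)
y = np.arange(n) % 2                   # hypothetical labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Z-score standardization: statistics come from the training split only.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)
```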
The project implements and compares multiple algorithms:
- Random Forest: Ensemble method with multiple decision trees
- Support Vector Machine (SVM): With RBF kernel for non-linear classification
- Logistic Regression: Linear probabilistic classifier
- Gradient Boosting: Sequential ensemble learning approach
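A comparison of the four classifiers listed above might look like the sketch below. It runs on synthetic data from `make_classification` rather than EuroSAT features, and all hyperparameters shown are assumptions, so the printed accuracies say nothing about the project's actual results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the standardized feature matrix and labels.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters are illustrative defaults, not the notebook's settings.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM (RBF)": SVC(kernel="rbf"),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model and record its held-out accuracy.
scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}

for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```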
Training and evaluation strategy:
- Cross-Validation: K-fold validation for robust performance estimation
- Hyperparameter Tuning: Grid search optimization for best parameters
- Performance Metrics: Accuracy, precision, recall, and F1-score analysis
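One way to combine k-fold cross-validation, grid search, and per-class metrics is sketched below using `GridSearchCV` over a scaler-plus-SVM pipeline. The parameter grid, the 5-fold split, and the synthetic data are assumptions chosen to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the extracted features.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Putting the scaler inside the pipeline re-fits it on each training fold.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Illustrative grid; make_pipeline names the SVC step "svc".
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01]}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print(f"Mean CV accuracy: {search.best_score_:.3f}")

# Precision, recall, and F1-score per class on the held-out split.
report = classification_report(y_test, search.predict(X_test))
print(report)
```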
The project achieves competitive results using classical machine learning approaches:
- Best Performing Model: Random Forest Classifier
- Feature Vector Size: Combined features create comprehensive image representations
- Evaluation Methods:
  - Confusion matrices for detailed class-wise analysis
  - Classification reports with precision/recall metrics
  - Cross-validation scores for model stability assessment
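To make the class-wise analysis concrete, the sketch below builds a confusion matrix from a handful of hand-made labels and derives per-class recall from it. The three class names and the label arrays are invented for the example and are not results from this project.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

class_names = ["Forest", "Highway", "River"]  # illustrative subset of classes
y_true = np.array([0, 0, 0, 1, 1, 2, 2, 2])   # hand-made ground truth
y_pred = np.array([0, 0, 1, 1, 2, 2, 2, 2])   # hand-made predictions

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
print(cm)
# [[2 1 0]
#  [0 1 1]
#  [0 0 3]]

# Recall per class: correct predictions divided by the row total.
per_class_recall = cm.diagonal() / cm.sum(axis=1)
for name, recall in zip(class_names, per_class_recall):
    print(f"{name}: {recall:.2f}")
```

The same matrix can be passed to a plotting routine such as `seaborn.heatmap` to obtain a heatmap view.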
Visualizations include:
- Sample Images: Representative examples from each land-use class
- Confusion Matrix: Heatmap visualization of classification results
- Model Predictions: Side-by-side comparison of true vs predicted labels
- Feature Importance: Analysis of most discriminative features
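Because the feature vector is a concatenation of descriptor blocks, one possible form of feature-importance analysis is to sum a Random Forest's `feature_importances_` over each block. The block widths, the synthetic data, and the decision to place the signal in the "LBP" block are all assumptions made to keep the sketch small and verifiable.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical layout of the concatenated vector (real blocks are much wider).
groups = {"HOG": slice(0, 8), "LBP": slice(8, 12), "Edge": slice(12, 13)}

rng = np.random.default_rng(0)
X = rng.random((200, 13))
# Synthetic labels that depend only on one column inside the "LBP" block.
y = (X[:, 9] > 0.5).astype(int)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Sum impurity-based importances within each descriptor block.
group_importance = {
    name: float(forest.feature_importances_[block].sum())
    for name, block in groups.items()
}

for name, value in sorted(group_importance.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.3f}")
```

Impurity-based importances tend to favor features with many distinct values, so permutation importance is a common cross-check.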
python >= 3.7
scikit-learn
scikit-image
matplotlib
numpy
pandas
seaborn
opencv-python
- Clone the repository.
- Create and activate a virtual environment (recommended):
    python -m venv venv
    # On Windows:
    venv\Scripts\activate
    # On macOS/Linux:
    source venv/bin/activate
- Install the required dependencies:
    pip install -r requirements.txt
  Or install manually:
    pip install scikit-learn scikit-image matplotlib numpy pandas seaborn opencv-python jupyter
- Download the EuroSAT dataset to the project directory.
- Open the Jupyter notebook: EuroSAT.ipynb
- Execute cells sequentially to:
  - Load and explore the dataset
  - Extract features from images
  - Train multiple classifiers
  - Evaluate and compare model performance
  - Visualize results and predictions
EuroSAT-LandUse-Classifier/
├── EuroSAT_RGB/ # Full dataset directory (10 classes)
│ ├── AnnualCrop/ # Annual crop images
│ ├── Forest/ # Forest images
│ ├── HerbaceousVegetation/ # Herbaceous vegetation images
│ ├── Highway/ # Highway images
│ ├── Industrial/ # Industrial area images
│ ├── Pasture/ # Pasture images
│ ├── PermanentCrop/ # Permanent crop images
│ ├── Residential/ # Residential area images
│ ├── River/ # River images
│ └── SeaLake/ # Sea and lake images
├── EuroSAT_subset/ # Subset dataset (5 classes for testing)
│ ├── Forest/ # Forest images subset
│ ├── Highway/ # Highway images subset
│ ├── Industrial/ # Industrial area images subset
│ ├── Residential/ # Residential area images subset
│ └── River/ # River images subset
├── EuroSAT.ipynb # Main implementation notebook
├── eurosat_paper.pdf # Reference paper
├── README.md # Project documentation
├── requirements.txt # Python dependencies
├── .gitignore # Git ignore file
└── venv/ # Virtual environment (optional)
- Multi-Modal Feature Fusion: Combines complementary feature types for enhanced discrimination
- Classical ML Focus: Demonstrates competitive performance without deep learning
- Comprehensive Evaluation: Thorough analysis across multiple metrics and visualizations
- Reproducible Workflow: Complete pipeline from raw images to final predictions
- Feature Engineering Strategy: Carefully designed combination of texture, shape, and edge features
- Scalable Architecture: Efficient processing pipeline suitable for larger datasets
- Model Comparison Framework: Systematic evaluation of multiple algorithms
This classifier can be applied to:
- Urban Planning: Automated land-use mapping and monitoring
- Agricultural Monitoring: Crop type identification and yield estimation
- Environmental Assessment: Habitat mapping and conservation planning
- Infrastructure Management: Highway and urban development tracking
Potential improvements and extensions:
- Additional Feature Types: Integration of spectral indices and morphological features
- Ensemble Methods: Advanced combination of multiple feature extractors
- Temporal Analysis: Multi-temporal image sequence classification
- Deep Learning Comparison: Benchmarking against CNN-based approaches
Based on the EuroSAT dataset and methodology described in the accompanying research paper (eurosat_paper.pdf), which provides the scientific foundation for satellite-based land cover classification.
Note: This implementation serves as an educational example and research baseline for satellite image classification using traditional machine learning approaches, demonstrating that carefully engineered features can achieve competitive results in remote sensing applications.