[CVPR 2024] Domain generalization by interpolating original feature styles with styles obtained using random descriptions in natural language

astra-vision/FAMix

A Simple Recipe for Language-guided Domain Generalized Segmentation

Mohammad Fahes1, Tuan-Hung Vu1,2, Andrei Bursuc1,2, Patrick Pérez3, Raoul de Charette1
1 Inria, 2 valeo.ai, 3 Kyutai

Project page: https://astra-vision.github.io/FAMix/
Paper: https://arxiv.org/abs/2311.17922

TL;DR: FAMix (for Freeze, Augment, and Mix) is a simple method for domain-generalized semantic segmentation, based on minimal fine-tuning, language-driven patch-wise style augmentation, and patch-wise mixing of original and augmented styles.
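The patch-wise style mixing described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: it assumes a "style" is the channel-wise mean and standard deviation of a low-level feature patch (as in AdaIN), that augmented styles have already been mined, and that mixing interpolates the two sets of statistics with a random per-patch coefficient.

```python
import numpy as np

def style_stats(feat):
    """Channel-wise mean and std over the spatial dims of a (C, H, W) patch."""
    mu = feat.mean(axis=(1, 2), keepdims=True)
    sigma = feat.std(axis=(1, 2), keepdims=True) + 1e-6
    return mu, sigma

def mix_styles(feat, aug_mu, aug_sigma, rng):
    """Re-normalize a patch with a random interpolation of the original and
    augmented (e.g., language-mined) style statistics."""
    mu, sigma = style_stats(feat)
    lam = rng.uniform()                        # per-patch mixing coefficient
    mixed_mu = lam * mu + (1 - lam) * aug_mu
    mixed_sigma = lam * sigma + (1 - lam) * aug_sigma
    return mixed_sigma * (feat - mu) / sigma + mixed_mu

def patchwise_mix(feat, aug_styles, grid=2, rng=None):
    """Split a (C, H, W) feature map into a grid of patches and mix each
    patch's style with one of the mined styles."""
    if rng is None:
        rng = np.random.default_rng(0)
    C, H, W = feat.shape
    out = feat.copy()
    ph, pw = H // grid, W // grid
    for i in range(grid):
        for j in range(grid):
            sl = (slice(None), slice(i * ph, (i + 1) * ph),
                  slice(j * pw, (j + 1) * pw))
            aug_mu, aug_sigma = aug_styles[(i * grid + j) % len(aug_styles)]
            out[sl] = mix_styles(feat[sl], aug_mu, aug_sigma, rng)
    return out
```

The grid size, style-selection rule, and mixing distribution here are placeholders; see the paper for the actual design.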

Citation

@InProceedings{fahes2024simple,
  title={A Simple Recipe for Language-guided Domain Generalized Segmentation},
  author={Fahes, Mohammad and Vu, Tuan-Hung and Bursuc, Andrei and P{\'e}rez, Patrick and de Charette, Raoul},
  booktitle={CVPR},
  year={2024}
}

Demo

Test on unseen YouTube videos from different cities
Training dataset: GTA5
Backbone: ResNet-50
Segmenter: DeepLabv3+

Watch the full video on YouTube

⚠️⚠️Note 1: For test datasets with a higher resolution than the training one, downscaling the images by a factor of 2 (i.e., scale=0.5) and then upsampling the predictions back to the original resolution speeds up inference and can improve results. Thanks to tpy001 for raising this point in the issues. The scale parameter can be customized when running Evaluation by adding --scale <value>.

⚠️⚠️Note 2: One more trick to improve performance at inference: (1) predict at scale=1 (i.e., the original size of the input image), (2) predict on the downsampled image (scale=0.5), (3) ensemble the two predictions. The code for this is included; activate it by adding --scale <value> and --ensemble to the Evaluation command.
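The two-scale ensembling above can be sketched as follows. This is a minimal illustration of the idea, not the repository's code: `model` is assumed to return per-class logits for a (C, H, W) image, and the resize helper is a crude nearest-neighbor stand-in for the bilinear interpolation a real implementation would use.

```python
import numpy as np

def resize_nn(x, h, w):
    """Nearest-neighbor resize of a (C, H, W) array (stand-in for bilinear
    interpolation, e.g. torch.nn.functional.interpolate)."""
    C, H, W = x.shape
    rows = np.arange(h) * H // h
    cols = np.arange(w) * W // w
    return x[:, rows][:, :, cols]

def ensemble_predict(model, image):
    """Average full-scale and half-scale logits, then take the argmax."""
    C, H, W = image.shape
    logits_full = model(image)                      # (num_classes, H, W)
    small = resize_nn(image, H // 2, W // 2)        # scale = 0.5
    logits_half = resize_nn(model(small), H, W)     # upsample logits back
    avg = (logits_full + logits_half) / 2.0
    return avg.argmax(axis=0)                       # (H, W) class-index map
```

Averaging logits before the argmax lets the two scales compensate for each other: the full-scale pass keeps fine detail while the half-scale pass captures larger context.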

Results with RN50 backbone and DLv3+ decoder trained on GTA5:

| Backbone | Decoder | Scale | Cityscapes | Mapillary | ACDC night | ACDC snow | ACDC rain | ACDC fog |
|----------|---------|-------|------------|-----------|------------|-----------|-----------|----------|
| RN50 | DLv3+ | 1 | 48.51 | 52.39 | 15.02 | 37.38 | 39.56 | 40.99 |
| RN50 | DLv3+ | 0.5 | 48.02 | 54.00 | 21.58 | 38.27 | 39.53 | 44.94 |
| RN50 | DLv3+ | ensemble (1 & 0.5) | 50.80 | 56.04 | 20.05 | 40.40 | 42.10 | 44.93 |

Results with RN101 backbone and DLv3+ decoder trained on GTA5:

| Backbone | Decoder | Scale | Cityscapes | Mapillary | ACDC night | ACDC snow | ACDC rain | ACDC fog |
|----------|---------|-------|------------|-----------|------------|-----------|-----------|----------|
| RN101 | DLv3+ | 1 | 49.13 | 53.41 | 21.28 | 41.49 | 42.19 | 44.30 |
| RN101 | DLv3+ | 0.5 | 50.06 | 55.31 | 23.97 | 40.34 | 42.41 | 44.98 |
| RN101 | DLv3+ | ensemble (1 & 0.5) | 51.46 | 56.95 | 24.53 | 43.33 | 44.77 | 47.39 |

Table of Contents

- Installation
- Datasets
- Trained models
- Running FAMix
- License
- Acknowledgement

Installation

Dependencies

First create a new conda environment with the required packages:

conda env create --file environment.yml

Then activate environment using:

conda activate famix_env

Datasets

  • ACDC: Download ACDC images and labels from ACDC. Please follow the dataset directory structure:

    <ACDC_DIR>/                   % ACDC dataset root
    ├── rgb_anon/                 % input image (rgb_anon_trainvaltest.zip)
    └── gt/                       % semantic segmentation labels (gt_trainval.zip)
  • BDD100K: Download BDD100K images and labels from BDD100K. Please follow the dataset directory structure:

    <BDD100K_DIR>/              % BDD100K dataset root
    ├── images/                 % input image
    └── labels/                 % semantic segmentation labels
  • Cityscapes: Follow the instructions in Cityscapes to download the images and semantic segmentation labels. Please follow the dataset directory structure:

    <CITYSCAPES_DIR>/             % Cityscapes dataset root
    ├── leftImg8bit/              % input image (leftImg8bit_trainvaltest.zip)
    └── gtFine/                   % semantic segmentation labels (gtFine_trainvaltest.zip)
  • GTA5: Download GTA5 images and labels from GTA5. Please follow the dataset directory structure:

    <GTA5_DIR>/                   % GTA5 dataset root
    ├── images/                   % input image 
    └── labels/                   % semantic segmentation labels
  • Mapillary: Download Mapillary images and labels from Mapillary. Please follow the dataset directory structure:

    <MAPILLARY_DIR>/              % Mapillary dataset root
    ├── training/                 % Training subset
    │   ├── images/               % input image
    │   └── labels/               % semantic segmentation labels
    └── validation/               % Validation subset
        ├── images/               % input image
        └── labels/               % semantic segmentation labels
  • Synthia: Download Synthia images and labels from SYNTHIA-RAND-CITYSCAPES and split it following SPLIT-DATA. Please follow the dataset directory structure:

    <SYNTHIA>/                 % Synthia dataset root
    ├── RGB/                   % input image 
    └── GT/                    % semantic segmentation labels

Trained models

The trained models are available here.

Running FAMix

Style mining

python3 patch_PIN.py \
  --dataset <dataset_name> \
  --data_root <dataset_root> \
  --resize_feat \
  --save_dir <path_for_learnt_parameters_saving>

Training

python3 main.py \
--dataset <dataset_name> \
--data_root <dataset_root> \
--total_itrs  40000 \
--batch_size 8 \
--val_interval 750 \
--transfer \
--data_aug \
--ckpts_path <path_to_save_checkpoints> \
--path_for_stats <path_for_mined_styles>

Evaluation

python3 main.py \
--dataset <dataset_name> \
--data_root <dataset_root> \
--ckpt <path_to_tested_model> \
--test_only \
--ACDC_sub <ACDC_subset_if_tested_on_ACDC>   

Inference & Visualization

To test any model on any image and visualize the output, add the images to the predict_test directory and run:

python3 predict.py \
--ckpt <ckpt_path> \
--save_val_results_to <directory_for_saved_output_images>
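Visualizing a segmentation prediction amounts to a palette lookup that maps each class index to a color. A minimal sketch is below; the 3-class palette is a made-up example for illustration, whereas the repository uses the 19-class Cityscapes palette.

```python
import numpy as np

# Hypothetical 3-class palette (class index -> RGB); the actual code uses
# the 19-class Cityscapes color scheme.
PALETTE = np.array([
    [128,  64, 128],   # road
    [ 70,  70,  70],   # building
    [107, 142,  35],   # vegetation
], dtype=np.uint8)

def colorize(pred):
    """Turn an (H, W) class-index map into an (H, W, 3) RGB image via
    advanced indexing into the palette."""
    return PALETTE[pred]
```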

License

FAMix is released under the Apache 2.0 license.

Acknowledgement

The code is based on this implementation of DeepLabv3+, and uses code from CLIP, PODA and RobustNet.

