
Improvements to Image Reconstruction-Based Performance Prediction for Semantic Segmentation in Highly Automated Driving

Andreas Bär, Daniel Kusuma and Tim Fingscheidt

Link to paper

Link to supplementary material

Link to poster

Idea Behind the Paper

(Figure: System overview)

In this work, we build upon our previous performance prediction framework (PerfPredRec) and propose several methods to reduce the prediction error. In particular, we investigate three approaches to improve the predictive power. Our investigations reveal that the best Pearson correlation between segmentation quality and reconstruction quality does not always lead to the best predictive power. Further, our best combination surpasses the state of the art in image-only performance prediction on the Cityscapes and KITTI datasets.

Citation

If you find our code helpful or interesting for your research, please consider citing:

@InProceedings{Baer2022,
  author    = {Andreas Bär and Marvin Klingner and Jonas Löhdefink and Fabian Hüger and Peter Schlicht and Tim Fingscheidt},
  booktitle = {Proc. of CVPR - Workshops},
  title     = {{Performance Prediction for Semantic Segmentation and by a Self-Supervised Image Reconstruction Decoder}},
  year      = {2022},
  address   = {New Orleans, LA, USA},
  month     = jun,
  pages     = {4399--4408},
}
@InProceedings{Baer2023,
  author    = {Andreas Bär and Daniel Kusuma and Tim Fingscheidt},
  booktitle = {Proc. of CVPR - Workshops},
  title     = {{Improvements to Image Reconstruction-Based Performance Prediction for Semantic Segmentation in Highly Automated Driving}},
  year      = {2023},
  address   = {Vancouver, BC, Canada},
  month     = jun,
  pages     = {219--229},
}

Our Models

Results are reported for the Cityscapes Lindau validation set. Please refer to the paper for the results reported on the KITTI validation set.

SwiftNet-, Monodepth2-, and DeepLabV3+-based semantic segmentation baselines:

Model mIoU (%) Download link
RN18-SwiftNet 65.02 model
RN50-SwiftNet 65.18 model
SW-SwiftNet 68.98 model
CN-SwiftNet 74.76 model
- - -
RN18-Monodepth2 60.52 model
- - -
RN50-DeepLabV3+ 64.18 model
SW-DeepLabV3+ 64.65 model
CN-DeepLabV3+ 69.02 model

The following tables include the Pearson correlation coefficient (ρ) as well as the absolute (ΔM) and root mean squared (ΔR) prediction errors, as described in the paper.
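As a point of reference, the following minimal NumPy/SciPy sketch shows how these three metrics can be computed; the array names and values are hypothetical, and metrics/compute_metrics.py remains the authoritative implementation:

import numpy as np
from scipy.stats import pearsonr

# Hypothetical paired statistics, one value per evaluated condition:
psnr = np.array([28.1, 25.4, 22.0, 18.7])        # reconstruction quality (PSNR in dB)
miou_true = np.array([65.0, 55.2, 41.8, 30.1])   # measured segmentation quality (mIoU in %)
miou_pred = np.array([63.1, 57.0, 40.2, 28.5])   # mIoU predicted from PSNR

rho, _ = pearsonr(psnr, miou_true)                        # Pearson correlation coefficient ρ
delta_m = np.mean(np.abs(miou_pred - miou_true))          # absolute prediction error ΔM
delta_r = np.sqrt(np.mean((miou_pred - miou_true) ** 2))  # root mean squared prediction error ΔR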

The model weights with the lowest prediction errors as reported in the paper:

Decoder Encoder Init. Δd IDLC CS val (ρ ΔM ΔR) KIT val (ρ ΔM ΔR) Download link
SwiftNet ResNet-18 random 2 2,3,4 0.86 7.45 10.40 0.78 7.12 9.03 model
Monodepth2 ResNet-18 segmentation 4 2,4,6,8 0.83 7.63 10.37 0.72 7.09 8.73 model
- - - - - - - - - - - -
SwiftNet ResNet-50 random 3* 4* 0.89 6.93 9.97 0.75 6.25 8.09 model
SwiftNet Swin-T segmentation - 2,3,4 0.70 11.96 14.64 0.59 9.59 11.84 model
SwiftNet ConvNext-T random 4* 2,3,4 0.68 12.56 16.18 0.66 9.15 11.85 model
- - - - - - - - - - - -
DeepLabV3+ ResNet-50 random 1 - 0.81 9.54 12.44 0.74 7.84 9.60 model
DeepLabV3+ Swin-T random 2 4 0.81 10.53 13.29 0.76 9.41 11.50 model
DeepLabV3+ ConvNext-T random - 4 0.62 13.09 16.24 0.50 12.12 14.71 model

Note

The decoder layer identifiers marked with * contain a typo: the correct identifier is the marked value plus 1. For example, for the SwiftNet ResNet-50 configuration, Δd = 4 holds instead of 3. This applies to each *-marked identifier.

Prerequisites and Requirements

To reproduce the results of the paper more directly, we recommend using two separate environments: one for the non-Transformer-based models (ResNet, ConvNext-T) and one for the Transformer-based models (Swin-T).

To install the first environment via environment.yml, follow these steps:

conda env create --file environment.yml
source activate swiftnet-pp-v2
pip install "git+https://github.com/ifnspaml/IFN_Dataloader.git"
pip install "git+https://github.com/ifnspaml/TUBSRobustCheck.git"

Analogously, to install the environment for training the Transformer-based models, follow:

conda env create --file environment_tf.yml
source activate swiftnet-pp-v2-tf
pip install "git+https://github.com/ifnspaml/IFN_Dataloader.git"
pip install "git+https://github.com/ifnspaml/TUBSRobustCheck.git"

For reference: The environment.yml was created by exporting the environment via conda env export > environment.yml on our Linux cluster.

Training

For training according to our method, please first use train_seg.py to train the segmentation models. Then use train_rec.py, which loads a trained SwiftNet for semantic segmentation (with frozen weights) and trains an additional reconstruction decoder. Please refer to train_seg.sh and train_rec.sh for example usages; a conceptual sketch of the second stage follows below.
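The following PyTorch-style sketch only illustrates the frozen-segmentation/trainable-reconstruction idea; the module and variable names are stand-ins and do not correspond to the repository's actual classes:

import torch
import torch.nn as nn

# Stand-ins for the actual networks (hypothetical, for illustration only):
seg_net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())  # trained segmentation network
rec_decoder = nn.Conv2d(16, 3, 3, padding=1)                        # additional reconstruction decoder

# Freeze the trained segmentation weights; only the reconstruction decoder learns.
for p in seg_net.parameters():
    p.requires_grad = False
seg_net.eval()

optimizer = torch.optim.Adam(rec_decoder.parameters(), lr=1e-4)
criterion = nn.MSELoss()

images = torch.rand(2, 3, 64, 64)       # dummy input batch
features = seg_net(images)              # frozen forward pass
reconstruction = rec_decoder(features)  # reconstruct the input image

optimizer.zero_grad()
loss = criterion(reconstruction, images)  # self-supervised reconstruction loss
loss.backward()
optimizer.step()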

Evaluation On Clean And Corrupted Images

For evaluation according to our method, please use eval/eval_attacks_n_noise.py; refer to eval/attacks.sh for example usages. After running eval/eval_attacks_n_noise.py, you can compute metrics from the generated output file with metrics/compute_metrics.py. Example output files can be found in the folder output. A sketch of the underlying PSNR measure follows below.
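The reconstruction quality reported in the output files is measured as PSNR. The following minimal sketch follows the standard definition; the evaluation script's exact implementation may differ (e.g., in the assumed peak value):

import numpy as np

def psnr(reconstruction, reference, max_val=255.0):
    # Peak signal-to-noise ratio in dB; higher values mean better reconstructions.
    mse = np.mean((reconstruction.astype(np.float64) - reference.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)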

Regression

To perform a regression analysis (predicting mIoU from PSNR), first run eval/eval_attacks_n_noise.py to produce output files containing mIoU and PSNR statistics (e.g., of the Cityscapes validation subset). Next, run metrics/regression.py to perform the regression analysis. Please choose a calibration output file (for calibrating the regression) and a regression output file (for performing the mIoU prediction). Example output files can be found in the folder output; a sketch of the calibrate-then-predict idea follows below.
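The following minimal sketch illustrates the calibrate-then-predict workflow with a simple linear fit; the values and the linear model are assumptions for illustration, while metrics/regression.py implements the regression actually used in the paper:

import numpy as np

# Hypothetical statistics parsed from a calibration output file:
psnr_cal = np.array([28.1, 25.4, 22.0, 18.7])
miou_cal = np.array([65.0, 55.2, 41.8, 30.1])

# Calibrate: fit mIoU ≈ a * PSNR + b on the calibration file.
a, b = np.polyfit(psnr_cal, miou_cal, deg=1)

# Predict: apply the fitted model to the PSNR values of the regression file.
psnr_reg = np.array([27.0, 20.5])
miou_pred = a * psnr_reg + b  # compare against the measured mIoU to obtain ΔM and ΔR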

License

The original SwiftNet model in this project was developed by Marin Oršić et al. (see here). The project was released under the GNU General Public License v3.0. Our code modifies some parts of the original code and is therefore also licensed under the GNU General Public License v3.0. Please feel free to use it within the boundaries of this license.

Further, we refer to mmseg's implementation of the DeepLabV3+ model and to our previous GitHub repository PerfPredRec.
