Method Details


Details for method 'InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions'

 

Method overview

name InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
challenge pixel-level semantic labeling
details We use Mask2Former as the segmentation framework and initialize our InternImage-H model from weights pre-trained on the 427M joint dataset built from the public Laion-400M, YFCC-15M, and CC12M datasets. Following common practice, we first pre-train on Mapillary Vistas for 80k iterations and then fine-tune on Cityscapes for 80k iterations. The crop size is set to 1024×1024 in this experiment. With this setup, InternImage-H achieves 87.0 multi-scale mIoU on the validation set and 86.1 multi-scale mIoU on the test set.
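The two-stage schedule in the details field can be written down as a small configuration sketch. This is only an illustration: the dictionary layout, key names, and the helper `total_iterations` are hypothetical and are not taken from the InternImage or Mask2Former code; only the values come from the description above.

```python
# Illustrative summary of the two-stage schedule described in the details field.
# Key names are hypothetical; only the values come from the method description.
EXPERIMENT = {
    "framework": "Mask2Former",
    "backbone": "InternImage-H",
    "crop_size": (1024, 1024),
    "stages": [
        {"name": "pre-train", "dataset": "Mapillary Vistas", "iterations": 80_000},
        {"name": "fine-tune", "dataset": "Cityscapes", "iterations": 80_000},
    ],
}


def total_iterations(experiment):
    """Sum the iteration counts over all training stages."""
    return sum(stage["iterations"] for stage in experiment["stages"])


if __name__ == "__main__":
    for stage in EXPERIMENT["stages"]:
        print(f'{stage["name"]}: {stage["dataset"]} for {stage["iterations"]} iterations')
    print("total iterations:", total_iterations(EXPERIMENT))
```

Keeping the stages in an ordered list makes the pre-train then fine-tune order explicit.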
publication InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Wenhai Wang, Jifeng Dai, Zhe Chen, Zhenhang Huang, Zhiqi Li, Xizhou Zhu, Xiaowei Hu, Tong Lu, Lewei Lu, Hongsheng Li, Xiaogang Wang, Yu Qiao
CVPR 2023
https://arxiv.org/abs/2211.05778
project page / code https://github.com/OpenGVLab/InternImage
used Cityscapes data fine annotations, coarse annotations
used external data Laion-400M, YFCC-15M, CC12M, ImageNet, Mapillary
runtime n/a
subsampling no
submission date November, 2022
previous submissions

 

Average results

Metric           Value
IoU Classes      86.0981
iIoU Classes     73.6097
IoU Categories   93.0483
iIoU Categories  84.984
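For readers unfamiliar with the metric, the sketch below computes per-class IoU from a confusion matrix, assuming the standard definition IoU = TP / (TP + FP + FN). The function names and the toy matrix are illustrative only; this is not the benchmark's evaluation script, and it does not cover the instance-weighted iIoU variant.

```python
def per_class_iou(confusion):
    """Per-class IoU from a square confusion matrix.

    confusion[i][j] is the number of pixels whose ground-truth label is i
    and whose predicted label is j.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # truth c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, truth other
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


def mean_iou(confusion):
    """Average the per-class IoUs, skipping classes that never occur."""
    valid = [v for v in per_class_iou(confusion) if v == v]  # NaN != NaN
    return sum(valid) / len(valid)


if __name__ == "__main__":
    # Toy two-class example, not benchmark data.
    toy = [[8, 2],
           [1, 9]]
    print(per_class_iou(toy), mean_iou(toy))
```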

 

Class results

Class          IoU      iIoU
road           98.8583  -
sidewalk       88.8123  -
building       94.867   -
wall           72.478   -
fence          71.1928  -
pole           75.417   -
traffic light  80.8672  -
traffic sign   84.7038  -
vegetation     94.517   -
terrain        75.513   -
sky            96.285   -
person         90.0799  78.7711
rider          79.9006  66.8356
car            96.7627  90.787
truck          85.2668  66.8543
bus            95.525   76.9166
train          92.6055  69.2472
motorcycle     80.0189  66.1576
bicycle        82.1929  73.3083
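The class-level averages reported under "Average results" appear to be plain unweighted means of the per-class scores, with iIoU averaged only over the classes that report it. The sketch below checks this against the values in the table; the variable and function names are illustrative.

```python
# Per-class results copied from the table above: (IoU, iIoU or None).
CLASS_RESULTS = {
    "road": (98.8583, None),
    "sidewalk": (88.8123, None),
    "building": (94.867, None),
    "wall": (72.478, None),
    "fence": (71.1928, None),
    "pole": (75.417, None),
    "traffic light": (80.8672, None),
    "traffic sign": (84.7038, None),
    "vegetation": (94.517, None),
    "terrain": (75.513, None),
    "sky": (96.285, None),
    "person": (90.0799, 78.7711),
    "rider": (79.9006, 66.8356),
    "car": (96.7627, 90.787),
    "truck": (85.2668, 66.8543),
    "bus": (95.525, 76.9166),
    "train": (92.6055, 69.2472),
    "motorcycle": (80.0189, 66.1576),
    "bicycle": (82.1929, 73.3083),
}


def mean(values):
    """Unweighted arithmetic mean of a non-empty iterable."""
    values = list(values)
    return sum(values) / len(values)


class_iou_mean = mean(iou for iou, _ in CLASS_RESULTS.values())
class_iiou_mean = mean(iiou for _, iiou in CLASS_RESULTS.values() if iiou is not None)

if __name__ == "__main__":
    print(f"mean IoU over {len(CLASS_RESULTS)} classes: {class_iou_mean:.4f}")
    print(f"mean iIoU over instance classes: {class_iiou_mean:.4f}")
```

Both means agree with the reported "IoU Classes" (86.0981) and "iIoU Classes" (73.6097) values to within rounding.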

 

Category results

Category      IoU      iIoU
flat          98.8731  -
nature        94.2884  -
object        80.4482  -
sky           96.285   -
construction  94.9866  -
human         90.0229  79.7355
vehicle       96.4336  90.2324
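The category scores can be cross-checked in the same way. The sketch below assumes the usual Cityscapes grouping of the 19 evaluated classes into 7 categories (the `CATEGORY_MEMBERS` mapping is that assumption, not something stated on this page). Note that a category IoU is evaluated on the merged category labels, so it is generally not the mean of its member classes' IoUs; only a single-class category such as sky must match its class score.

```python
# Category results copied from the table above: (IoU, iIoU or None).
CATEGORY_RESULTS = {
    "flat": (98.8731, None),
    "nature": (94.2884, None),
    "object": (80.4482, None),
    "sky": (96.285, None),
    "construction": (94.9866, None),
    "human": (90.0229, 79.7355),
    "vehicle": (96.4336, 90.2324),
}

# Assumed grouping of evaluated classes into categories (standard Cityscapes labels).
CATEGORY_MEMBERS = {
    "flat": ["road", "sidewalk"],
    "nature": ["vegetation", "terrain"],
    "object": ["pole", "traffic light", "traffic sign"],
    "sky": ["sky"],
    "construction": ["building", "wall", "fence"],
    "human": ["person", "rider"],
    "vehicle": ["car", "truck", "bus", "train", "motorcycle", "bicycle"],
}

SKY_CLASS_IOU = 96.285  # from the class results table


def mean(values):
    """Unweighted arithmetic mean of a non-empty iterable."""
    values = list(values)
    return sum(values) / len(values)


category_iou_mean = mean(iou for iou, _ in CATEGORY_RESULTS.values())
category_iiou_mean = mean(i for _, i in CATEGORY_RESULTS.values() if i is not None)

if __name__ == "__main__":
    print(f"mean category IoU: {category_iou_mean:.4f}")
    print(f"mean category iIoU: {category_iiou_mean:.4f}")
    print("sky category equals sky class:", CATEGORY_RESULTS["sky"][0] == SKY_CLASS_IOU)
```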

 
