Method Details
Details for method 'InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions'
Method overview
Field | Value |
---|---|
name | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions |
challenge | pixel-level semantic labeling |
details | We use Mask2Former as the segmentation framework and initialize our InternImage-H model with weights pre-trained on the 427M joint dataset of the public Laion-400M, YFCC-15M, and CC12M datasets. Following common practice, we first pre-train on Mapillary Vistas for 80k iterations and then fine-tune on Cityscapes for 80k iterations. The crop size is set to 1024×1024 in this experiment. As a result, InternImage-H achieves 87.0 multi-scale mIoU on the validation set and 86.1 multi-scale mIoU on the test set. |
publication | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions. Wenhai Wang, Jifeng Dai, Zhe Chen, Zhenhang Huang, Zhiqi Li, Xizhou Zhu, Xiaowei Hu, Tong Lu, Lewei Lu, Hongsheng Li, Xiaogang Wang, Yu Qiao. CVPR 2023. https://arxiv.org/abs/2211.05778 |
project page / code | https://github.com/OpenGVLab/InternImage |
used Cityscapes data | fine annotations, coarse annotations |
used external data | Laion-400M, YFCC-15M, CC12M, ImageNet, Mapillary |
runtime | n/a |
subsampling | no |
submission date | November, 2022 |
previous submissions |
Average results
Metric | Value |
---|---|
IoU Classes | 86.0981 |
iIoU Classes | 73.6097 |
IoU Categories | 93.0483 |
iIoU Categories | 84.984 |
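For readers unfamiliar with the metric, the IoU scores reported here are understood to follow the usual intersection-over-union definition, TP / (TP + FP + FN), accumulated per class and reported as a percentage. The sketch below illustrates that definition on a made-up 3-class confusion matrix. The function name `per_class_iou` and the toy counts are illustrative assumptions, not part of the benchmark's evaluation code, and the instance-weighted iIoU variant is not covered.

```python
def per_class_iou(confusion):
    """Return the IoU of every class from a square confusion matrix.

    confusion[g][p] is the number of pixels whose ground-truth class is g
    and whose predicted class is p (hypothetical layout for this sketch).
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]                              # correctly labeled pixels
        fn = sum(confusion[c]) - tp                       # class c pixels that were missed
        fp = sum(confusion[g][c] for g in range(n)) - tp  # pixels wrongly labeled as c
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


# Made-up pixel counts for three classes, for illustration only.
toy_confusion = [
    [50, 2, 3],
    [4, 30, 1],
    [0, 5, 5],
]

toy_ious = per_class_iou(toy_confusion)
toy_miou = sum(toy_ious) / len(toy_ious)  # mean IoU over the classes
print([round(100 * v, 2) for v in toy_ious], round(100 * toy_miou, 2))
```

Multiplying by 100 gives values on the same scale as the tables on this page.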
Class results
Class | IoU | iIoU |
---|---|---|
road | 98.8583 | - |
sidewalk | 88.8123 | - |
building | 94.867 | - |
wall | 72.478 | - |
fence | 71.1928 | - |
pole | 75.417 | - |
traffic light | 80.8672 | - |
traffic sign | 84.7038 | - |
vegetation | 94.517 | - |
terrain | 75.513 | - |
sky | 96.285 | - |
person | 90.0799 | 78.7711 |
rider | 79.9006 | 66.8356 |
car | 96.7627 | 90.787 |
truck | 85.2668 | 66.8543 |
bus | 95.525 | 76.9166 |
train | 92.6055 | 69.2472 |
motorcycle | 80.0189 | 66.1576 |
bicycle | 82.1929 | 73.3083 |
Category results
Category | IoU | iIoU |
---|---|---|
flat | 98.8731 | - |
nature | 94.2884 | - |
object | 80.4482 | - |
sky | 96.285 | - |
construction | 94.9866 | - |
human | 90.0229 | 79.7355 |
vehicle | 96.4336 | 90.2324 |
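The four numbers under "Average results" appear to be plain unweighted means of the per-class and per-category rows, with iIoU averaged only over the rows that report it. The following sketch recomputes those means from the values listed above as a consistency check. The variable names are illustrative, and the tolerance of 1e-4 is an assumption chosen to absorb the four-decimal rounding of the published figures.

```python
# Per-class and per-category scores copied from the tables above.
class_iou = [
    98.8583, 88.8123, 94.867, 72.478, 71.1928, 75.417, 80.8672,
    84.7038, 94.517, 75.513, 96.285, 90.0799, 79.9006, 96.7627,
    85.2668, 95.525, 92.6055, 80.0189, 82.1929,
]
# iIoU is reported only for the eight human and vehicle classes.
class_iiou = [78.7711, 66.8356, 90.787, 66.8543, 76.9166, 69.2472, 66.1576, 73.3083]
category_iou = [98.8731, 94.2884, 80.4482, 96.285, 94.9866, 90.0229, 96.4336]
category_iiou = [79.7355, 90.2324]

# Averages as published in the "Average results" table.
reported = {
    "IoU Classes": 86.0981,
    "iIoU Classes": 73.6097,
    "IoU Categories": 93.0483,
    "iIoU Categories": 84.984,
}


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


recomputed = {
    "IoU Classes": mean(class_iou),
    "iIoU Classes": mean(class_iiou),
    "IoU Categories": mean(category_iou),
    "iIoU Categories": mean(category_iiou),
}

for name, value in recomputed.items():
    matches = abs(value - reported[name]) < 1e-4
    print(f"{name}: recomputed {value:.4f}, reported {reported[name]}, match={matches}")
```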