Method Details
Details for method 'Vision Transformer Adapter for Dense Predictions'
Method overview
Field | Value |
---|---|
name | Vision Transformer Adapter for Dense Predictions |
challenge | pixel-level semantic labeling |
details | ViT-Adapter-L, BEiT pre-train, multi-scale testing |
publication | Vision Transformer Adapter for Dense Predictions. Zhe Chen, Yuchen Duan, Wenhai Wang, Junjun He, Tong Lu, Jifeng Dai, Yu Qiao. https://arxiv.org/abs/2205.08534 |
project page / code | https://github.com/czczup/ViT-Adapter |
used Cityscapes data | fine annotations |
used external data | ImageNet, Mapillary |
runtime | n/a |
subsampling | no |
submission date | May, 2022 |
previous submissions | |
Average results
Metric | Value |
---|---|
IoU Classes | 85.2055 |
iIoU Classes | 68.2575 |
IoU Categories | 92.8165 |
iIoU Categories | 83.4203 |
Class results
Class | IoU | iIoU |
---|---|---|
road | 98.8827 | - |
sidewalk | 88.4947 | - |
building | 94.4966 | - |
wall | 66.745 | - |
fence | 70.2345 | - |
pole | 74.5308 | - |
traffic light | 80.1767 | - |
traffic sign | 83.5873 | - |
vegetation | 94.3988 | - |
terrain | 73.7165 | - |
sky | 96.1833 | - |
person | 89.674 | 75.8518 |
rider | 79.0404 | 58.9152 |
car | 96.6793 | 91.497 |
truck | 85.5161 | 54.622 |
bus | 94.4201 | 68.165 |
train | 90.4794 | 66.7375 |
motorcycle | 79.8667 | 60.3947 |
bicycle | 81.7815 | 69.8768 |
Category results
Category | IoU | iIoU |
---|---|---|
flat | 98.7941 | - |
nature | 94.0184 | - |
object | 79.6169 | - |
sky | 96.1833 | - |
construction | 94.8586 | - |
human | 89.8844 | 76.6379 |
vehicle | 96.3597 | 90.2027 |
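The figures under "Average results" appear to be unweighted means of the per-class and per-category columns, taken only over rows that report a value. The Python sketch below checks that assumption against the numbers in the tables on this page. The dictionary layout and the helper name `column_mean` are illustrative choices, not part of the benchmark's own tooling.

```python
from statistics import mean

# (IoU, iIoU) per class, copied from the "Class results" table.
# None stands for a "-" entry, i.e. no iIoU reported for that class.
class_scores = {
    "road": (98.8827, None),
    "sidewalk": (88.4947, None),
    "building": (94.4966, None),
    "wall": (66.745, None),
    "fence": (70.2345, None),
    "pole": (74.5308, None),
    "traffic light": (80.1767, None),
    "traffic sign": (83.5873, None),
    "vegetation": (94.3988, None),
    "terrain": (73.7165, None),
    "sky": (96.1833, None),
    "person": (89.674, 75.8518),
    "rider": (79.0404, 58.9152),
    "car": (96.6793, 91.497),
    "truck": (85.5161, 54.622),
    "bus": (94.4201, 68.165),
    "train": (90.4794, 66.7375),
    "motorcycle": (79.8667, 60.3947),
    "bicycle": (81.7815, 69.8768),
}

# (IoU, iIoU) per category, copied from the "Category results" table.
category_scores = {
    "flat": (98.7941, None),
    "nature": (94.0184, None),
    "object": (79.6169, None),
    "sky": (96.1833, None),
    "construction": (94.8586, None),
    "human": (89.8844, 76.6379),
    "vehicle": (96.3597, 90.2027),
}


def column_mean(scores, index):
    """Unweighted mean of one column, skipping rows with no reported value."""
    values = [row[index] for row in scores.values() if row[index] is not None]
    return mean(values)


summary = {
    "IoU Classes": column_mean(class_scores, 0),
    "iIoU Classes": column_mean(class_scores, 1),
    "IoU Categories": column_mean(category_scores, 0),
    "iIoU Categories": column_mean(category_scores, 1),
}

for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Rounded to four decimals, the four means agree with the "Average results" table (85.2055, 68.2575, 92.8165, 83.4203), which is consistent with the iIoU averages covering only the eight classes and two categories that report that metric.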