Method Details


Details for method 'Axial-DeepLab-XL [Cityscapes-fine]'

 

Method overview

name Axial-DeepLab-XL [Cityscapes-fine]
challenge panoptic semantic labeling
details Convolution exploits locality for efficiency at a cost of missing long range context. Self-attention has been adopted to augment CNNs with non-local interactions. Recent works prove it possible to stack self-attention layers to obtain a fully attentional network by restricting the attention to a local region. In this paper, we attempt to remove this constraint by factorizing 2D self-attention into two 1D self-attentions. This reduces computation complexity and allows performing attention within a larger or even global region. In companion, we also propose a position-sensitive self-attention design. Combining both yields our position-sensitive axial-attention layer, a novel building block that one could stack to form axial-attention models for image classification and dense prediction. We demonstrate the effectiveness of our model on four large-scale datasets. In particular, our model outperforms all existing stand-alone self-attention models on ImageNet. Our Axial-DeepLab improves 2.8% PQ over bottom-up state-of-the-art on COCO test-dev. This previous state-of-the-art is attained by our small variant that is 3.8x parameter-efficient and 27x computation-efficient. Axial-DeepLab also achieves state-of-the-art results on Mapillary Vistas and Cityscapes.
publication Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
ECCV 2020 (spotlight)
https://arxiv.org/abs/2003.07853
project page / code https://github.com/csrhddlam/axial-deeplab
used Cityscapes data fine annotations
used external data ImageNet
runtime n/a
subsampling no
submission date March, 2020
previous submissions
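
The factorization described in the details field above (one 2D self-attention replaced by two 1D self-attentions, first along the height axis and then along the width axis) can be sketched in NumPy. This is a minimal illustration and not the authors' implementation: it is single-head, it omits the position-sensitive terms and normalization layers from the paper, and it uses the common 1/sqrt(d) score scaling as an assumption. The names `attention_1d`, `axial_attention`, and `score_counts` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def attention_1d(x, wq, wk, wv):
    """Plain self-attention over the second-to-last axis of x, shape (..., L, C)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    # Assumed scaling by sqrt(d); scores have shape (..., L, L).
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def axial_attention(x, params_h, params_w):
    """Apply 1D attention along height, then along width, to x of shape (H, W, C)."""
    # Height axis: every column becomes a sequence of length H.
    cols = np.swapaxes(x, 0, 1)                 # (W, H, C)
    cols = attention_1d(cols, *params_h)
    x = np.swapaxes(cols, 0, 1)                 # back to (H, W, C)
    # Width axis: every row is a sequence of length W.
    return attention_1d(x, *params_w)


def score_counts(h, w):
    """Number of attention scores: full 2D attention vs. the two axial passes."""
    full_2d = (h * w) ** 2
    axial = h * w * (h + w)
    return full_2d, axial


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 6, 8))
    params_h = tuple(rng.normal(size=(8, 8)) for _ in range(3))
    params_w = tuple(rng.normal(size=(8, 8)) for _ in range(3))
    print(axial_attention(x, params_h, params_w).shape)
    print(score_counts(4, 6))
```

For a 4 x 6 feature map, `score_counts` gives 576 scores for full 2D attention against 240 for the two axial passes. The gap widens as the map grows, which is why the factorized form makes attention over larger or global regions affordable.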

 

Average results

Metric All Things Stuff
PQ 62.7907 53.7894 69.337
SQ 82.4408 81.036 83.4624
RQ 75.2496 66.3396 81.7296

 

Class results

Class PQ SQ RQ
road 98.5404 98.6054 99.9341
sidewalk 78.4945 85.2444 92.0817
building 88.7657 90.9255 97.6246
wall 38.0846 76.0827 50.0569
fence 41.3978 75.0619 55.1515
pole 62.4002 71.3835 87.4154
traffic light 56.4552 76.9151 73.3993
traffic sign 71.761 80.3194 89.3446
vegetation 90.7623 91.761 98.9116
terrain 45.7121 78.6351 58.1319
sky 90.3335 93.1522 96.9742
person 54.4805 77.5022 70.2954
rider 50.8289 74.3357 68.3775
car 67.7672 85.2207 79.5197
truck 50.2123 88.3296 56.8465
bus 59.3203 88.9804 66.6667
train 55.5731 84.4081 65.8385
motorcycle 48.3147 76.5276 63.1336
bicycle 43.8186 72.9838 60.0388
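
The two tables are linked by simple arithmetic, which the following sketch checks. It assumes the standard panoptic quality definition (per class, PQ = SQ x RQ) and the standard Cityscapes split in which person, rider, car, truck, bus, train, motorcycle, and bicycle are "things" and the remaining eleven classes are "stuff". The rows are copied from the class results table; `group_means` and `max_pq_gap` are hypothetical helper names.

```python
# (class, PQ, SQ, RQ) rows copied from the class results table, in percent.
ROWS = [
    ("road", 98.5404, 98.6054, 99.9341),
    ("sidewalk", 78.4945, 85.2444, 92.0817),
    ("building", 88.7657, 90.9255, 97.6246),
    ("wall", 38.0846, 76.0827, 50.0569),
    ("fence", 41.3978, 75.0619, 55.1515),
    ("pole", 62.4002, 71.3835, 87.4154),
    ("traffic light", 56.4552, 76.9151, 73.3993),
    ("traffic sign", 71.761, 80.3194, 89.3446),
    ("vegetation", 90.7623, 91.761, 98.9116),
    ("terrain", 45.7121, 78.6351, 58.1319),
    ("sky", 90.3335, 93.1522, 96.9742),
    ("person", 54.4805, 77.5022, 70.2954),
    ("rider", 50.8289, 74.3357, 68.3775),
    ("car", 67.7672, 85.2207, 79.5197),
    ("truck", 50.2123, 88.3296, 56.8465),
    ("bus", 59.3203, 88.9804, 66.6667),
    ("train", 55.5731, 84.4081, 65.8385),
    ("motorcycle", 48.3147, 76.5276, 63.1336),
    ("bicycle", 43.8186, 72.9838, 60.0388),
]

# Assumed "things" classes (standard Cityscapes convention).
THINGS = {"person", "rider", "car", "truck", "bus",
          "train", "motorcycle", "bicycle"}


def mean(values):
    return sum(values) / len(values)


def group_means(rows, col):
    """Unweighted class means (all, things, stuff) of column col (1=PQ, 2=SQ, 3=RQ)."""
    everything = [r[col] for r in rows]
    things = [r[col] for r in rows if r[0] in THINGS]
    stuff = [r[col] for r in rows if r[0] not in THINGS]
    return mean(everything), mean(things), mean(stuff)


def max_pq_gap(rows):
    """Largest per-class difference between PQ and SQ * RQ (all in percent)."""
    return max(abs(pq - sq * rq / 100.0) for _, pq, sq, rq in rows)


if __name__ == "__main__":
    print("PQ all/things/stuff:", group_means(ROWS, 1))
    print("max |PQ - SQ*RQ|:", max_pq_gap(ROWS))
```

The class means reproduce the average results table up to rounding. Because those averages are taken over classes, the "All" PQ is not the product of the "All" SQ and RQ values; the product identity holds per class only.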

 
