Method Details


Details for method 'Axial-DeepLab-XL [Mapillary Vistas]'

 

Method overview

name Axial-DeepLab-XL [Mapillary Vistas]
challenge panoptic semantic labeling
details Convolution exploits locality for efficiency at the cost of missing long-range context. Self-attention has been adopted to augment CNNs with non-local interactions. Recent work has shown that it is possible to stack self-attention layers into a fully attentional network by restricting the attention to a local region. In this paper, we attempt to remove this constraint by factorizing 2D self-attention into two 1D self-attentions. This reduces computational complexity and allows attention to be performed within a larger or even global region. Alongside this, we propose a position-sensitive self-attention design. Combining both yields our position-sensitive axial-attention layer, a novel building block that can be stacked to form axial-attention models for image classification and dense prediction. We demonstrate the effectiveness of our model on four large-scale datasets. In particular, our model outperforms all existing stand-alone self-attention models on ImageNet. Our Axial-DeepLab improves PQ by 2.8% over the bottom-up state-of-the-art on COCO test-dev. Our small variant already attains this previous state-of-the-art while being 3.8x more parameter-efficient and 27x more computation-efficient. Axial-DeepLab also achieves state-of-the-art results on Mapillary Vistas and Cityscapes.
publication Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
ECCV 2020 (spotlight)
https://arxiv.org/abs/2003.07853
project page / code https://github.com/csrhddlam/axial-deeplab
used Cityscapes data fine annotations
used external data ImageNet, Mapillary Vistas
runtime n/a
subsampling no
submission date April, 2020
previous submissions
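
The method description says that factorizing 2D self-attention into two 1D self-attentions reduces computation. A rough way to see why is to count query-key pairs on an H x W feature map. The sketch below is illustrative only and is not taken from the authors' code: the function names and the 64 x 64 map size are hypothetical, and the count ignores heads, channels, and the positional terms of the position-sensitive variant.

```python
# Count query-key pairs for full 2D self-attention versus axial attention
# on an H x W feature map. Illustrative only: ignores heads, channels,
# and positional terms.


def full_attention_pairs(height: int, width: int) -> int:
    """Every position attends to every position: (H*W)^2 pairs."""
    n = height * width
    return n * n


def axial_attention_pairs(height: int, width: int) -> int:
    """One pass along the height axis plus one pass along the width axis."""
    n = height * width
    # Height-axis pass: each position attends to the H positions in its column.
    # Width-axis pass: each position attends to the W positions in its row.
    return n * height + n * width


if __name__ == "__main__":
    h, w = 64, 64  # hypothetical feature-map size
    full = full_attention_pairs(h, w)
    axial = axial_attention_pairs(h, w)
    print(f"full 2D: {full} pairs, axial: {axial} pairs, ratio: {full / axial:.1f}x")
```

Under these assumptions the ratio between the two counts is HW / (H + W), so the saving grows with the size of the feature map.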

 

Average results

Metric All Things Stuff
PQ 66.5747 58.7294 72.2803
SQ 83.4589 81.3324 85.0055
RQ 78.9689 72.0244 84.0195

 

Class results

Class PQ SQ RQ
road 98.7233 98.7558 99.967
sidewalk 79.7859 86.2808 92.4724
building 90.3012 91.8719 98.2903
wall 47.5893 78.4082 60.6943
fence 50.0232 78.7827 63.4951
pole 67.5514 73.8766 91.4381
traffic light 58.8692 79.2751 74.2594
traffic sign 73.3658 82.9527 88.443
vegetation 91.4788 92.2023 99.2153
terrain 46.3842 79.1589 58.5963
sky 91.0115 93.4955 97.3432
person 57.1917 78.4114 72.938
rider 54.9082 74.9161 73.2929
car 69.8002 85.4688 81.6674
truck 57.2699 87.6094 65.3697
bus 68.9942 88.4423 78.0105
train 62.6624 85.6386 73.1707
motorcycle 52.4558 76.8973 68.2154
bicycle 46.5525 73.2755 63.5308
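
The average results can be reproduced from the class results. The sketch below assumes the standard Cityscapes split (the eight classes from person to bicycle are "things", the remaining eleven are "stuff") and the usual definition PQ = SQ x RQ per class; neither is stated on this page, and the helper names are hypothetical. The values are copied from the class table above.

```python
from statistics import mean

# (PQ, SQ, RQ) in percent, copied from the class results table.
STUFF = {
    "road": (98.7233, 98.7558, 99.967),
    "sidewalk": (79.7859, 86.2808, 92.4724),
    "building": (90.3012, 91.8719, 98.2903),
    "wall": (47.5893, 78.4082, 60.6943),
    "fence": (50.0232, 78.7827, 63.4951),
    "pole": (67.5514, 73.8766, 91.4381),
    "traffic light": (58.8692, 79.2751, 74.2594),
    "traffic sign": (73.3658, 82.9527, 88.443),
    "vegetation": (91.4788, 92.2023, 99.2153),
    "terrain": (46.3842, 79.1589, 58.5963),
    "sky": (91.0115, 93.4955, 97.3432),
}
# Assumed "things" classes (standard Cityscapes convention).
THINGS = {
    "person": (57.1917, 78.4114, 72.938),
    "rider": (54.9082, 74.9161, 73.2929),
    "car": (69.8002, 85.4688, 81.6674),
    "truck": (57.2699, 87.6094, 65.3697),
    "bus": (68.9942, 88.4423, 78.0105),
    "train": (62.6624, 85.6386, 73.1707),
    "motorcycle": (52.4558, 76.8973, 68.2154),
    "bicycle": (46.5525, 73.2755, 63.5308),
}
ALL = {**STUFF, **THINGS}


def class_average(rows: dict, column: int) -> float:
    """Unweighted mean over classes of one column (0=PQ, 1=SQ, 2=RQ)."""
    return mean(values[column] for values in rows.values())


def max_pq_deviation(rows: dict) -> float:
    """Largest |PQ - SQ*RQ/100| over the classes, in percentage points."""
    return max(abs(pq - sq * rq / 100.0) for pq, sq, rq in rows.values())


if __name__ == "__main__":
    for label, rows in (("All", ALL), ("Things", THINGS), ("Stuff", STUFF)):
        print(label, f"PQ={class_average(rows, 0):.4f}")
    print(f"max |PQ - SQ*RQ| per class: {max_pq_deviation(ALL):.4f}")
```

Note that the averaged SQ and RQ do not multiply to the averaged PQ (83.4589 x 78.9689 / 100 is about 65.91, not 66.5747). The identity holds per class, and the reported averages are unweighted means over classes.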

 
