Method Details
Details for method 'Axial-DeepLab-XL [Mapillary Vistas]'
Method overview
Field | Value |
---|---|
name | Axial-DeepLab-XL [Mapillary Vistas] |
challenge | panoptic semantic labeling |
details | Convolution exploits locality for efficiency at a cost of missing long range context. Self-attention has been adopted to augment CNNs with non-local interactions. Recent works prove it possible to stack self-attention layers to obtain a fully attentional network by restricting the attention to a local region. In this paper, we attempt to remove this constraint by factorizing 2D self-attention into two 1D self-attentions. This reduces computation complexity and allows performing attention within a larger or even global region. In companion, we also propose a position-sensitive self-attention design. Combining both yields our position-sensitive axial-attention layer, a novel building block that one could stack to form axial-attention models for image classification and dense prediction. We demonstrate the effectiveness of our model on four large-scale datasets. In particular, our model outperforms all existing stand-alone self-attention models on ImageNet. Our Axial-DeepLab improves 2.8% PQ over bottom-up state-of-the-art on COCO test-dev. This previous state-of-the-art is attained by our small variant that is 3.8x parameter-efficient and 27x computation-efficient. Axial-DeepLab also achieves state-of-the-art results on Mapillary Vistas and Cityscapes. |
publication | Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation. Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen. ECCV 2020 (spotlight). https://arxiv.org/abs/2003.07853 |
project page / code | https://github.com/csrhddlam/axial-deeplab |
used Cityscapes data | fine annotations |
used external data | ImageNet, Mapillary Vistas |
runtime | n/a |
subsampling | no |
submission date | April, 2020 |
previous submissions |
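The factorization described in the details field (one 2D self-attention replaced by two 1D self-attentions, one along the height axis and one along the width axis) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it is single-head, uses plain scaled dot-product attention as an assumption, and omits the position-sensitive terms, normalization, and residual connections. The function names `attention_1d` and `axial_attention` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_1d(x, wq, wk, wv):
    """Self-attention along the second-to-last axis of x.

    x has shape (..., L, C); every leading index is an independent sequence
    of length L. wq, wk, wv are (C, C) projection matrices.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    # Assumption: standard scaled dot-product scores.
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def axial_attention(x, params_h, params_w):
    """Height-axis attention followed by width-axis attention.

    x has shape (H, W, C). params_h and params_w are (wq, wk, wv) tuples.
    """
    # Height axis: each of the W columns is a sequence of length H.
    cols = np.swapaxes(x, 0, 1)                 # (W, H, C)
    y = np.swapaxes(attention_1d(cols, *params_h), 0, 1)  # back to (H, W, C)
    # Width axis: each of the H rows is a sequence of length W.
    return attention_1d(y, *params_w)


# Usage with random weights on a small feature map.
rng = np.random.default_rng(0)
H, W, C = 4, 5, 3
feature_map = rng.normal(size=(H, W, C))
weights_h = tuple(rng.normal(size=(C, C)) for _ in range(3))
weights_w = tuple(rng.normal(size=(C, C)) for _ in range(3))
print(axial_attention(feature_map, weights_h, weights_w).shape)
```

Each position attends to H positions in the first pass and W positions in the second, so the score matrices hold on the order of HW(H + W) entries rather than the (HW)^2 entries of full 2D attention, while information can still propagate across the whole map after both passes.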
Average results
Metric | All | Things | Stuff |
---|---|---|---|
PQ | 66.5747 | 58.7294 | 72.2803 |
SQ | 83.4589 | 81.3324 | 85.0055 |
RQ | 78.9689 | 72.0244 | 84.0195 |
Class results
Class | PQ | SQ | RQ |
---|---|---|---|
road | 98.7233 | 98.7558 | 99.967 |
sidewalk | 79.7859 | 86.2808 | 92.4724 |
building | 90.3012 | 91.8719 | 98.2903 |
wall | 47.5893 | 78.4082 | 60.6943 |
fence | 50.0232 | 78.7827 | 63.4951 |
pole | 67.5514 | 73.8766 | 91.4381 |
traffic light | 58.8692 | 79.2751 | 74.2594 |
traffic sign | 73.3658 | 82.9527 | 88.443 |
vegetation | 91.4788 | 92.2023 | 99.2153 |
terrain | 46.3842 | 79.1589 | 58.5963 |
sky | 91.0115 | 93.4955 | 97.3432 |
person | 57.1917 | 78.4114 | 72.938 |
rider | 54.9082 | 74.9161 | 73.2929 |
car | 69.8002 | 85.4688 | 81.6674 |
truck | 57.2699 | 87.6094 | 65.3697 |
bus | 68.9942 | 88.4423 | 78.0105 |
train | 62.6624 | 85.6386 | 73.1707 |
motorcycle | 52.4558 | 76.8973 | 68.2154 |
bicycle | 46.5525 | 73.2755 | 63.5308 |
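The two results tables can be cross-checked against each other. The sketch below assumes the usual panoptic quality definition, in which per-class PQ equals SQ times RQ, and assumes that the averaged rows are unweighted means over classes, with the standard Cityscapes split into 8 "things" classes and 11 "stuff" classes. Neither assumption is stated on this page; the names `CLASS_RESULTS`, `THINGS`, `mean_pq`, and `max_product_gap` are introduced here for illustration.

```python
# Per-class (PQ, SQ, RQ) values copied from the class results table, in percent.
CLASS_RESULTS = {
    "road": (98.7233, 98.7558, 99.967),
    "sidewalk": (79.7859, 86.2808, 92.4724),
    "building": (90.3012, 91.8719, 98.2903),
    "wall": (47.5893, 78.4082, 60.6943),
    "fence": (50.0232, 78.7827, 63.4951),
    "pole": (67.5514, 73.8766, 91.4381),
    "traffic light": (58.8692, 79.2751, 74.2594),
    "traffic sign": (73.3658, 82.9527, 88.443),
    "vegetation": (91.4788, 92.2023, 99.2153),
    "terrain": (46.3842, 79.1589, 58.5963),
    "sky": (91.0115, 93.4955, 97.3432),
    "person": (57.1917, 78.4114, 72.938),
    "rider": (54.9082, 74.9161, 73.2929),
    "car": (69.8002, 85.4688, 81.6674),
    "truck": (57.2699, 87.6094, 65.3697),
    "bus": (68.9942, 88.4423, 78.0105),
    "train": (62.6624, 85.6386, 73.1707),
    "motorcycle": (52.4558, 76.8973, 68.2154),
    "bicycle": (46.5525, 73.2755, 63.5308),
}

# Assumption: the standard Cityscapes "things" classes; the rest are "stuff".
THINGS = ("person", "rider", "car", "truck", "bus", "train",
          "motorcycle", "bicycle")


def mean_pq(names):
    """Unweighted mean of per-class PQ over the given class names."""
    names = list(names)
    return sum(CLASS_RESULTS[n][0] for n in names) / len(names)


def max_product_gap():
    """Largest |PQ - SQ * RQ / 100| over all classes."""
    return max(abs(pq - sq * rq / 100.0)
               for pq, sq, rq in CLASS_RESULTS.values())


pq_all = mean_pq(CLASS_RESULTS)
pq_things = mean_pq(THINGS)
pq_stuff = mean_pq(n for n in CLASS_RESULTS if n not in THINGS)

print(f"PQ all={pq_all:.4f} things={pq_things:.4f} stuff={pq_stuff:.4f}")
print(f"max |PQ - SQ*RQ| = {max_product_gap():.4f}")
```

Under these assumptions the per-class means reproduce the PQ row of the average results table. The product relation holds per class only: multiplying the averaged SQ and RQ in the "All" column gives roughly 65.9, not the reported 66.5747, because the averaging is applied after the per-class product.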