Method Details
Details for method 'Axial-DeepLab-L [Mapillary Vistas]'
Method overview
name | Axial-DeepLab-L [Mapillary Vistas] |
challenge | panoptic semantic labeling |
details | Convolution exploits locality for efficiency at the cost of missing long-range context. Self-attention has been adopted to augment CNNs with non-local interactions. Recent work has shown that self-attention layers can be stacked into a fully attentional network by restricting the attention to a local region. In this paper, we attempt to remove this constraint by factorizing 2D self-attention into two 1D self-attentions. This reduces computational complexity and allows attention to be performed within a larger or even global region. Alongside this, we propose a position-sensitive self-attention design. Combining both yields our position-sensitive axial-attention layer, a novel building block that can be stacked to form axial-attention models for image classification and dense prediction. We demonstrate the effectiveness of our model on four large-scale datasets. In particular, our model outperforms all existing stand-alone self-attention models on ImageNet. Our Axial-DeepLab improves PQ by 2.8% over the bottom-up state of the art on COCO test-dev. Our small variant attains this previous state of the art while being 3.8x more parameter-efficient and 27x more computation-efficient. Axial-DeepLab also achieves state-of-the-art results on Mapillary Vistas and Cityscapes. |
publication | Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation. Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen. ECCV 2020 (spotlight). https://arxiv.org/abs/2003.07853 |
project page / code | https://github.com/csrhddlam/axial-deeplab |
used Cityscapes data | fine annotations |
used external data | ImageNet, Mapillary Vistas |
runtime | n/a |
subsampling | no |
submission date | March, 2020 |
previous submissions |
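The details entry above states that factorizing 2D self-attention into two 1D self-attentions reduces computation. A rough way to see this is to count query-key pairs per layer. The sketch below is an illustration only, not the authors' implementation: the function names are hypothetical, and it counts attention pairs alone, ignoring projections, multiple heads, and the position-sensitive terms.

```python
def attention_pairs_2d(height, width):
    """Query-key pairs when every position attends to every position (global 2D)."""
    positions = height * width
    return positions * positions


def attention_pairs_axial(height, width):
    """Query-key pairs for one height-axis layer plus one width-axis layer.

    Along the height axis each position attends to the `height` positions in
    its column; along the width axis, to the `width` positions in its row.
    """
    positions = height * width
    return positions * height + positions * width


if __name__ == "__main__":
    # Hypothetical square feature-map sizes, chosen only for illustration.
    for size in (32, 64, 128):
        full = attention_pairs_2d(size, size)
        axial = attention_pairs_axial(size, size)
        print(f"{size}x{size}: 2D={full:,}  axial={axial:,}  ratio={full / axial:.1f}x")
```

For a square N x N feature map the pair count drops from N^4 to 2N^3, so under this simplified count the saving grows linearly with N.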
Average results
Metric | All | Things | Stuff |
---|---|---|---|
PQ | 65.5536 | 56.8596 | 71.8765 |
SQ | 83.021 | 80.9801 | 84.5053 |
RQ | 78.1363 | 70.0513 | 84.0163 |
Class results
Class | PQ | SQ | RQ |
---|---|---|---|
road | 98.6077 | 98.7379 | 99.8682 |
sidewalk | 79.5288 | 85.7123 | 92.7857 |
building | 89.8975 | 91.6782 | 98.0576 |
wall | 44.058 | 78.1027 | 56.4103 |
fence | 47.5243 | 77.8702 | 61.0301 |
pole | 66.8175 | 72.8537 | 91.7146 |
traffic light | 59.6734 | 78.0383 | 76.4668 |
traffic sign | 72.8981 | 82.0785 | 88.8151 |
vegetation | 91.2771 | 92.0294 | 99.1826 |
terrain | 49.4441 | 78.9204 | 62.6506 |
sky | 90.9151 | 93.5363 | 97.1976 |
person | 55.8872 | 78.0042 | 71.6465 |
rider | 52.8714 | 74.309 | 71.1507 |
car | 68.8168 | 85.3298 | 80.648 |
truck | 55.203 | 87.6557 | 62.9771 |
bus | 64.4532 | 88.8313 | 72.5569 |
train | 62.2083 | 84.8774 | 73.2919 |
motorcycle | 50.5958 | 76.0528 | 66.5272 |
bicycle | 44.8412 | 72.7803 | 61.6118 |
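The averaged PQ values can be cross-checked against the class results. The sketch below assumes the standard Cityscapes split, which this page does not state explicitly: the eight classes from person to bicycle are "things" and the remaining eleven are "stuff". It also uses the usual identity PQ = SQ x RQ (here with all three in percent). The variable and function names are hypothetical.

```python
from statistics import mean

# (PQ, SQ, RQ) per class, copied from the class results table.
CLASS_RESULTS = {
    "road": (98.6077, 98.7379, 99.8682),
    "sidewalk": (79.5288, 85.7123, 92.7857),
    "building": (89.8975, 91.6782, 98.0576),
    "wall": (44.058, 78.1027, 56.4103),
    "fence": (47.5243, 77.8702, 61.0301),
    "pole": (66.8175, 72.8537, 91.7146),
    "traffic light": (59.6734, 78.0383, 76.4668),
    "traffic sign": (72.8981, 82.0785, 88.8151),
    "vegetation": (91.2771, 92.0294, 99.1826),
    "terrain": (49.4441, 78.9204, 62.6506),
    "sky": (90.9151, 93.5363, 97.1976),
    "person": (55.8872, 78.0042, 71.6465),
    "rider": (52.8714, 74.309, 71.1507),
    "car": (68.8168, 85.3298, 80.648),
    "truck": (55.203, 87.6557, 62.9771),
    "bus": (64.4532, 88.8313, 72.5569),
    "train": (62.2083, 84.8774, 73.2919),
    "motorcycle": (50.5958, 76.0528, 66.5272),
    "bicycle": (44.8412, 72.7803, 61.6118),
}

# Assumption: standard Cityscapes "things" classes; everything else is "stuff".
THINGS = {"person", "rider", "car", "truck", "bus", "train", "motorcycle", "bicycle"}
STUFF = set(CLASS_RESULTS) - THINGS


def average_pq(class_names):
    """Unweighted mean of per-class PQ over the given classes."""
    return mean(CLASS_RESULTS[name][0] for name in class_names)


def max_product_gap():
    """Largest |PQ - SQ * RQ / 100| over all classes (values are in percent)."""
    return max(abs(pq - sq * rq / 100) for pq, sq, rq in CLASS_RESULTS.values())


if __name__ == "__main__":
    print("All   :", round(average_pq(CLASS_RESULTS), 4))
    print("Things:", round(average_pq(THINGS), 4))
    print("Stuff :", round(average_pq(STUFF), 4))
    print("Max |PQ - SQ*RQ/100|:", max_product_gap())
```

Under that split, the unweighted means reproduce the PQ row of the average results table to four decimals, and each class PQ agrees with SQ x RQ up to rounding.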