Method Details


Details for method 'Axial-DeepLab-L [Cityscapes-fine]'

 

Method overview

name Axial-DeepLab-L [Cityscapes-fine]
challenge panoptic semantic labeling
details Convolution exploits locality for efficiency at the cost of missing long-range context. Self-attention has been adopted to augment CNNs with non-local interactions. Recent works show that it is possible to stack self-attention layers to obtain a fully attentional network by restricting the attention to a local region. In this paper, we attempt to remove this constraint by factorizing 2D self-attention into two 1D self-attentions. This reduces computational complexity and allows performing attention within a larger or even global region. In addition, we propose a position-sensitive self-attention design. Combining both yields our position-sensitive axial-attention layer, a novel building block that one could stack to form axial-attention models for image classification and dense prediction. We demonstrate the effectiveness of our model on four large-scale datasets. In particular, our model outperforms all existing stand-alone self-attention models on ImageNet. Our Axial-DeepLab improves PQ by 2.8% over the bottom-up state of the art on COCO test-dev. This previous state of the art is attained by our small variant, which is 3.8x more parameter-efficient and 27x more computation-efficient. Axial-DeepLab also achieves state-of-the-art results on Mapillary Vistas and Cityscapes.
publication Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
ECCV 2020 (spotlight)
https://arxiv.org/abs/2003.07853
project page / code https://github.com/csrhddlam/axial-deeplab
used Cityscapes data fine annotations
used external data ImageNet
runtime n/a
subsampling no
submission date March, 2020
previous submissions

 

Average results

Metric All Things Stuff
PQ 62.69 53.3849 69.4572
SQ 82.2246 80.7395 83.3047
RQ 75.3426 66.0351 82.1118

 

Class results

Class PQ SQ RQ
road 98.4463 98.5762 99.8682
sidewalk 77.6845 84.9788 91.4163
building 88.5715 90.8831 97.4565
wall 37.9409 76.319 49.7136
fence 40.6441 75.4074 53.8993
pole 63.3451 70.8862 89.3617
traffic light 59.4405 75.3165 78.921
traffic sign 71.6372 80.139 89.3912
vegetation 90.7028 91.608 99.0119
terrain 45.1792 79.0123 57.18
sky 90.4375 93.225 97.01
person 53.8033 77.568 69.3628
rider 50.8249 73.2406 69.3944
car 66.7299 85.2538 78.2721
truck 47.5902 87.3097 54.5073
bus 61.1182 89.0364 68.6441
train 57.6522 83.8577 68.75
motorcycle 46.9832 77.0873 60.9481
bicycle 42.3776 72.5623 58.4017
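
The average results can be cross-checked against the class results. The sketch below is illustrative and not part of the benchmark tooling: it assumes the usual Cityscapes panoptic split (person through bicycle as the 8 "things" classes, road through sky as the 11 "stuff" classes), assumes each average is a plain unweighted mean over classes, and uses the standard panoptic quality relation PQ = SQ x RQ (with all three expressed in percent). The names STUFF, THINGS, ALL_CLASSES, average_metrics and max_gap are hypothetical helpers introduced here.

```python
# Per-class (PQ, SQ, RQ) values copied from the class results table.
# Assumption: the first 11 classes are "stuff", the last 8 are "things".
STUFF = {
    "road": (98.4463, 98.5762, 99.8682),
    "sidewalk": (77.6845, 84.9788, 91.4163),
    "building": (88.5715, 90.8831, 97.4565),
    "wall": (37.9409, 76.319, 49.7136),
    "fence": (40.6441, 75.4074, 53.8993),
    "pole": (63.3451, 70.8862, 89.3617),
    "traffic light": (59.4405, 75.3165, 78.921),
    "traffic sign": (71.6372, 80.139, 89.3912),
    "vegetation": (90.7028, 91.608, 99.0119),
    "terrain": (45.1792, 79.0123, 57.18),
    "sky": (90.4375, 93.225, 97.01),
}
THINGS = {
    "person": (53.8033, 77.568, 69.3628),
    "rider": (50.8249, 73.2406, 69.3944),
    "car": (66.7299, 85.2538, 78.2721),
    "truck": (47.5902, 87.3097, 54.5073),
    "bus": (61.1182, 89.0364, 68.6441),
    "train": (57.6522, 83.8577, 68.75),
    "motorcycle": (46.9832, 77.0873, 60.9481),
    "bicycle": (42.3776, 72.5623, 58.4017),
}
ALL_CLASSES = {**STUFF, **THINGS}


def average_metrics(classes):
    """Return the unweighted class means of (PQ, SQ, RQ)."""
    rows = list(classes.values())
    return tuple(sum(row[i] for row in rows) / len(rows) for i in range(3))


# Averages per group, to compare with the "Average results" table.
for label, group in (("All", ALL_CLASSES), ("Things", THINGS), ("Stuff", STUFF)):
    pq, sq, rq = average_metrics(group)
    print(f"{label:<7} PQ {pq:.4f}  SQ {sq:.4f}  RQ {rq:.4f}")

# Largest per-class deviation between the listed PQ and SQ * RQ / 100.
max_gap = max(abs(pq - sq * rq / 100.0) for pq, sq, rq in ALL_CLASSES.values())
print(f"max |PQ - SQ*RQ/100| over classes: {max_gap:.4f}")
```

Under these assumptions the printed averages should agree with the average results table up to rounding in the last digit, and the per-class gap should stay well below 0.01. Note that the averaged PQ is the mean of the per-class PQ values, so it does not equal the product of the averaged SQ and RQ.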

 
