Method Details
Details for method 'NCTU-ITRI'
Method overview
Field | Value |
---|---|
name | NCTU-ITRI |
challenge | pixel-level semantic labeling |
details | For fast semantic segmentation, we design a CNN-based encoder-decoder architecture called DSNet. The encoder follows the DenseNet design, and a simple decoder keeps the network efficient without degrading accuracy. We pre-train the encoder on the ImageNet dataset, then train the complete DSNet using only the finely annotated Cityscapes data (2975 training images). DSNet offers a good trade-off between accuracy and speed: it processes 68 frames per second on 1024x512 images on a single GTX 1080 Ti GPU. |
publication | Anonymous |
project page / code | |
used Cityscapes data | fine annotations |
used external data | ImageNet |
runtime | 0.0147 s (GTX 1080 Ti) |
subsampling | 2 |
submission date | July, 2018 |
previous submissions |
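The runtime, subsampling factor, and frame rate reported in the overview are related by simple arithmetic. The sketch below checks that relationship. The native image size of 2048x1024 is an assumption taken from the Cityscapes dataset, not from this page, and the helper names are illustrative only.

```python
# Values copied from the method overview table.
RUNTIME_S = 0.0147        # reported per-image runtime in seconds
SUBSAMPLING = 2           # reported subsampling factor

# Assumption (not stated on this page): native Cityscapes images are 2048x1024.
FULL_RES = (2048, 1024)   # (width, height)


def frames_per_second(runtime_s: float) -> float:
    """Convert a per-image runtime into a throughput figure."""
    return 1.0 / runtime_s


def subsampled_resolution(full_res, factor):
    """Divide both image dimensions by the subsampling factor."""
    width, height = full_res
    return (width // factor, height // factor)


fps = frames_per_second(RUNTIME_S)
input_res = subsampled_resolution(FULL_RES, SUBSAMPLING)
print(f"{fps:.1f} fps at {input_res[0]}x{input_res[1]}")
```

Under these assumptions the result agrees with the 68 frames per second at 1024x512 quoted in the method description.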
Average results
Metric | Value |
---|---|
IoU Classes | 69.1013 |
iIoU Classes | 41.4343 |
IoU Categories | 86.7618 |
iIoU Categories | 70.8417 |
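For readers unfamiliar with the metric, the following is a minimal sketch of the standard intersection-over-union score, TP / (TP + FP + FN), computed per class from a pixel confusion matrix and expressed as a percentage. The 2x2 matrix is a made-up toy example, not data from this submission, and the function name is hypothetical. The instance-weighted iIoU variant is not covered here.

```python
def iou_per_class(confusion):
    """Per-class IoU in percent.

    confusion[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    scores = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # class c missed
        fp = sum(confusion[r][c] for r in range(n)) - tp  # wrongly labeled c
        denom = tp + fp + fn
        scores.append(100.0 * tp / denom if denom else float("nan"))
    return scores


# Hypothetical toy confusion matrix with two classes.
toy = [
    [50, 10],
    [5, 35],
]
scores = iou_per_class(toy)
mean_iou = sum(scores) / len(scores)
print([round(s, 2) for s in scores], round(mean_iou, 2))
```

An average score such as "IoU Classes" is then the unweighted mean of the per-class values.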
Class results
Class | IoU | iIoU |
---|---|---|
road | 97.9662 | - |
sidewalk | 82.0702 | - |
building | 90.3721 | - |
wall | 42.3574 | - |
fence | 45.8075 | - |
pole | 56.3838 | - |
traffic light | 61.1151 | - |
traffic sign | 66.6427 | - |
vegetation | 91.7103 | - |
terrain | 70.0256 | - |
sky | 94.2527 | - |
person | 77.2477 | 56.8646 |
rider | 59.1464 | 33.1569 |
car | 93.2486 | 85.5851 |
truck | 49.5487 | 19.2678 |
bus | 59.3626 | 33.2648 |
train | 56.3247 | 31.9048 |
motorcycle | 53.5447 | 27.0271 |
bicycle | 65.7977 | 44.4035 |
Category results
Category | IoU | iIoU |
---|---|---|
flat | 98.2682 | - |
nature | 91.4397 | - |
object | 62.8912 | - |
sky | 94.2527 | - |
construction | 90.4360 | - |
human | 77.5916 | 58.3859 |
vehicle | 92.4530 | 83.2974 |
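The four average results appear to be unweighted means of the per-class and per-category values. The sketch below recomputes them from the numbers in the two tables as a consistency check. Treating the averages as plain arithmetic means is an assumption, and the variable names are illustrative.

```python
# Per-class scores copied from the "Class results" table.
class_iou = {
    "road": 97.9662, "sidewalk": 82.0702, "building": 90.3721,
    "wall": 42.3574, "fence": 45.8075, "pole": 56.3838,
    "traffic light": 61.1151, "traffic sign": 66.6427,
    "vegetation": 91.7103, "terrain": 70.0256, "sky": 94.2527,
    "person": 77.2477, "rider": 59.1464, "car": 93.2486,
    "truck": 49.5487, "bus": 59.3626, "train": 56.3247,
    "motorcycle": 53.5447, "bicycle": 65.7977,
}
# iIoU is reported only for classes that have instances.
class_iiou = {
    "person": 56.8646, "rider": 33.1569, "car": 85.5851, "truck": 19.2678,
    "bus": 33.2648, "train": 31.9048, "motorcycle": 27.0271,
    "bicycle": 44.4035,
}
# Per-category scores copied from the "Category results" table.
category_iou = {
    "flat": 98.2682, "nature": 91.4397, "object": 62.8912, "sky": 94.2527,
    "construction": 90.436, "human": 77.5916, "vehicle": 92.453,
}
category_iiou = {"human": 58.3859, "vehicle": 83.2974}


def mean(values):
    """Unweighted arithmetic mean."""
    values = list(values)
    return sum(values) / len(values)


summary = {
    "IoU Classes": mean(class_iou.values()),
    "iIoU Classes": mean(class_iiou.values()),
    "IoU Categories": mean(category_iou.values()),
    "iIoU Categories": mean(category_iiou.values()),
}
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The recomputed means agree with the reported average results to within rounding of the fourth decimal place.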