Method Details
Details for method 'NCTU-ITRI'
Method overview
| Field | Value |
|---|---|
| name | NCTU-ITRI |
| challenge | pixel-level semantic labeling |
| details | For fast semantic segmentation, we design a CNN-based encoder-decoder architecture called DSNet. The encoder is built on the DenseNet design, and a simple decoder keeps the network efficient without degrading accuracy. We pre-train the encoder on ImageNet, then train the complete DSNet using only the finely annotated Cityscapes data (2975 training images). DSNet offers a good trade-off between accuracy and speed: it processes 68 frames per second on 1024x512 images on a single GTX 1080 Ti GPU. |
| publication | Anonymous |
| project page / code | |
| used Cityscapes data | fine annotations |
| used external data | ImageNet |
| runtime | 0.0147 s (GTX 1080 Ti) |
| subsampling | 2 |
| submission date | July, 2018 |
| previous submissions | |
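The runtime, subsampling, and frame-rate figures in the overview are related by simple arithmetic. The sketch below is only a consistency check, not part of the submission: it inverts the reported per-frame runtime and derives the network input size, assuming the native Cityscapes image size of 2048x1024. The function names are hypothetical.

```python
# Consistency check for the speed figures in the method overview.
RUNTIME_S = 0.0147        # "runtime" row, measured on a GTX 1080 Ti
SUBSAMPLING = 2           # "subsampling" row
FULL_RES = (2048, 1024)   # assumption: native Cityscapes size (width, height)


def frames_per_second(runtime_s: float) -> float:
    """Throughput is the reciprocal of the per-frame runtime."""
    if runtime_s <= 0:
        raise ValueError("runtime must be positive")
    return 1.0 / runtime_s


def subsampled_size(full_res, factor):
    """Divide both image dimensions by the subsampling factor."""
    width, height = full_res
    return (width // factor, height // factor)


fps = frames_per_second(RUNTIME_S)
input_size = subsampled_size(FULL_RES, SUBSAMPLING)
print(f"{fps:.1f} fps at {input_size[0]}x{input_size[1]}")  # 68.0 fps at 1024x512
```

Both results match the description: roughly 68 frames per second on 1024x512 inputs.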
Average results
| Metric | Value (%) |
|---|---|
| IoU Classes | 69.1013 |
| iIoU Classes | 41.4343 |
| IoU Categories | 86.7618 |
| iIoU Categories | 70.8417 |
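For readers unfamiliar with the metric, IoU for a class is TP / (TP + FP + FN), accumulated over all evaluated pixels. The following is a minimal sketch of that definition on a made-up two-class confusion matrix. The matrix and the function name are illustrative assumptions, and the benchmark's own evaluation additionally handles ignored labels and the instance-weighted iIoU variant, which are not reproduced here.

```python
def per_class_iou(confusion):
    """Return IoU per class from a square confusion matrix.

    confusion[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # class c pixels that were missed
        fp = sum(confusion[r][c] for r in range(n)) - tp  # other pixels predicted as c
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


# Toy data, NOT taken from this submission.
toy_confusion = [
    [50, 10],
    [5, 35],
]
ious = per_class_iou(toy_confusion)
mean_iou_percent = 100.0 * sum(ious) / len(ious)
print([round(v, 4) for v in ious], round(mean_iou_percent, 2))
```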
Class results
| Class | IoU (%) | iIoU (%) |
|---|---|---|
| road | 97.9662 | - |
| sidewalk | 82.0702 | - |
| building | 90.3721 | - |
| wall | 42.3574 | - |
| fence | 45.8075 | - |
| pole | 56.3838 | - |
| traffic light | 61.1151 | - |
| traffic sign | 66.6427 | - |
| vegetation | 91.7103 | - |
| terrain | 70.0256 | - |
| sky | 94.2527 | - |
| person | 77.2477 | 56.8646 |
| rider | 59.1464 | 33.1569 |
| car | 93.2486 | 85.5851 |
| truck | 49.5487 | 19.2678 |
| bus | 59.3626 | 33.2648 |
| train | 56.3247 | 31.9048 |
| motorcycle | 53.5447 | 27.0271 |
| bicycle | 65.7977 | 44.4035 |
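The reported "IoU Classes" and "iIoU Classes" averages appear to be unweighted means of the per-class values above. As a quick check, this sketch averages the 19 class IoUs and the 8 available iIoU values, all copied from the table; the variable names are hypothetical.

```python
# Per-class results copied from the table: (IoU, iIoU or None where not reported).
CLASS_RESULTS = {
    "road": (97.9662, None), "sidewalk": (82.0702, None),
    "building": (90.3721, None), "wall": (42.3574, None),
    "fence": (45.8075, None), "pole": (56.3838, None),
    "traffic light": (61.1151, None), "traffic sign": (66.6427, None),
    "vegetation": (91.7103, None), "terrain": (70.0256, None),
    "sky": (94.2527, None), "person": (77.2477, 56.8646),
    "rider": (59.1464, 33.1569), "car": (93.2486, 85.5851),
    "truck": (49.5487, 19.2678), "bus": (59.3626, 33.2648),
    "train": (56.3247, 31.9048), "motorcycle": (53.5447, 27.0271),
    "bicycle": (65.7977, 44.4035),
}

iou_values = [iou for iou, _ in CLASS_RESULTS.values()]
iiou_values = [iiou for _, iiou in CLASS_RESULTS.values() if iiou is not None]

mean_class_iou = sum(iou_values) / len(iou_values)
mean_class_iiou = sum(iiou_values) / len(iiou_values)

print(f"mean IoU over {len(iou_values)} classes:  {mean_class_iou:.4f}")
print(f"mean iIoU over {len(iiou_values)} classes: {mean_class_iiou:.4f}")
```

Both means agree with the averages table (69.1013 and 41.4343) to within rounding.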
Category results
| Category | IoU (%) | iIoU (%) |
|---|---|---|
| flat | 98.2682 | - |
| nature | 91.4397 | - |
| object | 62.8912 | - |
| sky | 94.2527 | - |
| construction | 90.4360 | - |
| human | 77.5916 | 58.3859 |
| vehicle | 92.4530 | 83.2974 |
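The category-level averages can be checked against this table in the same way. The sketch below assumes unweighted means, uses only the values listed above, and also reports the gap between IoU and iIoU for the two categories that have both. The names are hypothetical.

```python
from statistics import fmean

# Category results copied from the table: (IoU, iIoU or None where not reported).
CATEGORY_RESULTS = {
    "flat": (98.2682, None),
    "nature": (91.4397, None),
    "object": (62.8912, None),
    "sky": (94.2527, None),
    "construction": (90.436, None),
    "human": (77.5916, 58.3859),
    "vehicle": (92.453, 83.2974),
}

mean_category_iou = fmean(iou for iou, _ in CATEGORY_RESULTS.values())
mean_category_iiou = fmean(
    iiou for _, iiou in CATEGORY_RESULTS.values() if iiou is not None
)

# Gap between pixel-level IoU and instance-level iIoU, where both exist.
gaps = {
    name: iou - iiou
    for name, (iou, iiou) in CATEGORY_RESULTS.items()
    if iiou is not None
}

print(f"mean category IoU:  {mean_category_iou:.4f}")
print(f"mean category iIoU: {mean_category_iiou:.4f}")
for name, gap in gaps.items():
    print(f"{name}: IoU exceeds iIoU by {gap:.4f} points")
```

The means agree with the reported 86.7618 and 70.8417 to within rounding. The IoU-to-iIoU gap is larger for human than for vehicle.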
