Method Details

Details for method 'NCTU-ITRI'

name	NCTU-ITRI
challenge	pixel-level semantic labeling
details	For fast semantic segmentation, we design a CNN-based encoder-decoder architecture called DSNet. The encoder is built on the DenseNet concept, and a simple decoder keeps the network efficient without degrading accuracy. We pre-train the encoder on the ImageNet dataset. Then only the finely annotated Cityscapes dataset (2975 training images) is used to train the complete DSNet. DSNet offers a good trade-off between accuracy and speed: it processes 68 frames per second on 1024x512 images on a single GTX 1080 Ti GPU.
publication	Anonymous
project page / code
used Cityscapes data	fine annotations
used external data	ImageNet
runtime	0.0147 s (GTX 1080 Ti)
subsampling	2
submission date	July, 2018
previous submissions
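
The reported figures can be cross-checked with a little arithmetic. The sketch below is not part of the submission. It assumes the standard full Cityscapes resolution of 2048x1024, and the helper names `frames_per_second` and `subsampled_size` are hypothetical, introduced only for illustration. It shows that a runtime of 0.0147 s per image corresponds to roughly 68 frames per second, and that a subsampling factor of 2 gives the 1024x512 input size mentioned in the details.

```python
# Consistency check of the listed runtime, frame rate, and input size.
# Assumption: full-resolution Cityscapes images are 2048x1024 pixels.
FULL_WIDTH, FULL_HEIGHT = 2048, 1024


def frames_per_second(runtime_s: float) -> float:
    """Convert a per-image runtime in seconds to frames per second."""
    if runtime_s <= 0:
        raise ValueError("runtime must be positive")
    return 1.0 / runtime_s


def subsampled_size(width: int, height: int, factor: int) -> tuple[int, int]:
    """Return the image size after subsampling each dimension by `factor`."""
    if factor <= 0:
        raise ValueError("subsampling factor must be positive")
    return width // factor, height // factor


fps = frames_per_second(0.0147)                       # listed runtime
width, height = subsampled_size(FULL_WIDTH, FULL_HEIGHT, 2)  # listed subsampling
print(f"{fps:.1f} fps at {width}x{height}")  # prints "68.0 fps at 1024x512"
```

The runtime, the frame rate, and the subsampling factor on this page are therefore mutually consistent under the stated resolution assumption.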