Abstract:In this paper, a deep learning-based image semantic segmentation method was studied. A neural network trained by point pair annotations of relative depth was used to predict depth images from common color images. By feeding the color and depth images into a fully convolutional networks with atrous convolution, accurate segmentation of the images could be obtained. As different representations of object properties, concatenate operation on the feature maps instead of traditional adding operation was used to fuse them. The differences between these two representations could be preserved when they were feed into the next convolutional layers. Experimental results on two different datasets show that, performance of semantic segmentation can be improved by the proposed method.