A Cross-modal Adaptive Gated Fusion Generative Adversarial Network for RGB-D Salient Object Detection

Publication date: Available online 15 January 2020Source: NeurocomputingAuthor(s): Zhengyi Liu, Wei Zhang, Peng ZhaoAbstractSalient object detection in RGB-D images aims to identify the most attractive objects in a pair of color and depth images for the observer. As an important branch of salient object detection, it focuses on solving the following two major challenges: how to achieve cross-modal fusion that is efficient and beneficial for salient object detection; how to effectively extract the information of depth image with relatively poor quality. This paper proposes a cross-modal adaptive gated fusion generative adversarial network for RGB-D salient object detection by using color and depth images. Specifically, the generator network adopts double-stream encoder-decoder network and receives RGB and depth images at the same time. The proposed depthwise separable residual convolution module is used to deal with deep semantic information, and the processed feature is combined with side-output features of the encoder network progressively. In order to compensate for the shortcoming of poor quality of the depth image, the proposed method adds the cross-modal guidance from the side-output features of the RGB stream to the decoder network of depth stream. The discriminator network adaptively fuses the features of double streams using a gated fusion module, then sends the gated fusion saliency map to the discriminator to distinguish the similarity from ground-truth map. Adversa...
Source: Neurocomputing - Category: Neuroscience Source Type: research