Multimodal Multiclass Boosting and Its Application to Cross-modal Retrieval

Publication date: Available online 16 May 2019
Source: Neurocomputing
Author(s): Shixun Wang, Zhi Dou, Deng Chen, Hairong Yu, Yuan Li, Peng Pan

Abstract

Although boosting has proved to be a very successful ensemble learning technique, conventional boosting methods are limited to two classes or a single modality. In this paper, to handle the multiclass setting and heterogeneous modalities, we propose a multimodal multiclass boosting framework called MMBoost, which captures intra-modal semantic information and inter-modal semantic correlation at the same time. By using the multiclass exponential and logistic loss functions, we obtain two versions of MMBoost, namely MMBoost_exp and MMBoost_log. The empirical risk, which considers the intra-modal and inter-modal losses simultaneously, is minimized by gradient descent in multidimensional functional spaces. More concretely, the optimization problem is solved for each modality in turn. The posterior probability of a semantic category is obtained naturally by applying the sigmoid function to the multiclass margin. A series of experiments on the Wiki and NUS-WIDE datasets demonstrates that our proposed method significantly outperforms existing boosting approaches for cross-modal retrieval.
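The abstract names two margin-based losses and a sigmoid mapping from margin to posterior, but it does not define the multiclass margin itself. The sketch below is therefore only illustrative: it treats the margin as a given scalar and uses the standard exponential and logistic loss forms, which are assumptions here and not necessarily the exact definitions used in MMBoost. All function names are hypothetical.

```python
import math


def sigmoid(margin):
    """Map a scalar margin to (0, 1), computed in a numerically stable way."""
    if margin >= 0:
        return 1.0 / (1.0 + math.exp(-margin))
    e = math.exp(margin)
    return e / (1.0 + e)


def exponential_loss(margin):
    """Standard exponential loss exp(-m) (assumed form, not taken from the paper)."""
    return math.exp(-margin)


def logistic_loss(margin):
    """Standard logistic loss log(1 + exp(-m)), computed stably (assumed form)."""
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical margin values, for illustration only.
margins = [-2.0, 0.0, 3.0]
posteriors = [sigmoid(m) for m in margins]
exp_losses = [exponential_loss(m) for m in margins]
log_losses = [logistic_loss(m) for m in margins]
```

Under these assumed forms, the logistic loss equals the negative log of the sigmoid of the margin, which is one way to see why a sigmoid of the margin can be read as a posterior probability: a larger margin gives both a smaller loss and a posterior closer to 1.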
Source: Neurocomputing - Category: Neuroscience - Source Type: research