Multi-Resolution Model Compression for Deep Neural Networks: A Variational Bayesian Approach

The continuously growing size of deep neural networks (DNNs) has sparked a surge in research on model compression techniques. Among these techniques, multi-resolution model compression has emerged as a promising approach that can generate multiple DNN models with shared weights and different computational complexities (resolutions) through a single training run. However, in most existing multi-resolution compression methods, the model structures for different resolutions are either predefined or uniformly controlled. This can degrade performance, because such methods do not perform systematic compression to obtain the optimal model for each resolution. In this paper, we propose to perform multi-resolution compression from a Bayesian perspective. We design a resolution-aware likelihood and a two-layer prior for the channel masks, which allow joint optimization of the shared weights and the model structure of each resolution. To solve the resulting Bayesian inference problem, we develop a low-complexity partial-update block variational Bayesian inference (PUB-VBI) algorithm. Furthermore, we extend our proposed method to the arbitrary-resolution case by introducing an auxiliary neural network (NN) that learns the mapping from the input resolution to the corresponding channel masks. Simulation results show that our proposed method outperforms the baselines on various NN models and datasets.
Source: IEEE Transactions on Signal Processing
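
To make the shared-weight, mask-based architecture described in the abstract concrete, the following is a minimal PyTorch sketch, not the authors' implementation: it assumes a hypothetical `MaskGenerator` MLP that maps a scalar resolution to soft channel masks, which then gate the output channels of a convolutional layer whose weights are shared across all resolutions. All class names, layer sizes, and the soft-mask parameterization are illustrative assumptions.

```python
# Illustrative sketch only (not the paper's PUB-VBI method): an auxiliary MLP
# maps a target resolution to per-layer channel masks that gate a shared-weight
# convolutional layer. Names and shapes are hypothetical.
import torch
import torch.nn as nn


class MaskGenerator(nn.Module):
    """Maps a scalar resolution (compression level) to soft channel masks."""

    def __init__(self, num_channels: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_channels),
            nn.Sigmoid(),  # soft mask in (0, 1); could be thresholded at inference
        )

    def forward(self, resolution: torch.Tensor) -> torch.Tensor:
        return self.net(resolution.view(-1, 1))


class MaskedConvBlock(nn.Module):
    """Conv layer with weights shared across resolutions; outputs gated by a mask."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.mask_gen = MaskGenerator(out_ch)

    def forward(self, x: torch.Tensor, resolution: torch.Tensor) -> torch.Tensor:
        mask = self.mask_gen(resolution)      # (1, out_ch) channel mask for this resolution
        y = self.conv(x)                      # same weights serve every resolution
        return y * mask.view(1, -1, 1, 1)     # zero out (prune) masked channels


# Usage: one set of weights, two resolutions; only the channel mask changes.
block = MaskedConvBlock(in_ch=3, out_ch=16)
x = torch.randn(1, 3, 32, 32)
y_low = block(x, torch.tensor([0.25]))   # heavily pruned (low-resolution) model
y_high = block(x, torch.tensor([1.0]))   # full-resolution model
```

In the paper's Bayesian formulation, the channel masks are treated as random variables with a two-layer prior and inferred jointly with the shared weights via PUB-VBI; the sketch above only shows the deterministic masking and the resolution-to-mask mapping that the auxiliary NN is described as learning.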