Citation: Ming-Yang Zhang, Xin-Yi Yu, Lin-Lin Ou. Effective Model Compression via Stage-wise Pruning. Machine Intelligence Research, vol. 20, no. 6, pp. 937-951, 2023. https://doi.org/10.1007/s11633-022-1357-9

Effective Model Compression via Stage-wise Pruning

doi: 10.1007/s11633-022-1357-9
More Information
  • Author Bio:

    Ming-Yang Zhang received the B. Sc. degree in automation from Zhejiang University City College, China in 2017. He is currently a Ph. D. degree candidate in control theory and engineering at the Department of Information and Engineering, Zhejiang University of Technology, China. His research interests include model compression, neural architecture search and machine learning. E-mail: 1111903012@zjut.edu.cn (Corresponding author) ORCID iD: 0000-0001-7862-0566

    Xin-Yi Yu received the B. Sc. degree from Harbin University of Science and Technology (HUST), China in 2002, the M. Sc. degree from HUST, China in 2005, and the Ph. D. degree from Harbin Institute of Technology, China in 2009. His research interest is robotics and automation, especially the development and industrialization of industrial robots. E-mail: yuxy@zjut.edu.cn

    Lin-Lin Ou received the Ph. D. degree in control theory and engineering from Shanghai Jiao Tong University, China in 2006. She is currently a professor with the Department of Automation, Zhejiang University of Technology, China. Her research interests include theoretical aspects of time-delayed control systems, applications to industrial process control, robot control and cooperative control. E-mail: linlinou@zjut.edu.cn

  • Received Date: 2022-05-11
  • Accepted Date: 2022-07-14
  • Publish Date: 2023-12-01
  • Abstract: Automated machine learning (AutoML) pruning methods aim to search for a pruning strategy automatically to reduce the computational complexity of deep convolutional neural networks (deep CNNs). However, previous work has found that the results of many AutoML pruning methods cannot even surpass those of the uniform pruning method. In this paper, the ineffectiveness of AutoML pruning is shown to be caused by the unfull and unfair training of the supernet. A deep supernet suffers from unfull training because it contains too many candidates. To overcome the unfull training, a stage-wise pruning (SWP) method is proposed, which splits a deep supernet into several stage-wise supernets to reduce the number of candidates and utilizes inplace distillation to supervise the stage training. In addition, a wide supernet suffers from unfair training because the sampling probability of each channel is unequal. Therefore, the fullnet and the tinynet are sampled in each training iteration to ensure that each channel is fully trained. Remarkably, the proxy performance of the subnets trained with SWP is closer to the actual performance than in most previous AutoML pruning work. Furthermore, experiments show that SWP achieves state-of-the-art results on both CIFAR-10 and ImageNet under the mobile setting.
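
    A minimal PyTorch-style sketch of the sampling scheme described in the abstract is given below: the fullnet and the tinynet are both updated in every iteration, and the fullnet's predictions serve as the inplace-distillation teacher for the tinynet. The slimmable-model hook set_width_mult() and the attribute min_width_mult are illustrative assumptions, not the authors' actual interface.

        import torch.nn.functional as F

        def train_step(model, images, labels, optimizer):
            # One training iteration: fullnet + tinynet sampling with
            # inplace distillation (illustrative sketch only).
            optimizer.zero_grad()

            # Fullnet (all channels): trained on the ground-truth labels;
            # its logits act as the in-place teacher below.
            model.set_width_mult(1.0)                   # assumed hook
            full_logits = model(images)
            F.cross_entropy(full_logits, labels).backward()

            # Tinynet (narrowest subnet): trained to match the fullnet's
            # soft predictions, so the least-sampled channels still
            # receive gradients in every iteration.
            model.set_width_mult(model.min_width_mult)  # assumed attribute
            tiny_logits = model(images)
            F.kl_div(F.log_softmax(tiny_logits, dim=1),
                     F.softmax(full_logits.detach(), dim=1),
                     reduction="batchmean").backward()

            optimizer.step()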

     

  • 1) The multiprocessing module is used to start the child process and execute our customized tasks in the child process. https://github.com/python/cpython/tree/3.9/Lib/multiprocessing/
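
    For reference, the child process mentioned in the footnote can be launched with the standard-library multiprocessing API roughly as follows. The task body evaluate_subnet() and its arguments are only hypothetical placeholders for the customized work that the child process would execute.

        from multiprocessing import Process, Queue

        def evaluate_subnet(config, result_queue):
            # Hypothetical placeholder for the customized task executed in
            # the child process (e.g., building and scoring one pruned subnet).
            result_queue.put({"config": config, "score": None})

        if __name__ == "__main__":
            results = Queue()
            worker = Process(target=evaluate_subnet,
                             args=({"width_mult": 0.5}, results))
            worker.start()          # launch the child process
            print(results.get())    # collect the result before joining
            worker.join()           # wait for the child process to exit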