Lin Song, Jin-Fu Yang, Qing-Zhen Shang, Ming-Ai Li. Dense Face Network: A Dense Face Detector Based on Global Context and Visual Attention Mechanism. Machine Intelligence Research, vol. 19, no. 3, pp.247-256, 2022. https://doi.org/10.1007/s11633-022-1327-2

Dense Face Network: A Dense Face Detector Based on Global Context and Visual Attention Mechanism

doi: 10.1007/s11633-022-1327-2
More Information
  • Author Bio:

    Lin Song received the B.Sc. degree in measurement and control technology and instrumentation from YanTai University, China in 2019. She is now a master's student with the Department of Control Science and Engineering, Beijing University of Technology, China. Her research interests include deep learning and computer vision. E-mail: songlin@emails.bjut.edu.cn (Corresponding author) ORCID iD: 0000-0003-2289-7325

    Jin-Fu Yang received the Ph.D. degree in pattern recognition and intelligent systems from the National Laboratory of Pattern Recognition, Chinese Academy of Sciences, China in 2006. He is now a professor with the Faculty of Information Technology and the Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, China. His research interests include pattern recognition, computer vision and robot navigation. E-mail: jfyang@bjut.edu.cn

    Qing-Zhen Shang received the M.Eng. degree in mathematics from Hebei University, China in 2017. She is a Ph.D. candidate at the Department of Control Science and Engineering, Beijing University of Technology, China. Her research interests include deep learning and computer vision. E-mail: shangqingzhen@emails.bjut.edu.cn

    Ming-Ai Li received the Ph.D. degree from Beijing University of Technology, China in 2006. She is now a professor with the Faculty of Information Technology and the Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, China. Her research interests include brain-computer interfaces, intelligent control, pattern recognition and the implementation of autonomous learning control technology for flexible two-wheeled upstanding robots. E-mail: limingai@bjut.edu.cn

  • Received Date: 2021-08-30
  • Accepted Date: 2022-03-01
  • Publish Online: 2022-03-29
  • Publish Date: 2022-05-25
  • Abstract: Face detection has made tremendous strides thanks to convolutional neural networks. However, dense face detection remains an open challenge due to large variation in face scale, tiny faces, and serious occlusion. This paper presents a robust dense face detector that uses global context and visual attention mechanisms to significantly improve detection accuracy. Specifically, a global context fusion module with top-down feedback is proposed to improve the ability to identify tiny faces. Moreover, a visual attention mechanism is employed to address occlusion. Experimental results on the public face datasets WIDER FACE and FDDB demonstrate the effectiveness of the proposed method.
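
    As a rough illustration of the two ideas named in the abstract, the sketch below shows a generic top-down feature fusion and a parameter-free channel gate in NumPy. It is not the authors' module: the function names, the nearest-neighbour upsampling, the additive fusion, and the sigmoid gate without learned weights are all assumptions chosen to keep the example small.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def top_down_fuse(features):
    """Fuse pyramid levels from coarse to fine (assumed additive fusion).

    `features` is ordered fine -> coarse; each level has shape (C, H, W),
    with H and W halving from one level to the next. The result uses the
    same ordering and shapes.
    """
    fused = [features[-1]]  # the coarsest level is passed through unchanged
    for fine in reversed(features[:-1]):
        # Feed the coarser, more context-rich map back into the finer one.
        fused.append(fine + upsample2x(fused[-1]))
    return fused[::-1]


def channel_attention(x):
    """Reweight channels with a sigmoid gate on their global average.

    A real attention block would learn this gate; here it is parameter-free.
    """
    gate = 1.0 / (1.0 + np.exp(-x.mean(axis=(1, 2))))
    return x * gate[:, None, None]


if __name__ == "__main__":
    pyramid = [np.ones((4, 8, 8)), np.ones((4, 4, 4)), np.ones((4, 2, 2))]
    fused = top_down_fuse(pyramid)
    attended = [channel_attention(level) for level in fused]
    print([level.shape for level in attended])
```

    In this sketch the finest map receives context from every coarser level, which is the general intuition behind using top-down feedback for tiny objects; the gate then rescales each channel as a whole.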

    Figures(10)  / Tables(3)