Citation: Wenjun Hui, Guanghua Gu, Bo Wang. Shallow Feature-driven Dual-edges Localization Network for Weakly Supervised Localization. Machine Intelligence Research, vol. 20, no. 6, pp. 923-936, 2023. https://doi.org/10.1007/s11633-022-1368-6