Efficient Visual Recognition: A Survey on Recent Advances and Brain-inspired Methodologies

Visual recognition is currently one of the most important and active research areas in computer vision, pattern recognition, and even the general field of artificial intelligence. It is of great fundamental importance and in strong industrial demand. In particular, modern deep neural networks (DNNs) and some brain-inspired methodologies have largely boosted recognition performance on many concrete tasks, with the help of large amounts of training data and powerful new computational resources. Although recognition accuracy is usually the first concern for new progress, efficiency is also important and sometimes critical for both academic research and industrial applications. This paper, by the research teams of Dr. Yang Wu from Tencent, Dr. Guo-Qi Li from the Institute of Automation, Chinese Academy of Sciences, Dr. Ding-Heng Wang from Xi’an Jiaotong University, Prof. Wei-Sheng Dong from Xidian University, and Prof. Jian-Bo Shi from the University of Pennsylvania, provides the first survey on efficient visual recognition algorithms with DNNs, particularly brain-inspired methodologies, including event data and spiking neural networks (SNNs). It aims at a systematic overview of recent advances and trends, covering the major types of visual data, their various recognition models, network compression algorithms, and efficient inference. The full text is available open access in the fifth issue of Machine Intelligence Research.

 


Deep neural networks (DNNs) have achieved great success in many visual recognition tasks. They have largely improved performance on long-standing problems such as handwritten digit recognition, face recognition, and image categorization. They are also enabling the exploration of new application boundaries, including studies on image and video captioning, body pose estimation, and many others. However, such successes generally depend on huge amounts of high-quality hand-labelled training data and on computational resources that have only recently become so powerful.

 

These two conditions are usually too expensive to satisfy in most cost-sensitive applications. Even when enough high-quality training data is available, thanks to the massive efforts of many annotators, it is usually a great challenge to train an effective model with limited resources and within an acceptable time.

 

Assuming that the model can somehow be properly trained (no matter how much effort it takes), it is still not easy to deploy it properly for real applications on the end users’ side. Run-time inference has to fit the available or affordable resources, and the running speed has to meet the actual needs, which may be real-time or even faster. Therefore, besides accuracy, which is usually the biggest concern in academia, efficiency is another important issue and, in most cases, an indispensable requirement for real applications.

 

Though most of the research on using DNNs for visual recognition tasks focuses on accuracy, there has still been much encouraging progress on the efficiency side, especially in the last few years. For example, some survey papers have been published on efficiency issues for DNNs, as detailed in Section 1.1 of the paper.

 

However, none of them pays major attention to visual recognition tasks. In particular, they lack coverage of special efforts to deal efficiently with visual data, which has its own properties. They also lack discussion of the so-called third generation of efficient neural network models, which are inspired by the human brain, i.e., spiking neural networks (SNNs).

 

In practice, efficient visual recognition has to be a systematic solution that takes into account not only compact/compressed networks, efficient dynamic inference, and hardware acceleration, but also proper handling of visual data, which may be of various types (such as images, videos, points, and brain-inspired events) with quite different properties.

 

Therefore, this paper provides the first survey on efficient visual recognition algorithms with DNNs, particularly brain-inspired methodologies, including event data and SNNs. It aims at a systematic overview of recent advances and trends from various aspects, based on the authors’ expertise and experience with the major types of visual data, their various recognition models, network compression algorithms, and efficient inference.

 

 

Download full text

Efficient Visual Recognition: A Survey on Recent Advances and Brain-inspired Methodologies

Yang Wu, Ding-Heng Wang, Xiao-Tong Lu, Fan Yang, Man Yao, Wei-Sheng Dong, Jian-Bo Shi, Guo-Qi Li

https://link.springer.com/article/10.1007/s11633-022-1340-5

https://www.mi-research.net/en/article/doi/10.1007/s11633-022-1340-5

  

Release Date: 2022-10-12