Volume 19, Number 5, 2022

Editorial for Special Issue on Brain-inspired Machine Learning
Zhaoxiang Zhang, Bin Luo, Jin Tang, Shan Yu, Amir Hussain
2022, vol. 19, no. 5, pp. 347-349, doi: 10.1007/s11633-022-1376-6
Neural Decoding of Visual Information Across Different Neural Recording Modalities and Approaches
Yi-Jun Zhang, Zhao-Fei Yu, Jian K. Liu, Tie-Jun Huang
2022, vol. 19, no. 5, pp. 350-365, doi: 10.1007/s11633-022-1335-2
Vision plays a peculiar role in intelligence. Visual information, forming a large part of the sensory information, is fed into the human brain to formulate various types of cognition and behaviours that make humans intelligent agents. Recent advances have led to the development of brain-inspired algorithms and models for machine vision. One of the key components of these methods is the utilization of the computational principles underlying biological neurons. Additionally, advanced experimental neuroscience techniques have generated different types of neural signals that carry essential visual information. Thus, there is a high demand for mapping out functional models for reading out visual information from neural signals. Here, we briefly review recent progress on this issue, focusing on how machine learning techniques can help in the development of models for contending with various types of neural signals, from fine-scale neural spikes and single-cell calcium imaging to coarse-scale electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) recordings of brain signals.
Efficient Visual Recognition: A Survey on Recent Advances and Brain-inspired Methodologies
Yang Wu, Ding-Heng Wang, Xiao-Tong Lu, Fan Yang, Man Yao, Wei-Sheng Dong, Jian-Bo Shi, Guo-Qi Li
2022, vol. 19, no. 5, pp. 366-411, doi: 10.1007/s11633-022-1340-5
Visual recognition is currently one of the most important and active research areas in computer vision, pattern recognition, and even the general field of artificial intelligence. It has great fundamental importance and strong industrial demand. In particular, modern deep neural networks (DNNs) and some brain-inspired methodologies have largely boosted the recognition performance on many concrete tasks, with the help of large amounts of training data and new powerful computational resources. Although recognition accuracy is usually the first concern for new developments, efficiency is actually rather important and sometimes critical for both academic research and industrial applications. Moreover, insightful views on the opportunities and challenges of efficiency are also highly required for the entire community. While general surveys on the efficiency issue have been conducted from various perspectives, as far as we are aware, scarcely any of them has focused on visual recognition systematically, and thus it is unclear which advances are applicable to it and what else should be considered. In this survey, we present a review of recent advances together with our suggestions on possible new directions towards improving the efficiency of DNN-related and brain-inspired visual recognition approaches, including efficient network compression and dynamic brain-inspired networks. We investigate not only from the model but also from the data point of view (which is not the case in existing surveys) and focus on four typical data types (images, video, points, and events). This survey attempts to provide a systematic summary that can serve as a valuable reference and inspire both researchers and practitioners working on visual recognition problems.
Towards a New Paradigm for Brain-inspired Computer Vision
Xiao-Long Zou, Tie-Jun Huang, Si Wu
2022, vol. 19, no. 5, pp. 412-424, doi: 10.1007/s11633-022-1370-z
Brain-inspired computer vision aims to learn from biological systems to develop advanced image processing techniques. However, its progress so far has not been impressive. We recognize that a main obstacle is that the current paradigm for brain-inspired computer vision has not captured the fundamental nature of biological vision, i.e., that biological vision is geared towards processing spatio-temporal patterns. Recently, a new paradigm for developing brain-inspired computer vision has been emerging, which emphasizes the spatio-temporal nature of visual signals and brain-inspired models for processing this type of data. In this paper, we review some recent pioneering works towards this new paradigm, including the development of spike cameras, which acquire spiking signals directly from visual scenes, and the development of computational models learned from neural systems that are specialized to process spatio-temporal patterns, including models for object detection, tracking, and recognition. We also discuss future directions for improving the paradigm.
Research Article
Clause-level Relationship-aware Math Word Problems Solver
Chang-Yang Wu, Xin Lin, Zhen-Ya Huang, Yu Yin, Jia-Yu Liu, Qi Liu, Gang Zhou
2022, vol. 19, no. 5, pp. 425-438, doi: 10.1007/s11633-022-1351-2
Automatically solving math word problems, which involves comprehension, cognition, and reasoning, is a crucial issue in artificial intelligence research. Existing math word problem solvers mainly work on word-level relationship extraction and the generation of expression solutions while lacking consideration of the clause-level relationship. To this end, inspired by the theory of two levels of processing in comprehension, we propose a novel clause-level relationship-aware math solver (CLRSolver) to mimic the process of human comprehension from the lower level to the higher level. Specifically, in the lower-level processes, we split problems into clauses according to their natural division and learn their semantics. In the higher-level processes, following humans' multi-view understanding of clause-level relationships, we first apply a CNN-based module to learn the dependency relationships between clauses from word relevance in a local view. Then, we propose two novel relationship-aware mechanisms to learn dependency relationships from the clause semantics in a global view. Next, we enhance the representation of clauses based on the learned clause-level dependency relationships. In expression generation, we develop a tree-based decoder to generate the mathematical expression. We conduct extensive experiments on two datasets, where the results demonstrate the superiority of our framework.
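The clause-splitting and global relationship-aware enhancement described in the abstract can be illustrated with a toy sketch. All names, the bag-of-words embeddings, and the cosine-similarity weighting below are illustrative assumptions, not the paper's actual architecture (which learns semantics with neural encoders):

```python
import re
import numpy as np

def split_into_clauses(problem):
    """Split a word problem into clauses at natural punctuation boundaries."""
    clauses = re.split(r"[,.;?]", problem)
    return [c.strip() for c in clauses if c.strip()]

def clause_embeddings(clauses, vocab):
    """Bag-of-words clause vectors over a fixed vocabulary (toy stand-in
    for learned clause semantics)."""
    mat = np.zeros((len(clauses), len(vocab)))
    for i, clause in enumerate(clauses):
        for word in clause.lower().split():
            if word in vocab:
                mat[i, vocab[word]] += 1.0
    return mat

def relationship_aware_enhance(emb):
    """Enhance each clause vector with a relevance-weighted sum of the others,
    a simple stand-in for the global relationship-aware mechanisms."""
    norms = np.linalg.norm(emb, axis=1, keepdims=True) + 1e-8
    sim = (emb / norms) @ (emb / norms).T          # cosine relevance between clauses
    np.fill_diagonal(sim, 0.0)                     # exclude self-relevance
    weights = np.exp(sim) / np.exp(sim).sum(axis=1, keepdims=True)
    return emb + weights @ emb                     # residual enhancement

problem = "Tom has 3 apples, he buys 2 more apples, how many apples does Tom have?"
clauses = split_into_clauses(problem)
vocab = {w: i for i, w in enumerate(sorted({w for c in clauses for w in c.lower().split()}))}
emb = clause_embeddings(clauses, vocab)
enhanced = relationship_aware_enhance(emb)
```

The enhanced clause representations would then feed a tree-based decoder; that decoding step is omitted here.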
Exploring the Brain-like Properties of Deep Neural Networks: A Neural Encoding Perspective
Qiongyi Zhou, Changde Du, Huiguang He
2022, vol. 19, no. 5, pp. 439-455, doi: 10.1007/s11633-022-1348-x
Nowadays, deep neural networks (DNNs) have been equipped with powerful representation capabilities. Deep convolutional neural networks (CNNs), which draw inspiration from the visual processing mechanism of the primate early visual cortex, have outperformed humans on object categorization and have been found to possess many brain-like properties. Recently, vision transformers (ViTs) have emerged as a striking new paradigm of DNNs and have achieved remarkable improvements on many vision tasks compared to CNNs. It is natural to ask how brain-like ViTs are. Beyond the model paradigm, we are also interested in the effects of factors such as model size, multimodality, and temporality on the ability of networks to model the human visual pathway, especially considering that existing research has been limited to CNNs. In this paper, we systematically evaluate the brain-like properties of 30 kinds of computer vision models, varying from CNNs and ViTs to their hybrids, from the perspective of explaining brain activities of the human visual cortex triggered by dynamic stimuli. Experiments on two neural datasets demonstrate that neither CNNs nor transformers are the optimal model paradigm for modelling the human visual pathway. ViTs reveal hierarchical correspondences to the visual pathway as CNNs do. Moreover, we find that multi-modal and temporal networks can better explain the neural activities of large parts of the visual cortex, whereas a larger model size is not a sufficient condition for bridging the gap between human vision and artificial networks. Our study sheds light on the design principles for more brain-like networks. The code is available at https://github.com/QYiZhou/LWNeuralEncoding.
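The neural-encoding evaluation the abstract describes, fitting a linear model from a network layer's features to recorded brain responses and scoring held-out correlation, can be sketched as follows. The data here are synthetic stand-ins and the ridge-regression setup is a common choice in encoding studies, not necessarily the paper's exact protocol:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: features from one DNN layer for 200 stimuli,
# and responses of 50 voxels (in reality, fMRI or neural recordings).
n_stim, n_feat, n_vox = 200, 64, 50
features = rng.standard_normal((n_stim, n_feat))
true_w = rng.standard_normal((n_feat, n_vox))
responses = features @ true_w + 0.5 * rng.standard_normal((n_stim, n_vox))

# Split stimuli into train and held-out test sets.
train, test = slice(0, 150), slice(150, 200)

# Ridge-regression encoding model: predict each voxel from layer features.
lam = 1.0
X, Y = features[train], responses[train]
w = np.linalg.solve(X.T @ X + lam * np.eye(n_feat), X.T @ Y)
pred = features[test] @ w

def pearson_cols(a, b):
    """Column-wise Pearson correlation between two matrices."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    return (a * b).sum(axis=0) / (np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0))

# Brain-likeness score for this layer: mean correlation across voxels.
score = pearson_cols(pred, responses[test]).mean()
```

Repeating this per layer and per model yields the hierarchical correspondence comparisons between CNNs, ViTs, and their hybrids.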
Denoised Internal Models: A Brain-inspired Autoencoder Against Adversarial Attacks
Kai-Yuan Liu, Xing-Yu Li, Yu-Rui Lai, Hang Su, Jia-Chen Wang, Chun-Xu Guo, Hong Xie, Ji-Song Guan, Yi Zhou
2022, vol. 19, no. 5, pp. 456-471, doi: 10.1007/s11633-022-1375-7
Despite its great success, deep learning severely suffers from a lack of robustness; i.e., deep neural networks are very vulnerable to adversarial attacks, even the simplest ones. Inspired by recent advances in brain science, we propose the denoised internal models (DIM), a novel generative autoencoder-based model to tackle this challenge. Simulating the pipeline in the human brain for visual signal processing, DIM adopts a two-stage approach. In the first stage, DIM uses a denoiser to reduce the noise and the dimensions of inputs, reflecting the information pre-processing in the thalamus. Inspired by the sparse coding of memory-related traces in the primary visual cortex, the second stage produces a set of internal models, one for each category. We evaluate DIM over 42 adversarial attacks, showing that DIM effectively defends against all the attacks and outperforms the state of the art (SOTA) in overall robustness on the MNIST (Modified National Institute of Standards and Technology) dataset.
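The two-stage structure described above, a shared denoiser followed by one internal model per category, can be sketched with linear subspace (PCA) models standing in for the paper's generative autoencoders. Everything below (the synthetic data, the PCA denoiser, the reconstruction-error classifier) is an illustrative assumption, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_pca(X, k):
    """Return the mean and top-k principal directions of data X."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k]

def reconstruct(x, mu, basis):
    """Project x onto the subspace and map it back."""
    return mu + (x - mu) @ basis.T @ basis

# Synthetic two-class "images": each class clusters around its own template.
d, n = 100, 200
templates = rng.standard_normal((2, d)) * 3
X = np.vstack([t + rng.standard_normal((n, d)) for t in templates])
y = np.repeat([0, 1], n)

# Stage 1 (thalamus-like denoiser): a shared PCA that cuts noise and dimension.
mu_d, basis_d = fit_pca(X, k=10)
Z = (X - mu_d) @ basis_d.T

# Stage 2 (cortex-like internal models): one low-dimensional model per category.
models = [fit_pca(Z[y == c], k=3) for c in (0, 1)]

def classify(z):
    """Assign the category whose internal model reconstructs z best."""
    errors = [np.linalg.norm(z - reconstruct(z, mu, b)) for mu, b in models]
    return int(np.argmin(errors))

preds = np.array([classify(z) for z in Z])
accuracy = (preds == y).mean()
```

The intuition carried over from the paper is that adversarial perturbations that fool a single discriminative boundary are less likely to fool every category-specific generative model at once.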
EEG-based Emotion Recognition Using Multiple Kernel Learning
Qian Cai, Guo-Chong Cui, Hai-Xian Wang
2022, vol. 19, no. 5, pp. 472-484, doi: 10.1007/s11633-022-1352-1
Emotion recognition based on electroencephalography (EEG) has a wide range of applications and great potential value, so it has received increasing attention from academia and industry in recent years. Meanwhile, multiple kernel learning (MKL) has also been favored by researchers for its data-driven convenience and high accuracy. However, there is little research on MKL in EEG-based emotion recognition. Therefore, this paper is dedicated to exploring and promoting the application of MKL methods in the field of EEG emotion recognition. We propose a support vector machine (SVM) classifier based on the MKL algorithm EasyMKL to investigate the feasibility of MKL algorithms in EEG-based emotion recognition problems. We designed two data partition methods: random division to verify the validity of the MKL method, and sequential division to simulate practical applications. Then, tri-categorization experiments were performed for neutral, negative and positive emotions based on a commonly used dataset, the Shanghai Jiao Tong University emotional EEG dataset (SEED). The average classification accuracies for random division and sequential division were 92.25% and 74.37%, respectively, which shows better classification performance than the traditional single-kernel SVM. The final results show that the MKL method is clearly effective, and its application in EEG emotion recognition is worthy of further study. Through the analysis of the experimental results, we discovered that simple mathematical operations on the features of symmetrical electrodes could not effectively integrate the spatial information of the EEG signals to obtain better performance. It is also confirmed that higher frequency band information is more correlated with emotional state and contributes more to emotion recognition.
In summary, this paper explores research on MKL methods in the field of EEG emotion recognition and provides a new way of thinking for EEG-based emotion recognition research.
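The core idea of MKL here, building one base kernel per EEG frequency band and combining them with learned weights before classification, can be sketched as follows. The sketch uses kernel-target alignment to set the weights as a simpler stand-in for EasyMKL, a kernel nearest-class-mean rule instead of an SVM, and synthetic band-power features; all of these are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(X, gamma):
    """RBF (Gaussian) kernel matrix over the rows of X."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

# Synthetic band-power features for two emotion classes; one feature block
# per frequency band (e.g. theta/alpha/beta/gamma), 40 trials per class.
n, bands = 40, 4
y = np.repeat([-1.0, 1.0], n)
blocks = []
for b in range(bands):
    sep = 0.5 * (b + 1)            # higher bands made more discriminative
    Xb = rng.standard_normal((2 * n, 8))
    Xb[y > 0] += sep
    blocks.append(Xb)

# One base kernel per frequency band.
kernels = [rbf_kernel(Xb, gamma=0.05) for Xb in blocks]

# Kernel-target alignment weights: kernels that match the label structure
# get larger weight (a simple stand-in for EasyMKL's weighting).
target = np.outer(y, y)
align = np.array([np.sum(K * target) / np.linalg.norm(K) for K in kernels])
weights = np.clip(align, 0, None)
weights /= weights.sum()
K = sum(w * Kb for w, Kb in zip(weights, kernels))

# Classify with a kernel nearest-class-mean rule on the combined kernel.
def decision(i):
    return K[i, y > 0].mean() - K[i, y < 0].mean()

preds = np.sign([decision(i) for i in range(2 * n)])
accuracy = (preds == y).mean()
```

With this construction, the higher-band kernels receive larger weights, echoing the abstract's finding that higher frequency bands contribute more to emotion recognition.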