Vision plays a central role in intelligence. Visual information, which forms a large part of all sensory input, is fed into the human brain to support the many types of cognition and behaviour that make humans intelligent agents. Recent advances have led to the development of brain-inspired algorithms and models for machine vision, a key component of which is the use of the computational principles underlying biological neurons. In parallel, advanced experimental neuroscience techniques now generate many types of neural signals that carry essential visual information. There is therefore a high demand for functional models that read out visual information from neural signals. Here, we briefly review recent progress on this issue, focusing on how machine learning techniques can help develop models that handle various types of neural signals, from fine-scale neural spikes and single-cell calcium imaging to coarse-scale electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) recordings of brain activity.
Every day, various types of sensory information from the external environment are transferred to the brain through different modalities and processed to generate appropriate behaviours. Among these perceptual modalities, vision is arguably the dominant contributor to the interactions between the external environment and the brain: approximately 70 percent of human perceptual information is derived from vision, far more than from the auditory, tactile, and other sensory systems combined. The visual system is the part of the central nervous system required for visual perception; it processes and interprets visual information to build a representation of the visual environment. It consists of the eye (including the retina), the nerve fibres that conduct visual information to the thalamus, the superior colliculus, and parts of the cerebral cortex.
Today, researchers can collect neural signals from different parts of the visual system, such as the retina, the lateral geniculate nucleus (LGN), and the primary visual cortex (V1), using different recording modalities, e.g., spikes, electroencephalography (EEG), and functional magnetic resonance imaging (fMRI). Depending on the recording device, these modalities differ in their invasiveness, spatial scale, and precision.
Neural coding is an important topic for understanding how the brain processes stimuli from the environment. Its counterpart, neural decoding, aims to read out the information embedded in various types of neural signals.
As for vision, understanding how neurons perceive and respond to rich natural visual input is a major topic of neural encoding, whereas the goal of neural decoding is to restore the original visual stimulus from neural responses as faithfully as possible, as shown in Fig. 1. Decoding is also critical to the development of artificial vision in brain–computer interfaces and virtual-reality devices.
Much effort has been devoted over recent decades to the mechanisms underlying neural decoding in the visual pathway. These approaches can be roughly divided into three categories according to the decoding task: 1) visual stimulus classification, in which a specific stimulus is assigned to the best-matched image set; 2) visual stimulus identification, in which the stimulus is identified as a specific visual object; and 3) visual stimulus reconstruction, in which the visual stimulus is reconstructed from the corresponding neural responses.
Most decoding approaches have relied on linear methods because of their interpretability and computational efficiency. Although linear decoders can recover spatially uniform white-noise stimuli and the coarse structure of natural scenes from neural responses, recovering the fine visual details of naturalistic images is difficult for such methods.
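As an illustrative sketch of the linear approach (all data and variable names below are synthetic inventions for this example, not drawn from any study cited in the review), a linear decoder can be fit as a ridge regression from a population response matrix back to stimulus pixels:

```python
import numpy as np

# Hedged sketch of linear stimulus reconstruction on synthetic data:
# fit a ridge-regularised linear map from a response matrix R
# (trials x neurons) back to stimulus pixels S (trials x pixels).
rng = np.random.default_rng(0)
n_trials, n_neurons, n_pixels = 500, 100, 8 * 8

# Synthetic stimuli and an assumed noisy linear encoding model.
S = rng.standard_normal((n_trials, n_pixels))            # stimuli
W_enc = rng.standard_normal((n_pixels, n_neurons))       # hypothetical encoding weights
R = S @ W_enc + 0.5 * rng.standard_normal((n_trials, n_neurons))  # responses

# Ridge decoder in closed form: W = (R^T R + lam I)^-1 R^T S.
lam = 1.0
W = np.linalg.solve(R.T @ R + lam * np.eye(n_neurons), R.T @ S)
S_hat = R @ W  # reconstructed stimuli

# Pixel-wise correlation between true and decoded stimuli.
corr = np.corrcoef(S.ravel(), S_hat.ravel())[0, 1]
print(f"pixel-wise correlation: {corr:.2f}")
```

With more neurons than pixels and a roughly linear encoding, this decoder recovers the stimuli well; under the conditions the review describes (fine naturalistic detail, fewer effective measurements), the same closed-form decoder degrades, which motivates the nonlinear methods discussed next.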
More recent decoders employ nonlinear methods for fine decoding of complex visual stimuli. For instance, optimal Bayesian decoding has been applied to white-noise stimuli, but it generalizes poorly to large neural populations. For natural scene structure, key prior information has been used to perform computationally expensive approximations to Bayesian inference. Some researchers have combined linear and nonlinear approaches to generate coarse reconstructions of natural stimuli from calcium imaging data. In addition, many researchers have begun to apply deep learning techniques to visual neural decoding, leading to notable advances in artificial vision.
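To make the nonlinear route concrete in a hedged way, the sketch below uses a random-feature decoder, one simple nonlinear technique chosen for illustration and not any of the specific methods cited above, on synthetic data with a rectified (sign-discarding) encoding; all names and numbers are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n_train, n_test, n_neurons, n_pixels = 800, 200, 80, 36

# Synthetic nonlinear encoding: rectification discards sign information,
# which limits purely linear decoders.
S = rng.standard_normal((n_train + n_test, n_pixels))
W_enc = rng.standard_normal((n_pixels, n_neurons)) / np.sqrt(n_pixels)
R = np.maximum(S @ W_enc, 0.0)  # rectified "neural responses"

# Nonlinear decoder: fixed random ReLU features + closed-form ridge readout.
n_feat, lam = 400, 1.0
W_rand = rng.standard_normal((n_neurons, n_feat)) / np.sqrt(n_neurons)
b_rand = 0.1 * rng.standard_normal(n_feat)

def features(X):
    """Expand responses through fixed random ReLU features."""
    return np.maximum(X @ W_rand + b_rand, 0.0)

F_train = features(R[:n_train])
W_out = np.linalg.solve(F_train.T @ F_train + lam * np.eye(n_feat),
                        F_train.T @ S[:n_train])

# Evaluate reconstruction on held-out trials.
S_hat = features(R[n_train:]) @ W_out
corr = np.corrcoef(S[n_train:].ravel(), S_hat.ravel())[0, 1]
print(f"held-out pixel correlation: {corr:.2f}")
```

The design point is that the nonlinearity lives in the fixed feature expansion while the readout stays linear and closed-form; deep-learning decoders instead learn the nonlinear features themselves, at a higher computational cost.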
Visual neural decoding is a significant problem whose solution can advance both engineering applications, such as brain–machine interfaces, and a more holistic understanding of the brain in neuroscience. Given the rapid development of related techniques, there is a strong demand for a comprehensive and up-to-date review of this field.
This review traces the research evolution of visual neural decoding. It introduces the various neural recording modalities, with particular attention to emerging calcium imaging data, and summarizes the advantages and disadvantages of different neural decoding methods. In addition, open resources, including public neural datasets and software toolkits, are provided for the convenience of neural decoding research. Finally, it concludes with open challenges and future directions. The review aims to serve as an inspiration to neuroscience and multidisciplinary researchers seeking to understand the state of the art and the current problems in neural decoding, especially regarding the development of artificial intelligence and brain-like vision systems.
Download full text: Neural Decoding of Visual Information Across Different Neural Recording Modalities and Approaches