Bestowing Human Features on Machines: From Vision to Imagination

Summaries and recommendations of peer-reviewed papers that discuss various aspects of machine learning.

1. Deep Learning Models of the Retinal Response to Natural Scenes (McIntosh et al. 2016)

Source: https://arxiv.org/abs/1702.01825

Summary:
In this article, convolutional neural networks (CNNs) were trained to predict the responses of retinal ganglion cells to two kinds of visual stimuli, white noise and natural scenes, and their performance was evaluated. Compared with linear-nonlinear (LN) models and generalized linear models (GLMs), CNNs came closest to reproducing the activity of real retinal cells. Most importantly, CNNs also displayed several properties that closely mimic retinal mechanisms: a richer set of feature maps was required to respond to natural scenes than to white noise; the stages of information processing were consistent with those in real retinal cells; and augmenting the CNNs with latent noise and recurrent lateral connections further reproduced the spiking variability and contrast adaptation observed in retinal activity.
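To make the LN baseline concrete, here is a minimal sketch of a linear-nonlinear model in plain Python: a temporal filter is applied to the recent stimulus history, and a static nonlinearity turns the result into a non-negative firing rate. The function name, the kernel values, and the choice of softplus are illustrative assumptions, not details taken from the paper. A CNN can be thought of as stacking several such filter-plus-nonlinearity stages.

```python
import math

def ln_response(stimulus, kernel, bias=0.0):
    """Predicted firing rate of a toy linear-nonlinear (LN) model.

    stimulus: list of stimulus intensities over time
    kernel:   temporal filter; kernel[0] weights the most recent sample
    """
    k = len(kernel)
    rates = []
    for t in range(k - 1, len(stimulus)):
        window = stimulus[t - k + 1 : t + 1]
        # Linear stage: weighted sum over the recent stimulus history
        drive = sum(w * s for w, s in zip(kernel, reversed(window))) + bias
        # Nonlinear stage: softplus keeps the predicted rate non-negative
        rates.append(math.log1p(math.exp(drive)))
    return rates

# Hypothetical example: a brief flash and a biphasic filter
rates = ln_response([0, 0, 1, 0, 0], kernel=[1.0, -0.5])
```

In this toy example the predicted rate rises when the flash arrives and dips below baseline one step later, which is the behavior the negative second kernel weight is meant to produce.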

Recommending reason:
Currently, much of our understanding of visual mechanisms comes from studies that experiment with artificial stimuli such as white noise and then generalize the model to natural stimuli. However, the large difference between artificial and natural stimuli makes the process of generalization opaque. The strong performance of CNNs in approximating the actual activity of retinal ganglion cells demonstrates the potential of the model to precisely capture sensory circuit responses and to provide information about the inner structure and function of the circuit, which would greatly benefit scientists working in both computer vision and neuroscience.

2. Multi-Region Neural Representation: A novel model for decoding visual stimuli in human brains (Yousefnezhad and Zhang 2016)

Source: https://arxiv.org/abs/1612.08392

Summary:
Multivariate Pattern (MVP) classification techniques applied to task-based fMRI data sets have huge potential for decoding visual stimuli from brain activity. Most state-of-the-art methods use time series of fMRI signals and manually selected regions of interest (ROIs), which leads to problems of noise and sparsity in experimental results. The authors of this article propose a novel model that improves on currently available MVP techniques by automatically detecting ROIs and analyzing the snapshots of brain images with the maximum activity level. They also work on visualizing the analysis results and reducing noise.
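The idea of replacing a full time series with its most active snapshot can be sketched as follows. This is only an illustration under stated assumptions: each volume is represented as a flat list of voxel values, and "activity level" is approximated by the sum of absolute voxel values. The function name and the activity measure are hypothetical, and the paper's own selection criterion may well differ.

```python
def select_snapshot(volumes):
    """Return (time_index, volume) for the volume with the highest activity.

    volumes: list of fMRI volumes, each a flat list of voxel values
    """
    def activity(volume):
        # Stand-in activity measure: total absolute voxel intensity
        return sum(abs(v) for v in volume)

    best_t = max(range(len(volumes)), key=lambda t: activity(volumes[t]))
    return best_t, volumes[best_t]

# Hypothetical three-volume run with two voxels each
t, snapshot = select_snapshot([[0.1, 0.2], [0.9, -1.0], [0.3, 0.3]])
```

Working with a single snapshot per stimulus, instead of the whole time series, is what keeps the downstream feature space small.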

Recommending reason:
For most neuroscientists without training in computer science, machine learning techniques such as MVP are not easy to understand or implement. The large amount of data required can be another problem, since most neuroimaging techniques are very expensive. The improvements suggested by this article could make MVP classification results easier to relate to cognitive states. Different fMRI data sets could also be combined through the proposed procedures, reducing the cost of the brain studies needed. If these advances are realized, the use of machine learning in neuroscience would be further encouraged.

3. Deep driven fMRI decoding of visual categories (Svanera et al. 2017)

Source: https://arxiv.org/abs/1701.02133

Summary:
The authors of this research paper propose a novel decoding model that links fMRI data, recorded while subjects watch videos, to video features extracted with a faster R-CNN framework, by means of Kernel Canonical Correlation Analysis. The model allows for a linear (or approximately linear) relationship between the fMRI representation and the last layer of the CNN (fc7). In this way, the discriminative power of the CNN can be used without having to learn multiple levels of representation from brain data, which is not feasible given the limited brain data available so far.
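For readers unfamiliar with Kernel Canonical Correlation Analysis, its standard textbook objective can be written as below. Here $K_x$ and $K_y$ denote kernel (Gram) matrices computed from the two views, which in this setting would be the fMRI data and the fc7 features, and $\alpha$, $\beta$ are the coefficient vectors being learned. The symbols are our own notation, and the paper's exact formulation may differ.

```latex
\max_{\alpha,\beta}\ \rho
  = \frac{\alpha^{\top} K_x K_y \beta}
         {\sqrt{\alpha^{\top} K_x^{2}\, \alpha}\;\sqrt{\beta^{\top} K_y^{2}\, \beta}}
```

In practice a regularization term is usually added to each factor in the denominator, since the unregularized problem can admit trivial, perfectly correlated solutions.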

Recommending reason:
This article involves the techniques discussed in the two papers above: CNNs and MVP analysis, two hot topics in computer vision. The novelty of the model is that, instead of working on the inner mechanism of the visual pathway, which still needs much effort to reveal, it directly uses the discriminative function of the system and establishes a relationship between fMRI data points and the processed video features, which for most scientists is a significantly better way to visualize the decoding of visual processes with fMRI.

4. Morphognosis: the shape of knowledge in space and time (Portegys 2017)

Source: https://arxiv.org/abs/1701.02272

Summary:
In this paper, the author introduces a model of morphognosis, or the shape of knowledge. The basic structure of the model is a pyramid of event recordings, with space on the x-axis and time on the y-axis, so that the apex of the pyramid holds the most recent and nearby events. Implementations of morphognosis with artificial neural networks in both food foraging and Pong game simulations showed positive results.

Recommending reason:
Visual representation of knowledge is an interesting topic. Our current understanding of how knowledge is formed and stored does not let us intuitively interpret knowledge as a form. This model might be oversimplified: it considers only the factors of time and space, which are objective, whereas knowledge can be subjective as well. Nevertheless, the positive results from the food foraging and Pong game simulations demonstrate the potential of this model.

5. Converting Cascade-Correlation Neural Nets into Probabilistic Generative Models (Nobandegani and Shultz 2017)

Source: https://arxiv.org/abs/1701.05004

Summary:
Based on observations of the human thinking process, the authors of this paper worked on transforming Cascade-Correlation Neural Networks (CCNNs), a class of discriminative NNs that can successfully account for several psychological phenomena. Through a Markov Chain Monte Carlo (MCMC) method, which directs exploration toward regions of high probability, CCNNs can be converted into probabilistic generative models that generate samples with a high probability of occurring. Extensive simulations have shown the efficacy of the transformation.
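The general idea of using MCMC to draw samples from a discriminative model can be illustrated with a simple random-walk Metropolis sampler. This is a sketch under stated assumptions and not the paper's algorithm: `network_score` is a hypothetical stand-in for a trained network's output, and the step size, burn-in, and one-dimensional input are chosen only for illustration.

```python
import math
import random

def network_score(x):
    """Hypothetical stand-in for a trained network's output in (0, 1].

    It peaks at x = 2.0, so inputs near 2.0 count as "high probability".
    """
    return math.exp(-((x - 2.0) ** 2) / (2 * 0.5 ** 2))

def metropolis_samples(score, n_samples, start=0.0, step=0.5, burn_in=500, seed=0):
    """Random-walk Metropolis: sample inputs with density proportional to score."""
    rng = random.Random(seed)
    x = start
    samples = []
    for i in range(n_samples + burn_in):
        proposal = x + rng.gauss(0.0, step)
        # Accept with probability min(1, score(proposal) / score(x))
        if rng.random() < score(proposal) / score(x):
            x = proposal
        # Discard the early iterations while the chain moves toward the peak
        if i >= burn_in:
            samples.append(x)
    return samples

samples = metropolis_samples(network_score, n_samples=5000)
mean = sum(samples) / len(samples)
```

After burn-in, the chain spends most of its time near the input the stand-in network scores highest, which is the sense in which a discriminative model can be made to generate plausible samples.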

Recommending reason:
The work in this article aims to bestow on machines the ability to imagine. It is amazing that, while implementing the sensory functions and mechanisms of the human brain in computers, scientists have also been working on bestowing important human-like or organism-like features such as knowledge and imagination on machines. Whether converting CCNNs into probabilistic generative models would produce a realistic “ability to imagine” is not conclusive, since far more factors contribute to imagination than probabilities alone, but it is definitely a good place to start.

Analyst: Yuka Liu | Localized by Synced Global Team : Xiang Chen
