Representation learning summarizes the essential features of high-dimensional data and turns them into lower-dimensional representations with desirable properties. A popular heuristic approach fits a neural network that maps the high-dimensional data to a set of labels and takes the top layer of the network as the representation of the inputs.
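For readers who want to see the heuristic concretely, here is a minimal NumPy sketch. Everything in it is an illustrative assumption rather than anything from the paper: the synthetic data, the one-hidden-layer network, and the hypothetical helper names `train_mlp` and `represent`. It fits a small classifier to labels and then reads off the hidden-layer activations as the learned representation.

```python
import numpy as np

def train_mlp(X, y, hidden=4, lr=0.5, steps=2000, seed=0):
    """Fit a one-hidden-layer binary classifier with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(scale=0.5, size=(d, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    for _ in range(steps):
        H = np.tanh(X @ W1 + b1)                   # top hidden layer
        p = 1.0 / (1.0 + np.exp(-(H @ w2 + b2)))   # predicted P(y = 1)
        g = (p - y) / n                            # gradient of mean log-loss w.r.t. logits
        gH = np.outer(g, w2) * (1.0 - H ** 2)      # backpropagate through tanh
        w2 -= lr * (H.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (X.T @ gH)
        b1 -= lr * gH.sum(axis=0)
    return W1, b1, w2, b2

def represent(X, W1, b1):
    """Use the top hidden layer as the low-dimensional representation."""
    return np.tanh(X @ W1 + b1)

# Synthetic data (assumption): 10 input dimensions, label depends on the first two.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

W1, b1, w2, b2 = train_mlp(X, y)
Z = represent(X, W1, b1)                           # 4-dimensional representation
p = 1.0 / (1.0 + np.exp(-(Z @ w2 + b2)))
acc = np.mean((p > 0.5).astype(float) == y)        # training accuracy
```

Nothing in this procedure constrains what the four dimensions of `Z` encode, which is the gap the desiderata discussed in this article aim to address.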
However, such heuristic approaches often capture spurious features that do not transfer well, or find entangled dimensions that are uninterpretable. And while non-spuriousness and disentanglement are natural desiderata for representations, they are difficult to evaluate and to optimize algorithmically.
To address this issue, a new study by UC Berkeley researchers Yixin Wang and Michael I. Jordan takes a causal perspective on representation learning, which enables the desiderata of non-spuriousness, efficiency and disentanglement to be formalized using causal notions.
The work focuses on two sets of desiderata in representation learning: 1) efficiency and non-spuriousness in supervised representation learning; 2) disentanglement in unsupervised representation learning.
In the supervised setting, the main idea is to treat representation learning as the capturing of features that are potential causes of a given label. Under this perspective, a non-spurious representation should capture features that are significant causes of the label, so that the representation efficiently captures the non-spurious features of the data.
To obtain calculable metrics for these desiderata under a supervised setting, the researchers developed causal identification strategies via their proposed CAUSAL-REP algorithmic framework, producing calculable efficiency and non-spuriousness metrics in the high-dimensional setting.
In the unsupervised setting, the team views disentangled representations as enabling the generation of new examples of objects by separating dimensions that correspond to different encoded features.
To obtain an operational measure of disentanglement, the team showed that the absence of causal connections among the features captured by different dimensions of a representation implies that those dimensions must have independent support in observational data, an implication that can be checked on the support of the learned representation. Building on this observation, they leveraged the connection between disentanglement and independent support to develop an independence-of-support score (IOSS) that serves as an unsupervised disentanglement metric.
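To give a feel for what independent support means, the sketch below uses a crude histogram-based proxy. It is not the paper's IOSS definition: the function name `support_independence_gap`, the binning scheme and the toy data are all assumptions made for illustration. The idea is that the dimensions have independent support when the joint support equals the Cartesian product of the per-dimension supports, so the sketch measures the fraction of that product grid the joint sample never visits.

```python
import numpy as np

def support_independence_gap(Z, bins=8):
    """Fraction of the product-of-marginal-supports grid never visited by the sample.

    0.0 means every combination of occupied per-dimension bins is observed;
    values near 1.0 mean the joint support is far smaller than the product.
    """
    Z = np.asarray(Z, dtype=float)
    n, d = Z.shape
    idx = np.empty((n, d), dtype=int)
    n_product = 1
    for j in range(d):
        edges = np.linspace(Z[:, j].min(), Z[:, j].max(), bins + 1)
        # Interior edges only, so bin indices fall in 0..bins-1.
        idx[:, j] = np.digitize(Z[:, j], edges[1:-1])
        n_product *= len(np.unique(idx[:, j]))
    n_joint = len(np.unique(idx, axis=0))          # occupied joint cells
    return 1.0 - n_joint / n_product

# Toy data (assumption): two independent uniform factors a and b.
rng = np.random.default_rng(0)
a = rng.uniform(-1, 1, size=5000)
b = rng.uniform(-1, 1, size=5000)

Z_separate = np.column_stack([a, b])               # each dimension holds one factor
Z_entangled = np.column_stack([a, a + 0.1 * b])    # second dimension mostly repeats the first

gap_separate = support_independence_gap(Z_separate)
gap_entangled = support_independence_gap(Z_entangled)
```

On this toy data the first representation fills the whole square, so its gap is close to zero, while the second occupies only a thin diagonal band and leaves most of the product grid empty. A binned count like this scales poorly with the number of dimensions and is meant only to illustrate the concept.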
Through its exploration of non-spuriousness, efficiency and disentanglement, the study shows that representation learning desiderata can be formalized using causal notions. Moreover, the workflow of desiderata → causal definitions → observable implications → metrics and algorithms can lead to practical evaluation metrics and representation learning algorithms.
The paper Desiderata for Representation Learning: A Causal Perspective is on arXiv.
Author: Hecate He | Editor: Michael Sarazen, Chain Zhang