Graph neural networks (GNNs) are connectionist models that capture dependencies in graphs by passing messages between nodes. Such message passing schemes, however, do not enable GNNs to capture topological structures in graphs, so they overlook information that could be useful for multi-scale representations of the shape of complex structured and unstructured datasets.
In a new paper, a research team from ETH Zurich, SIB Swiss Institute of Bioinformatics and KU Leuven proposes Topological Graph Layer (TOGL), a novel GNN layer that is capable of leveraging the multi-scale topological information of input graphs.
The TOGL layer is differentiable and capable of learning topological representations, making it more expressive than standard message-passing GNNs. Moreover, TOGL can be easily integrated into any type of GNN, making it “topology-aware” and increasing its expressivity in graph learning tasks.
The researchers employ persistent homology to calculate the topological features of structured datasets. These topological features are known to be highly characteristic, which underpins the success of topology-driven machine learning approaches. At the core of the approach is a filtration, essentially a sequence of nested subgraphs that can capture topological features at multiple scales.
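To make the idea of a filtration more concrete, here is a minimal sketch of 0-dimensional persistent homology on a graph. It is an illustration under stated assumptions, not the authors' implementation: the function name `zero_dim_persistence` is hypothetical, each edge is assumed to enter the filtration at the larger of its two endpoint values, and cycles are only recorded by the value at which they appear.

```python
import math


def zero_dim_persistence(values, edges):
    """Sketch: persistence pairs of connected components in a vertex filtration.

    values: one filtration value per vertex (hypothetical input format).
    edges:  list of (u, v) index pairs.
    Returns (sorted (birth, death) pairs, sorted cycle birth values).
    """
    parent = list(range(len(values)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    cycle_births = []
    # Assumption: an edge appears once both of its endpoints are present.
    for u, v in sorted(edges, key=lambda e: max(values[e[0]], values[e[1]])):
        w = max(values[u], values[v])
        ru, rv = find(u), find(v)
        if ru == rv:
            # The edge closes a cycle; in a graph such a feature never dies.
            cycle_births.append(w)
            continue
        # Elder rule: the component that was born later dies at this edge.
        if values[ru] > values[rv]:
            ru, rv = rv, ru
        pairs.append((values[rv], w))
        parent[rv] = ru

    # Components that are never merged persist forever.
    roots = {find(x) for x in range(len(values))}
    pairs.extend((values[r], math.inf) for r in roots)
    return sorted(pairs), sorted(cycle_births)


# A triangle whose vertices enter the filtration at values 0, 1 and 2.
pairs, cycles = zero_dim_persistence([0.0, 1.0, 2.0], [(0, 1), (1, 2), (0, 2)])
```

Each returned pair records the scale at which a connected component appears and the scale at which it merges into an older one, which is the kind of multi-scale summary a persistence diagram encodes.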
The proposed TOGL layer takes as input a graph G = (V,E), where V represents the nodes and E the edges, along with a set of d-dimensional node attributes. The node attributes can be either node features of a dataset or hidden representations learned by a GNN. The approach employs a family of k vertex filtration functions to calculate a set of persistence diagrams, then uses an embedding function to map the persistence diagrams into a high-dimensional space, from which the vertex representations are obtained. In this way, the resulting topological features become node features, enabling TOGL to be integrated into arbitrary GNNs.
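The pipeline described above can be sketched roughly as follows. This is a simplified NumPy illustration, not the paper's code: `vertex_persistence`, `togl_sketch`, `W_filt` and `W_embed` are hypothetical names, the k filtration functions and the embedding are reduced to single linear maps, only 0-dimensional features are used, components that never merge are assumed to die at the maximum filtration value, and the result is added back to the input as a residual.

```python
import numpy as np


def vertex_persistence(f, edges):
    """Per-vertex (birth, death) tuples for 0-dimensional persistence.

    f: array of shape (n,) with one filtration value per vertex.
    """
    n = len(f)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Assumption: components that never merge are assigned the maximum value.
    death = np.full(n, f.max(), dtype=float)
    for u, v in sorted(edges, key=lambda e: max(f[e[0]], f[e[1]])):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # cycle-creating edge, ignored in this sketch
        if f[ru] > f[rv]:
            ru, rv = rv, ru
        death[rv] = max(f[u], f[v])  # the younger component dies here
        parent[rv] = ru
    return np.stack([f, death], axis=1)  # shape (n, 2)


def togl_sketch(x, edges, W_filt, W_embed):
    """x: (n, d) node attributes; W_filt: (d, k); W_embed: (2k, d)."""
    filt = x @ W_filt  # k filtration values per vertex, shape (n, k)
    diagrams = [vertex_persistence(filt[:, i], edges)
                for i in range(filt.shape[1])]
    topo = np.concatenate(diagrams, axis=1)  # shape (n, 2k)
    # Embed the persistence tuples and add them to the node attributes.
    return x + np.maximum(topo @ W_embed, 0.0)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))            # 4 nodes, d = 3
edges = [(0, 1), (1, 2), (2, 3)]       # a path graph
out = togl_sketch(x, edges, rng.normal(size=(3, 2)), rng.normal(size=(4, 3)))
```

Because the output has the same shape as the input node attributes, a layer of this form can sit between ordinary GNN layers. In a trainable version the two weight matrices would be learned, which this sketch does not attempt.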
The researchers compared their method against several baselines: Graph Convolutional Networks (GCN), Graph Attention Networks (GAT), Gated-GCN, Graph Isomorphism Networks (GIN) and the Weisfeiler–Lehman kernel (WL). They also generated two synthetic balanced 2-class datasets: Cycles and Necklaces.
The experimental results show that TOGL can easily distinguish the two classes of the Cycles dataset, even with only a single GCN layer in addition to the TOGL layer. For the Necklaces data, TOGL performs well regardless of network depth.
TOGL performed on a par with GCN-4 and even achieved a slight improvement for DD and IMDB-BINARY, demonstrating its overall beneficial effect.
To evaluate TOGL performance on larger datasets, the team conducted experiments on the MNIST and CIFAR10 graph classification datasets and CLUSTER and PATTERN node classification datasets generated using Stochastic Block Models (SBM).
TOGL outperformed all GCN baselines run in their setup, supporting the claim that the proposed approach can extract topological information that improves the predictive performance and expressive power of GNNs.
The researchers note that topological information can sometimes lead to overfitting issues on smaller datasets, and suggest further studies could be done on regularisation strategies to address this. They also hypothesise that employing different filtrations and improved persistent homology algorithms that pair all topological features could prove beneficial.
The paper Topological Graph Neural Networks is on arXiv.
Author: Hecate He | Editor: Michael Sarazen