Deep neural networks (DNNs) have achieved great success across a wide variety of tasks, including fundamental unsupervised tasks such as data clustering. Like conventional clustering methods, most deep clustering methods are parametric, meaning they require the number of clusters (or classes) to be specified in advance. Choosing the optimal number of clusters can, however, become very computationally expensive, because every candidate value requires training a relatively complex DNN.
In the new paper DeepDPM: Deep Clustering With an Unknown Number of Clusters, a research team from Ben-Gurion University of the Negev presents DeepDPM, an effective deep nonparametric approach that infers the number of clusters instead of requiring it to be predefined. The proposed method’s performance is comparable to that of leading parametric methods and surpasses existing nonparametric methods, both classical and deep, achieving SOTA results.
DeepDPM is an inference algorithm that enables the number of clusters to be inferred and changed during training. It comprises two parts: 1) a clustering net that generates soft cluster assignments for each input data point, and 2) subclustering nets that take the previously generated soft cluster assignments as inputs and generate soft subcluster assignments. The subcluster assignments are later used to support split and merge decisions, which dynamically change the number of clusters.
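The two-part structure can be illustrated with a minimal sketch. This is not the authors' implementation: the single linear layers, the random weights, the choice of two subclusters per cluster, and all function names are assumptions made for illustration. The sketch also assumes that each point is routed to the subclustering net of its most likely cluster.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Apply one linear layer; weights holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def random_weights(rng, n_out, n_in):
    """Hypothetical stand-in for trained network weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def soft_assignments(weights, x):
    """Soft assignment of point x over the outputs of a one-layer net."""
    return softmax(linear(weights, x))


rng = random.Random(0)
K, dim = 3, 2  # assumed current number of clusters and feature dimension

# Part 1: one clustering net with K outputs.
cluster_net = random_weights(rng, K, dim)
# Part 2: one subclustering net per cluster, each with two outputs (assumed).
subcluster_nets = [random_weights(rng, 2, dim) for _ in range(K)]

data = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(5)]

results = []
for x in data:
    r = soft_assignments(cluster_net, x)            # soft cluster assignment
    k = max(range(K), key=lambda j: r[j])           # most likely cluster
    r_sub = soft_assignments(subcluster_nets[k], x)  # soft subcluster assignment
    results.append((r, k, r_sub))
```

In a sketch like this, a cluster whose points separate cleanly into its two subclusters would be a candidate for a split, which is how subcluster assignments can inform changes to the number of clusters.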
The team also introduces a new loss function, motivated by the expectation–maximization algorithm in Bayesian Gaussian mixture models (EM-GMM), to make DeepDPM more robust and efficient.
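To give a rough idea of what an EM-motivated loss can look like, the sketch below computes E-step responsibilities for a one-dimensional Gaussian mixture and measures how far a network's soft assignment is from them using a KL divergence. It is an illustration under stated assumptions, not the paper's loss: the plain (non-Bayesian) one-dimensional mixture, the direction of the KL term, the example numbers, and all function names are assumptions.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def e_step(x, weights, means, variances):
    """E-step responsibilities: posterior probability of each component given x."""
    joint = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions; eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


# Assumed mixture parameters for two clusters.
weights = [0.5, 0.5]
means = [-2.0, 2.0]
variances = [1.0, 1.0]

x = 1.5
target = e_step(x, weights, means, variances)  # what the E-step would assign
net_output = [0.3, 0.7]                        # hypothetical soft assignment from a net
loss = kl_divergence(net_output, target)       # penalizes disagreement with the E-step
```

Minimizing a term of this kind pushes the network's soft assignments toward the responsibilities that an EM step would produce for the current mixture parameters.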
In their evaluations, the team compared DeepDPM to classical parametric (e.g. K-means and GMM), classical nonparametric (e.g. DBSCAN and moVB), and deep nonparametric (e.g. DCC and AdapVAE) methods on widely used image and text datasets at varying scales.
The evaluations show that DeepDPM almost uniformly achieves the best performance among the compared methods across all datasets, reaching SOTA levels. The researchers also note that DeepDPM is robust to both class imbalance and the initial number of clusters. Because it eliminates the need to repeatedly train deep parametric methods for model selection, it can also significantly reduce resource usage.
The team hopes their study can motivate others working in deep clustering to explore more such nonparametric approaches, and suggests adapting DeepDPM to streaming data as a possible future research avenue.
Author: Hecate He | Editor: Michael Sarazen