AI Machine Learning & Data Science Research

Contrastive Learning Advances Sleep Science: Superior Multi-Modal Model Enhances Disorder Detection

Sleep is a complex physiological process evaluated through various methods that record electrical brain activity, cardiac activity, and respiratory signals. Recent advancements in supervised deep learning have shown promise in automating sleep staging and diagnosing sleep disorders. However, many existing methods fail to fully utilize the extensive unlabeled physiological data available from diverse polysomnography (PSG) sensors.

In a new paper SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals, a research team from Stanford University and Technical University of Denmark introduces SleepFM, the first attempt at developing a multi-modal contrastive learning (CL) approach for PSG analysis, outperforming end-to-end trained convolutional neural networks (CNNs) in tasks like demographic attribute prediction and sleep stage classification.

SleepFM stands out in two significant ways. First, it employs self-supervised representation learning on a large sleep dataset, unlike most prior works that rely on supervised learning. Second, it is the first contrastive model to utilize a wide array of sleep modalities, including Brain Activity Signals (BAS), electrocardiogram (ECG) waveforms, and respiratory signals, encompassing 19 data channels across the brain, heart, and lungs.

The researchers curated a substantial polysomnography dataset from over 14,000 participants, totaling more than 100,000 hours of multi-modal sleep recordings collected at the Stanford Sleep Clinic between 1999 and 2020. They used contrastive learning (CL) as the foundational algorithm for representation learning during the pre-training stage.

Three 1D CNNs, each trained individually, were used to generate separate embeddings from the BAS, ECG, and respiratory modalities. The architecture of these embedding models is based on the EfficientNet design, beginning with atrous convolutions followed by multi-channel 1D convolutions. While the layer count matches the original EfficientNet design, the number of channels is significantly reduced to improve runtime efficiency and model complexity. After the initial atrous layers, the model uses convolutional layers with an inverted residual structure, maintaining input and output bottleneck layers with an intermediate expansion layer.
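The inverted residual structure described above can be sketched in PyTorch as follows. This is a minimal illustration of the general narrow-expand-project pattern, not the authors' released code; the channel counts, expansion factor, and activation choice are our assumptions.

```python
import torch
import torch.nn as nn

class InvertedResidual1D(nn.Module):
    """MBConv-style 1D block: a narrow bottleneck is expanded, filtered
    with a depthwise convolution, then projected back down, with a
    residual connection over the whole block (sketch only; hyperparameters
    here are illustrative, not SleepFM's exact configuration)."""
    def __init__(self, channels, expansion=4, kernel_size=3, dilation=1):
        super().__init__()
        hidden = channels * expansion
        pad = dilation * (kernel_size - 1) // 2
        self.block = nn.Sequential(
            nn.Conv1d(channels, hidden, 1, bias=False),   # expand bottleneck
            nn.BatchNorm1d(hidden),
            nn.SiLU(),
            nn.Conv1d(hidden, hidden, kernel_size, padding=pad,
                      dilation=dilation, groups=hidden, bias=False),  # depthwise
            nn.BatchNorm1d(hidden),
            nn.SiLU(),
            nn.Conv1d(hidden, channels, 1, bias=False),   # project back down
            nn.BatchNorm1d(channels),
        )

    def forward(self, x):
        return x + self.block(x)  # residual over the inverted bottleneck

x = torch.randn(2, 16, 1024)      # (batch, channels, time samples)
y = InvertedResidual1D(16)(x)     # output keeps the input shape
```

Because the depthwise convolution uses `groups=hidden`, the expensive wide layer touches each channel independently, which is what keeps the expanded block cheap.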

The team further explored two CL frameworks for learning joint representations across modalities: pairwise CL and leave-one-out CL. In pairwise CL, contrastive prediction tasks are constructed between all pairs of modalities, using a contrastive loss to promote agreement between positive pairs and discourage agreement between negative pairs. In leave-one-out CL, an embedding from one modality is used to identify the corresponding embeddings from the remaining modalities.
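The leave-one-out objective can be sketched as an InfoNCE-style loss in which each modality's embedding must pick out the aggregate of the other modalities' embeddings for the same clip from among all clips in the batch. This is a simplified sketch under our own assumptions (mean aggregation, cosine similarity, a fixed temperature), not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def leave_one_out_cl_loss(embeddings, temperature=0.1):
    """Leave-one-out contrastive loss sketch.

    `embeddings` is a list of (batch, dim) tensors, one per modality
    (e.g. BAS, ECG, respiratory). For each modality i, the anchor is
    that modality's embedding and the positive is the mean of the
    remaining modalities' embeddings for the same clip; other clips in
    the batch serve as negatives."""
    n = len(embeddings)
    labels = torch.arange(embeddings[0].shape[0])  # positives on the diagonal
    losses = []
    for i in range(n):
        anchor = F.normalize(embeddings[i], dim=-1)
        rest = torch.stack(
            [embeddings[j] for j in range(n) if j != i]
        ).mean(0)                                  # leave-one-out aggregate
        rest = F.normalize(rest, dim=-1)
        logits = anchor @ rest.T / temperature     # (batch, batch) similarities
        losses.append(F.cross_entropy(logits, labels))
    return torch.stack(losses).mean()

# Toy stand-ins for the three modality embeddings (batch of 8, 128-d).
bas, ecg, resp = (torch.randn(8, 128) for _ in range(3))
loss = leave_one_out_cl_loss([bas, ecg, resp])
```

Pairwise CL differs only in the inner loop: instead of one aggregate target per modality, a separate contrastive term is computed for every ordered pair of modalities.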

Results showed that the novel leave-one-out approach for contrastive learning significantly improved downstream task performance compared to representations from standard pairwise contrastive learning. A logistic regression model trained on SleepFM’s learned embeddings outperformed an end-to-end trained CNN in sleep stage classification and sleep-disordered breathing detection. Notably, the learned embeddings achieved 48% top-1 average accuracy in retrieving the corresponding recording clips of other modalities from 90,000 candidates.
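The linear-probe evaluation described above (a logistic regression trained on frozen embeddings) can be sketched with scikit-learn. The embeddings and labels here are synthetic stand-ins with illustrative shapes; only the probing recipe itself reflects the evaluation described in the article.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for frozen 128-d SleepFM embeddings with five
# sleep-stage labels (shapes are our assumption, not the paper's data).
X = rng.normal(size=(1000, 128))
y = rng.integers(0, 5, size=1000)
X[np.arange(1000), y] += 3.0  # inject class signal so the probe can learn

# Linear probe: the encoder stays frozen; only the logistic regression
# on top of the embeddings is fit.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
acc = probe.score(X_te, y_te)
```

Probing with a linear model is a standard way to measure the quality of self-supervised representations, since any separability must come from the pre-trained embeddings rather than from a powerful downstream classifier.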

This research represents the first attempt to build and evaluate a multi-modal foundation model for sleep analysis, highlighting the value of holistic multi-modal sleep modeling to fully capture the complexity of sleep recordings.

SleepFM is open source and available on the project’s GitHub. The paper SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals is on arXiv.


Author: Hecate He | Editor: Chain Zhang

1 comment on “Contrastive Learning Advances Sleep Science: Superior Multi-Modal Model Enhances Disorder Detection”

  1. Skyler Clooney

    How well would SleepFM’s representations transfer to real-world wearable sleep devices that use fewer or noisier sensors?