Detecting sleep outside the clinic using wearable heart rate devices

Detecting sleep using heart rate and motion data from multisensor consumer-grade wearables, relative to wrist actigraphy and polysomnography

Sleep

Study Objectives: Multisensor wearable consumer devices allowing the collection of multiple data sources, such as heart rate and motion, for the evaluation of sleep in the home environment, are increasingly ubiquitous. However, the validity of such devices for sleep assessment has not been directly compared to alternatives such as wrist actigraphy or polysomnography (PSG). Methods: Eight participants each completed four nights in a sleep laboratory, equipped with PSG and several wearable devices. Registered polysomnographic technologist-scored PSG served as ground truth for sleep–wake state. Wearable devices providing sleep–wake classification data were compared to PSG at both an epoch-by-epoch and night level. Data from multisensor wearables (Apple Watch and Oura Ring) were compared to data available from electrocardiography and a triaxial wrist actigraph to evaluate the quality and utility of heart rate and motion data. Machine learning methods were used to train and test sleep–wake...

Combining wearable and environmental sensing into an unobtrusive tool for long-term sleep studies

IHI'12 - Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium, 2012

Long-term sleep monitoring of patients has been identified as a useful tool for observing sleep trends that manifest themselves over weeks or months, for use in behavioral studies. In practice, this has been limited to coarse-grained methods such as actigraphy, in which activity levels are logged; these provide some insight but have also been found to lack the accuracy needed for studying sleep disorders. This paper presents a method to automatically detect the user's sleep at home on a long-term basis. Inertial, ambient light, and time data are tracked by a wrist-worn sensor, and additional night vision footage is used for later expert inspection. An evaluation on over 4400 hours of data from a focus group of test subjects demonstrates high-recall night segment detection, with an average recall of 94%. Further, a clustering method to visualize recurring sleep patterns is presented, and a myoclonic twitch detection method is introduced, which exhibits a precision of 74%. The results indicate that long-term sleep pattern detection is feasible.

Multi-Night at-Home Evaluation of Improved Sleep Detection and Classification with a Memory-Enhanced Consumer Sleep Tracker

Nature and Science of Sleep

Purpose: To evaluate the benefits of applying an improved sleep detection and staging algorithm on minimally processed multisensor wearable data collected from older generation hardware. Patients and Methods: 58 healthy East Asian adults aged 23-69 years (M = 37.10, SD = 13.03, 32 males) each underwent 3 nights of PSG at home, wearing 2nd Generation Oura Rings equipped with additional memory to store raw data from accelerometer, infra-red photoplethysmography and temperature sensors. 2-stage and 4-stage sleep classifications using a new machine-learning algorithm (Gen3) trained on a diverse and independent dataset were compared to the existing consumer algorithm (Gen2) for whole-night and epoch-by-epoch metrics. Results: Gen3 outperformed its predecessor with a mean (SD) accuracy of 92.6% (0.04), sensitivity of 94.9% (0.03), and specificity of 78.5% (0.11), corresponding to a 3%, 2.8% and 6.2% improvement from Gen2 across the three nights, with Cohen's d values >0.39, t values >2.69, and p values <0.01. Notably, Gen3 showed robust performance comparable to PSG in its assessment of sleep latency, light sleep, rapid eye movement (REM), and wake after sleep onset (WASO) duration. Participants <40 years of age benefited more from the upgrade, with less measurement bias for total sleep time (TST), WASO, light sleep and sleep efficiency compared to those ≥40 years. Males showed greater improvements on TST and REM sleep measurement bias compared to females, while females benefited more for deep sleep measures compared to males. Conclusion: These results affirm the benefits of applying machine learning and a diverse training dataset to improve sleep measurement of a consumer wearable device. Importantly, collecting raw data with appropriate hardware allows for future advancements in algorithm development or sleep physiology to be retrospectively applied to enhance the value of longitudinal sleep studies.

A Validation Study of a Commercial Wearable Device to Automatically Detect and Estimate Sleep

Biosensors

The aims of this study were to: (1) compare actigraphy (ACTICAL) and a commercially available sleep wearable (i.e., WHOOP) under two functionalities (i.e., sleep auto-detection (WHOOP-AUTO) and manual adjustment of sleep (WHOOP-MANUAL)) for two-stage categorisation of sleep (sleep or wake) against polysomnography, and; (2) compare WHOOP-AUTO and WHOOP-MANUAL for four-stage categorisation of sleep (wake, light sleep, slow wave sleep (SWS), or rapid eye movement sleep (REM)) against polysomnography. Six healthy adults (male: n = 3; female: n = 3; age: 23.0 ± 2.2 yr) participated in the nine-night protocol. Fifty-four sleeps assessed by ACTICAL, WHOOP-AUTO and WHOOP-MANUAL were compared to polysomnography using difference testing, Bland–Altman comparisons, and 30-s epoch-by-epoch comparisons. Compared to polysomnography, ACTICAL overestimated total sleep time (37.6 min) and underestimated wake (−37.6 min); WHOOP-AUTO underestimated SWS (−15.5 min); and WHOOP-MANUAL underestimated wake ...
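The Bland–Altman comparison named in this abstract can be illustrated with a minimal Python sketch. It assumes paired nightly totals in minutes from a device and from polysomnography; the function name `bland_altman` and the sample values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def bland_altman(device, reference):
    """Return (bias, lower limit, upper limit) of agreement for paired values.

    Bias is the mean of (device - reference); the 95% limits of agreement
    are bias +/- 1.96 times the sample standard deviation of the differences.
    """
    if len(device) != len(reference):
        raise ValueError("device and reference must have the same length")
    if len(device) < 2:
        raise ValueError("at least two paired nights are required")
    diffs = [d - r for d, r in zip(device, reference)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical total sleep time (minutes) for four nights.
device_tst = [420, 400, 450, 390]
psg_tst = [400, 390, 420, 380]
bias, lower, upper = bland_altman(device_tst, psg_tst)
print(f"bias = {bias:.1f} min, limits of agreement = [{lower:.1f}, {upper:.1f}]")
```

A positive bias here would mean the device overestimates the quantity relative to polysomnography, which is the direction reported above for ACTICAL and total sleep time.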

Deconstructing Commercial Wearable Technology: Contributions toward Accurate and Free-Living Monitoring of Sleep

Sensors (Basel, Switzerland), 2021

Despite prolific demands and sales, commercial sleep assessment is primarily limited by the inability to “measure” sleep itself; rather, secondary physiological signals are captured, combined, and subsequently classified as sleep or a specific sleep state. Using markedly different approaches compared with gold-standard polysomnography, wearable companies purporting to measure sleep have rapidly developed during recent decades. These devices are advertised to monitor sleep via sensors such as accelerometers, electrocardiography, photoplethysmography, and temperature, alone or in combination, to estimate sleep stage based upon physiological patterns. However, without regulatory oversight, this market has historically manufactured products of poor accuracy, and rarely with third-party validation. Specifically, these devices vary in their capacities to capture a signal of interest, process the signal, perform physiological calculations, and ultimately classify a state (sleep vs. wake) o...

Assessing the performance of a commercial multisensory sleep tracker

PLOS ONE, 2020

Wearable sleep technology allows for a less intrusive sleep assessment than PSG, especially in long-term sleep monitoring. Though such devices are less accurate than PSG, sleep trackers may still provide valuable information. This study aimed to validate a commercial sleep tracker, Garmin Vivosmart 4 (GV4), against polysomnography (PSG) and to evaluate intra-device reliability (GV4 vs. GV4). Eighteen able-bodied adults (13 females, M = 56.1 ± 12.0 years) with no self-reported sleep disorders were monitored simultaneously by GV4 and PSG for one night, while intra-device reliability was monitored in one participant for 23 consecutive nights. Intra-device agreement was considered sufficient (observed agreement = 0.85 ± 0.13, Cohen’s kappa = 0.68 ± 0.24). GV4 detected sleep with high accuracy (0.90) and sensitivity (0.98) but low specificity (0.28). Cohen’s kappa was calculated for sleep/wake detection (0.33) and sleep stage detection (0.20). GV4 significantly underestimated time a...

The Virtual Sleep Lab—A Novel Method for Accurate Four-Class Sleep Staging Using Heart-Rate Variability from Low-Cost Wearables

Sensors

Sleep staging based on polysomnography (PSG) performed by human experts is the de facto “gold standard” for the objective measurement of sleep. PSG and manual sleep staging are, however, personnel-intensive and time-consuming, and it is thus impractical to monitor a person’s sleep architecture over extended periods. Here, we present a novel, low-cost, automatized, deep learning alternative to PSG sleep staging that provides a reliable epoch-by-epoch four-class sleep staging approach (Wake, Light [N1 + N2], Deep, REM) based solely on inter-beat-interval (IBI) data. Having trained a multi-resolution convolutional neural network (MCNN) on the IBIs of 8898 full-night manually sleep-staged recordings, we tested the MCNN on sleep classification using the IBIs of two low-cost (

Performance of Four Commercial Wearable Sleep-Tracking Devices Tested Under Unrestricted Conditions at Home in Healthy Young Adults

Nature and Science of Sleep, 2022

Commercial wearable sleep-tracking devices are growing in popularity and in recent studies have performed well against gold standard sleep measurement techniques. However, most studies were conducted in controlled laboratory conditions. We therefore aimed to test the performance of devices under naturalistic unrestricted home sleep conditions. Participants and Methods: Healthy young adults (n = 21; 12 women, 9 men; 29.0 ± 5.0 years, mean ± SD) slept at home under unrestricted conditions for 1 week using a set of commercial wearable sleep-tracking devices and completed daily sleep diaries. Devices included the Fatigue Science Readiband, Fitbit Inspire HR, Oura ring, and Polar Vantage V Titan. Participants also wore a research-grade actigraphy watch (Philips Respironics Actiwatch 2) for comparison. To assess performance, all devices were compared with a high performing mobile sleep electroencephalography headband device (Dreem 2). Analyses included epoch-by-epoch and sleep summary agreement comparisons. Results: Devices accurately tracked sleep-wake summary metrics (ie, time in bed, total sleep time, sleep efficiency, sleep latency, wake after sleep onset) on most nights but performed best on nights with higher sleep efficiency. Epoch-by-epoch sensitivity (for sleep) and specificity (for wake), respectively, were as follows: Actiwatch (0.95, 0.35), Fatigue Science (0.94, 0.40), Fitbit (0.93, 0.45), Oura (0.94, 0.41), and Polar (0.96, 0.35). Sleep stage-tracking performance was mixed, with high variability. Conclusion: As in previous studies, all devices were better at detecting sleep than wake, and most devices compared favorably to actigraphy in wake detection. Devices performed best on nights with more consolidated sleep patterns. Unrestricted sleep TIB differences were accurately tracked on most nights. High variability in sleep stage-tracking performance suggests that these devices, in their current form, are still best utilized for tracking sleep-wake outcomes and not sleep stages. Most commercial wearables exhibited promising performance for tracking sleep-wake in real-world conditions, further supporting their consideration as an alternative to actigraphy.
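The epoch-by-epoch sensitivity and specificity figures quoted in these abstracts follow from a simple confusion-matrix calculation, sketched below in Python. The sketch assumes that both the reference and the device label each epoch as 1 (sleep) or 0 (wake); the function name `epoch_metrics` and the toy label sequences are hypothetical. Cohen's kappa is included because it corrects for chance agreement, which is why a device can show high accuracy yet a low kappa when most epochs are sleep.

```python
from collections import Counter


def epoch_metrics(reference, device):
    """Epoch-by-epoch agreement for binary labels (1 = sleep, 0 = wake)."""
    if len(reference) != len(device):
        raise ValueError("label sequences must have the same length")
    counts = Counter(zip(reference, device))
    tp = counts[(1, 1)]  # reference sleep, device sleep
    fn = counts[(1, 0)]  # reference sleep, device wake
    tn = counts[(0, 0)]  # reference wake, device wake
    fp = counts[(0, 1)]  # reference wake, device sleep
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both sleep and wake epochs")
    n = tp + fn + tn + fp
    accuracy = (tp + tn) / n
    # Chance agreement: product of the marginal proportions for each class.
    p_chance = ((tp + fn) / n) * ((tp + fp) / n) + ((tn + fp) / n) * ((tn + fn) / n)
    return {
        "sensitivity": tp / (tp + fn),  # proportion of sleep epochs detected
        "specificity": tn / (tn + fp),  # proportion of wake epochs detected
        "accuracy": accuracy,
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }


# Toy example: 8 sleep epochs and 2 wake epochs; the device misses one wake epoch.
reference = [1] * 8 + [0] * 2
device = [1] * 8 + [1, 0]
print(epoch_metrics(reference, device))
```

In this toy case the accuracy is 0.9 but the specificity is only 0.5, the same pattern of high sleep detection and weak wake detection that the device studies report.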

Towards Benchmarked Sleep Detection with Wrist-Worn Sensing Units

2014 IEEE International Conference on Healthcare Informatics, 2014

The monitoring of sleep by quantifying sleeping time and quality is pivotal in many preventive health care scenarios. A substantial number of wearable sensing products have been introduced to the market for just this reason, detecting whether the user is sleeping or awake. Assessing these devices for their accuracy in estimating sleep is a daunting task, as their hardware designs tend to differ and many are closed-source systems that have not been clinically tested. In this paper, we present a challenging benchmark dataset from an open source wrist-worn data logger that contains relatively high-frequency (100 Hz) 3D inertial data from 42 sleep lab patients, along with their data from clinical polysomnography. We analyse this dataset with two traditional approaches for detecting sleep and wake states and propose a new algorithm specifically for 3D acceleration data, which operates on a principle of Estimation of Stationary Sleep-segments (ESS). Results show that all three methods generally overestimate sleep, with our method performing slightly better (almost 79% overall median accuracy) than the traditional activity count-based methods.
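The abstract names the ESS principle without describing it, so the following Python sketch is only one plausible reading of "stationary sleep segments" and should not be taken as the authors' algorithm. It assumes a 1-D series of acceleration magnitudes, marks fixed-size windows as still when their standard deviation falls below a threshold, and labels as sleep only those runs of still windows that last long enough. The function name and all parameter values are hypothetical.

```python
from statistics import pstdev


def stationary_sleep_windows(magnitudes, window, std_threshold, min_windows):
    """Label each full window as sleep (True) or wake (False).

    Assumption: a window is "still" when the standard deviation of its
    samples is below std_threshold, and a run of still windows counts as
    sleep only if it spans at least min_windows consecutive windows.
    """
    n_windows = len(magnitudes) // window
    still = [
        pstdev(magnitudes[i * window:(i + 1) * window]) < std_threshold
        for i in range(n_windows)
    ]
    sleep = [False] * n_windows
    run_start = None
    # The trailing False is a sentinel that closes a run ending at the last window.
    for i, is_still in enumerate(still + [False]):
        if is_still and run_start is None:
            run_start = i
        elif not is_still and run_start is not None:
            if i - run_start >= min_windows:
                for j in range(run_start, i):
                    sleep[j] = True
            run_start = None
    return sleep


# Toy signal: three still windows, one window with movement, one still window.
signal = [1.0] * 12 + [1.0, 3.0, 0.5, 2.5] + [1.0] * 4
print(stationary_sleep_windows(signal, window=4, std_threshold=0.1, min_windows=2))
```

Because short still periods are discarded, a rule of this kind treats brief rest while awake as wake, but it would still count long motionless wakefulness as sleep, consistent with the tendency to overestimate sleep noted above.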

BiHeartS: Bilateral Heart Rate from multiple devices and body positions for Sleep measurement Dataset

arXiv (Cornell University), 2023

Sleep is the primary means of recovery from accumulated fatigue and thus plays a crucial role in fostering people's mental and physical well-being. Sleep quality monitoring systems are often implemented using wearables that leverage their sensing capabilities to provide sleep behaviour insights and recommendations to users. Building models to estimate sleep quality from sensor data is a challenging task, due to the variability across users of physiological data, perception of sleep quality, and daily routine. This challenge points to the need for a comprehensive dataset that includes information about the daily behaviour of users, physiological signals, and perceived sleep quality. In this paper, we try to narrow this gap by proposing the Bilateral Heart rate from multiple devices and body positions for Sleep measurement (BiHeartS) dataset. The dataset was collected in the wild from 10 participants over 30 consecutive nights. Both research-grade and commercial wearable devices were included in the data collection campaign. In addition, comprehensive self-reports were collected about sleep quality and daily routine.