A major expansion of real-world health data
The U.S. research ecosystem gained a notable new data resource this week with the publication of the All of Us Research Program’s wearables dataset in Nature Medicine. According to the paper, the dataset contains Fitbit data from more than 59,000 participants spanning 14 years, including more than 39 million step observations and 31 million sleep observations. Nearly half of the participants with Fitbit data also contributed electronic health records, physical measurements, genomics, and survey data.
That combination makes the release more than a large collection of consumer-device readouts. It creates a multimodal dataset that can potentially connect everyday behavioral and physiological signals to clinical outcomes, demographic context, and molecular data. For researchers studying digital biomarkers, sleep, exercise, chronic disease risk, and population health, the scope is significant.
Why this dataset matters
Wearables have long been seen as a way to move medical research beyond snapshots taken during clinic visits. Devices can capture continuous, real-world information about movement, sleep, and behavior over time. But many wearable datasets have a major weakness: they are often biased toward populations already more likely to buy and use such devices, typically wealthier and less diverse groups.
The All of Us paper explicitly addresses that problem. The authors frame the resource as one of the largest and most demographically rich digital health technology datasets assembled so far. The program’s mission has been to build a research cohort that can better reflect populations historically underrepresented in biomedical research. If the wearable component succeeds on those terms, it could help narrow one of digital medicine’s most persistent gaps: the mismatch between who generates the data and who is meant to benefit from the resulting insights.
Scale plus linkage is the key advantage
Large numbers alone do not make a dataset transformative. What elevates this release is the linkage. The paper says 46% of participants with Fitbit data also contributed electronic health records, physical measurements, genomics, and survey data. That means researchers can potentially study not only whether activity or sleep patterns vary across individuals, but whether those patterns track with diagnoses, treatment history, lab values, reported experiences, and genetic information.
In practical terms, that opens several research paths. Scientists can examine how digital measures relate to disease onset, progression, or recovery. They can test whether behavioral patterns differ across demographic groups in ways that matter for risk prediction. They can also evaluate whether signals derived from wearables perform consistently across populations, which is essential if digital biomarkers are going to support precision health rather than deepen existing inequities.
The paper describes the dataset as enabling research into relationships between digital health metrics and clinical outcomes while advancing digital health methodology through size, representation, and multimodal linkage. That is a careful way of saying the resource is useful both for studying disease and for stress-testing the methods behind digital health itself.
What researchers can learn from continuous data
Step counts and sleep records may sound simple, but when captured at scale over long periods they can become analytically powerful. Activity patterns can be associated with cardiovascular risk, metabolic disease, recovery trajectories, aging, and mental health. Sleep data can inform studies of circadian disruption, chronic illness burden, and links between rest patterns and downstream medical outcomes.
Because the dataset spans years, it may also help researchers study change rather than just status. Longitudinal data can reveal whether declining activity precedes diagnosis, whether sleep disruption accompanies treatment, or whether intervention effects appear in everyday life before they show up in traditional endpoints. That kind of temporal detail is one reason digital health data has attracted so much attention.
Still, the paper’s contribution is not a clinical claim that any one metric predicts a specific disease. It is the release of infrastructure: a dataset large enough and varied enough to let many groups test such questions rigorously.
The inclusion challenge in digital health
The authors note that digital health research has often been constrained by demographic bias. That challenge has implications far beyond fairness. If wearable data is disproportionately drawn from narrow populations, models built from it may generalize poorly. A digital biomarker that appears robust in one group may underperform in another. A prediction tool can look precise while embedding hidden blind spots.
By expanding the demographic reach of device-based data collection, All of Us is trying to change that starting point. The dataset will not, by itself, eliminate bias in research practice or model development. But it can make it harder to ignore representation as a methodological issue. In that sense, the release is important both scientifically and institutionally: it puts more responsibility on researchers to examine who their models work for.
What comes next
The true impact of the dataset will depend on how it is used. Resource papers often mark the beginning rather than the end of a story. The next phase will be shaped by the studies that draw on these records and by how carefully investigators handle issues such as missingness, device variation, behavioral confounding, and the limits of consumer-grade measurements.
Even so, the publication signals a maturing stage for digital health research. Instead of relying mainly on small proprietary datasets or narrowly recruited cohorts, scientists increasingly have access to large, linked, and more representative sources of real-world data. That changes what kinds of questions can be asked with credibility.
For the broader precision-health agenda, that shift is the point. Wearables are often marketed as personal wellness tools, but their larger scientific value lies in what they can reveal across populations over time when paired with robust clinical context. The All of Us release brings that possibility closer to routine research use.
A foundational resource rather than a headline result
There is no single blockbuster medical finding attached to this paper, but that does not diminish its importance. Foundational datasets rarely produce the most dramatic immediate headlines, but they often shape the next wave of discovery. By documenting a large wearable dataset with broad demographic scope and substantial linkage to other health data, the All of Us Research Program has created a resource that could influence digital medicine, epidemiology, and precision health for years.
Its value will ultimately be measured not by the number of device records alone, but by whether those records help produce better, more inclusive science. This release gives researchers the raw material to try.
This article is based on reporting by Nature Medicine.
Originally published on nature.com