Validity of the Apple Watch
🔬 Wearables in science: How accurate is the Apple Watch really?
A new living systematic review & meta-analysis (82 studies, >430,000 participants) provides one of the most comprehensive evaluations to date.
Here is the bottom line for researchers and physically active individuals:
📊 Key findings
❤️ Heart rate
→ High agreement with criterion methods
→ Small mean bias (≈ −0.27 bpm), but meaningful variability (≈ ±7 bpm)
→ Accuracy decreases with movement complexity and intensity
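For readers who want to run this kind of check on their own paired data: a mean bias together with a ± range is what a Bland-Altman-style agreement analysis typically reports. The sketch below is a minimal illustration under that assumption, not the review's actual method. The function name `bland_altman` and all readings are made up for illustration.

```python
from statistics import mean, stdev

def bland_altman(device, criterion):
    """Return (bias, lower_loa, upper_loa) for paired readings.

    bias is the mean of (device - criterion); the 95% limits of
    agreement are bias +/- 1.96 * sample SD of the differences.
    """
    diffs = [d - c for d, c in zip(device, criterion)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread

# Hypothetical paired heart-rate readings (bpm), NOT data from the review
watch_bpm = [72, 95, 121, 148, 160]
ecg_bpm = [73, 94, 124, 145, 163]

bias, lower, upper = bland_altman(watch_bpm, ecg_bpm)
print(f"bias = {bias:.2f} bpm, limits of agreement = [{lower:.2f}, {upper:.2f}] bpm")
```

A near-zero bias can coexist with wide limits: the average hides how far individual readings can stray from the criterion.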
🫁 Blood oxygen saturation (SpO₂)
→ Low average error
→ BUT wide limits of agreement (≈ ±4%)
→ Reduced validity in hypoxic ranges
🫀 Atrial fibrillation detection
→ High specificity (0.91)
→ Moderate sensitivity (0.79)
→ Many inconclusive readings, which is not negligible in practice
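To see how these numbers relate, here is a minimal sketch that computes sensitivity and specificity from a confusion matrix and tracks inconclusive readings separately. The counts are hypothetical, chosen only so that the output matches 0.79 and 0.91. The inconclusive count is invented, and how the review itself handled inconclusive readings is not stated here.

```python
def diagnostic_metrics(tp, fn, tn, fp, inconclusive=0):
    """Sensitivity and specificity from conclusive readings only,
    plus the share of all readings that were inconclusive."""
    total = tp + fn + tn + fp + inconclusive
    return {
        "sensitivity": tp / (tp + fn),   # true AF cases correctly flagged
        "specificity": tn / (tn + fp),   # non-AF cases correctly cleared
        "inconclusive_rate": inconclusive / total,
    }

# Hypothetical counts for illustration, NOT data from the review
metrics = diagnostic_metrics(tp=79, fn=21, tn=91, fp=9, inconclusive=50)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Excluding inconclusive readings from the denominators can make a device look better than it performs in everyday use, which is why it helps to report that rate alongside the other two.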
🔥 Energy expenditure
→ Poor validity
→ Errors frequently >20%
→ Not suitable for precise quantification
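One common way to express an "error in %" for energy expenditure is the mean absolute percentage error (MAPE) against a reference method. The sketch below assumes that metric for illustration. The post does not say which error measure the review used, and all values are made up.

```python
def mean_absolute_percentage_error(estimates, reference):
    """Average of |estimate - reference| / reference, in percent."""
    if len(estimates) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(e - r) / r for e, r in zip(estimates, reference)]
    return 100 * sum(ratios) / len(ratios)

# Hypothetical energy expenditure per session (kcal), NOT data from the review
watch_kcal = [240, 330, 600]
reference_kcal = [200, 300, 500]  # e.g. a criterion measurement

mape = mean_absolute_percentage_error(watch_kcal, reference_kcal)
print(f"MAPE = {mape:.1f} %")
```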
😴 Sleep
→ Good sleep vs wake detection
→ Weak differentiation of sleep stages
👣 Steps & activity metrics
→ Moderate accuracy
→ Context-dependent error
🧠 Interpretation for practice and research
⚙️ Metric matters
Direct physiological signals (e.g., HR via PPG) outperform derived metrics (e.g., energy expenditure).
🏃 Context matters
Accuracy declines with motion artefacts, intensity, and environmental factors.
🧬 Individual matters
Physiology (e.g., perfusion, skin contact) systematically influences measurement error.
📌 Practical implications
✔️ Useful for longitudinal monitoring and trends
✔️ Applicable for population-level research
⚠️ Limited for clinical decision-making without validation
❌ Not appropriate for precise energy expenditure or VO₂max assessment
📉 Take-home message
Wearables are not inherently valid or invalid.
They are metric-specific tools with context-dependent accuracy.
💬 Discussion point
Where should we currently draw the line between
👉 “good enough for practice”
vs
👉 “valid for science or clinical use”?
Full article: https://pubmed.ncbi.nlm.nih.gov/41513748/




