
Garmin VO2max Accuracy vs Lab Testing: The 6.85% MAPE Truth (and Why It Matters)

Garmin VO2max hits 6.85% MAPE vs lab CPET, but that number hides a fitness-level trap: highly trained athletes face roughly 10% error and a 6 ml/kg/min underestimate.

AthleteOS Data Science
TL;DR — The Answer

Independent lab studies put Garmin VO2max at 6.85% MAPE overall, but moderately trained athletes get 2.8-4.1% error while highly trained athletes face 9.4-10.4% MAPE with a 6.3 ml/kg/min underestimate. Individual-level error spans ±9.83 ml/kg/min (INTERLIVE meta-analysis), meaning changes under 4-5 points cannot be trusted as real. Use Garmin VO2max for direction and trends, not absolute values.

Your watch says 54. The lab says 61. You trained harder than ever this block, and the number went down.

That’s the Garmin VO2max experience for a lot of competitive athletes. The number isn’t useless. But it isn’t what Garmin’s marketing implies either. Here’s what independent research actually shows, and how to use wearable VO2max without fooling yourself.

What the 6.85% MAPE Number Actually Means

MAPE stands for Mean Absolute Percentage Error. It’s the average miss, across a group of athletes, between what the watch estimated and what a metabolic cart measured in a lab.

A 2023 study by Carrier et al. tested the Garmin fēnix 6 against spirometry-based lab data in 21 athletes. The result: 6.85% MAPE, with a concordance correlation coefficient of 0.70. For a mid-fitness recreational runner with a true VO2max of 50 ml/kg/min, that’s an average miss of about 3.4 points. Not bad.
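To make the definition concrete, here is a minimal Python sketch of how MAPE is computed from paired watch and lab readings. The sample readings are hypothetical and only illustrate the arithmetic; they are not data from Carrier et al.

```python
def mape(estimates, references):
    """Mean absolute percentage error of estimates against reference values."""
    if len(estimates) != len(references) or not references:
        raise ValueError("need two equal-length, non-empty sequences")
    errors = [abs(est - ref) / ref for est, ref in zip(estimates, references)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical watch vs. lab readings in ml/kg/min (illustrative only).
watch = [48.0, 52.0, 57.0, 44.0]
lab = [50.0, 50.0, 60.0, 40.0]
print(round(mape(watch, lab), 2))

# The arithmetic from the paragraph above: 6.85% of a true 50 ml/kg/min.
print(round(0.0685 * 50, 1))
```

Note that the absolute value removes the sign of each miss, so MAPE says nothing about whether the watch reads high or low.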

But MAPE is a group average. It hides the individual spread.

The INTERLIVE meta-analysis pooled 14 studies and 403 participants. Group-level bias was near zero (-0.09 ml/kg/min). That sounds precise. The individual-level limits of agreement, though, spanned -9.92 to +9.74 ml/kg/min. Translation: your specific reading, on a specific day, could be nearly 10 points too high or 10 points too low.
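Limits of agreement come from a Bland-Altman analysis: take the watch-minus-lab difference for each athlete, then report the mean difference (bias) plus or minus 1.96 standard deviations. The sketch below shows that calculation for a single sample, using hypothetical readings. A pooled meta-analysis such as INTERLIVE combines studies with its own method, so treat this as an illustration of the concept rather than a reproduction of their numbers.

```python
from statistics import mean, stdev


def limits_of_agreement(estimates, references):
    """Return (bias, lower, upper) for 95% Bland-Altman limits of agreement."""
    diffs = [est - ref for est, ref in zip(estimates, references)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - spread, bias + spread


# Hypothetical paired readings in ml/kg/min (illustrative only).
watch = [45.0, 58.0, 49.0, 61.0, 52.0, 47.0]
lab = [50.0, 52.0, 47.0, 68.0, 49.0, 43.0]
bias, lower, upper = limits_of_agreement(watch, lab)
print(f"bias={bias:+.2f}, LoA={lower:+.2f} to {upper:+.2f}")
```

A small bias with wide limits is exactly the pattern described above: the misses cancel out across the group but remain large for any one athlete.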

A lab CPET has a test-retest coefficient of variation of just 1.98%, an ICC of 0.984, and a minimum detectable difference of 2.14 ml/kg/min (Succi et al., 2023). Garmin’s individual limits of agreement span 19.66 ml/kg/min (-9.92 to +9.74), a noise band roughly 9 times wider than the lab’s 2.14 ml/kg/min repeatability window. Think of the lab as a precise thermometer and the watch as a mood ring that usually points in the right direction.

Garmin VO2max Accuracy vs Lab: Error by Device and Scenario

| Device and scenario | MAPE |
| --- | --- |
| Garmin fēnix 6S (best case) | 5.64% |
| Garmin fēnix 6 (30s avg criterion) | 7.05% |
| Garmin Forerunner 245, mod. trained | 2.8-4.1% |
| Garmin Forerunner 245, highly trained | 9.4-10.4% |
| Garmin Forerunner 920XT | 7.3% |
| Polar V800 (resting algorithm) | 13.2% |
| Apple Watch Series 7 (overall) | 15.79% |
| Apple Watch Series 7 (fit athletes) | 21.47% |

Lower MAPE = closer to lab. Sources: Carrier 2023, Engel 2025, Passler 2019, Caserman 2024.

The Fitness-Level Trap in Garmin VO2max Accuracy

Here’s what most watch reviews miss. Garmin doesn’t make one accuracy claim. It makes a spectrum of claims depending on who you are.

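Engel et al., 2025 split 35 runners into two groups at a VO2max cut-off of 59.8 ml/kg/min. The results were striking: the moderately trained group saw 2.8-4.1% MAPE, while the highly trained group saw 9.4-10.4% MAPE and a 6.3 ml/kg/min underestimate.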

Three independent studies tell the same story across different Garmin models and fitness-tier cut-offs:

| Study | Low VO2max (<45 ml/kg/min) | Moderate (45–55) | High (>55–60) |
| --- | --- | --- | --- |
| Engel 2025 (Forerunner 245) | n/a | 2.8–4.1% MAPE | 9.4–10.4% MAPE |
| Düking 2022 (Forerunner 245) | 7.1% (overestimate) | 4.1% MAPE | 6.2% MAPE (underestimate) |
| Passler 2019 (Forerunner 920XT) | n/a | 7.3% overall | Higher in fit subgroup |

Sources: Engel et al. 2025, Düking et al. 2022, Passler et al. 2019.

Six points is not noise. A runner at a true 65 ml/kg/min might consistently see 58-59 on their watch. That’s a meaningful gap if you’re using the number to guide training intensity, set zones, or compare yourself to competitors.

The pattern makes sense once you understand the algorithm.

How Garmin’s Algorithm Actually Works

Garmin’s VO2max engine comes from Firstbeat Technologies. Here’s the core idea, stripped of marketing language.

The watch takes your GPS speed and your heart rate from each run segment. It uses a linear relationship between pace and oxygen cost to estimate how much oxygen each segment requires. Then it asks: “Given this pace and this heart rate, what VO2max does this athlete need to have?”

Each segment gets a reliability score. Segments where heart rate and speed are poorly correlated (downhill running, traffic stops, cardiovascular drift on long runs) get discarded. The final VO2max estimate is a reliability-weighted average of the segments that passed.
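The steps above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Firstbeat's actual implementation: the oxygen cost uses the ACSM level-running equation as the assumed linear pace-to-cost relationship, the scale-up to maximum assumes oxygen uptake is proportional to heart rate, and the `Segment` class, `reliability` score, and `min_reliability` threshold are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speed_m_per_min: float  # GPS speed over the segment
    heart_rate: float       # mean heart rate in bpm
    reliability: float      # 0.0 to 1.0, hypothetical quality score


def segment_vo2max(seg, max_hr):
    """Toy per-segment estimate: oxygen cost scaled up to max heart rate."""
    # ACSM level-running equation, used here as the assumed linear cost model.
    oxygen_cost = 0.2 * seg.speed_m_per_min + 3.5
    # Assumption: oxygen uptake scales in proportion to heart rate.
    return oxygen_cost * max_hr / seg.heart_rate


def estimate_vo2max(segments, max_hr, min_reliability=0.5):
    """Reliability-weighted average over segments that pass the threshold."""
    kept = [s for s in segments if s.reliability >= min_reliability]
    if not kept:
        return None
    total_weight = sum(s.reliability for s in kept)
    weighted = sum(s.reliability * segment_vo2max(s, max_hr) for s in kept)
    return weighted / total_weight


segments = [
    Segment(200.0, 150.0, 0.9),
    Segment(220.0, 160.0, 0.8),
    Segment(260.0, 140.0, 0.2),  # e.g. a downhill stretch, discarded
]
print(round(estimate_vo2max(segments, max_hr=185.0), 1))
```

Even in this simplified form, the structure shows why input quality matters: every kept segment is scaled by the configured max HR and divided by the measured heart rate, so errors in either flow straight into the result.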

The problem for elite runners: their heart rate barely moves across a wide range of paces. The algorithm extrapolates from a narrow HR-speed relationship to a theoretical maximum, and that extrapolation breaks down at the top end. The watch guesses at a number it can’t actually observe, and it guesses low.

That explains the 6.3 ml/kg/min underestimate. The watch isn’t broken. It’s doing math with inputs that don’t have enough range to resolve the question accurately.

The Firstbeat white paper also notes that a 15 bpm error in your max HR setting inflates VO2max error by 7-9%. If your watch thinks your max HR is 185 and it’s actually 172, that 13 bpm gap compounds everything downstream. Age-predicted formulas are unreliable, so set your max HR from a real max-effort workout.
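A rough way to see why max HR matters so much: if an estimate is scaled up in proportion to the configured max HR (an assumption made here for illustration, not a description of Firstbeat's internals), then a wrong max HR shifts the result by the same percentage. The function name below is hypothetical.

```python
def vo2max_error_pct(true_max_hr, configured_max_hr):
    """Percent error in a proportional estimate caused by a wrong max HR."""
    # Under the assumed model VO2max ~ oxygen_cost * max_hr / heart_rate,
    # so the estimate scales linearly with the configured max HR.
    return 100.0 * (configured_max_hr - true_max_hr) / true_max_hr


# A 15 bpm overestimate of max HR at three different true values.
for true_hr in (170, 185, 200):
    print(true_hr, round(vo2max_error_pct(true_hr, true_hr + 15), 1))

# The example from the text: watch set to 185, true max HR 172.
print(round(vo2max_error_pct(172, 185), 1))
```

In this simplified model a 15 bpm error lands between about 7.5% and 9% for typical adult max heart rates, which is in the same neighborhood as the white paper's 7-9% figure.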

What Garmin Gets Right: Device Comparison

Against other consumer options, Garmin holds up well. Passler et al., 2019 put the Garmin Forerunner 920XT at 7.3% MAPE and the Polar V800 at 13.2% in the same study. The difference comes from the algorithm type: Garmin uses an exercise-based approach that requires you to actually run. Polar’s older resting-based method guesses from HR at rest, which is far less informative.

Apple Watch Series 7 fares worst in the published data. Caserman et al., 2024 found 15.79% MAPE overall. In the excellent-fitness subgroup, it hit 21.47% with a 12.00 ml/kg/min underestimate. That’s not a precise fitness metric. That’s a rough directional indicator at best.

| Device | Algorithm Type | Overall MAPE | Fit Athlete MAPE | Bias Direction |
| --- | --- | --- | --- | --- |
| Garmin fēnix 6 (1-min avg) | Exercise-based | 6.85% | ~10% (estimated) | Underestimates fit athletes |
| Garmin Forerunner 245, mod. trained | Exercise-based | ~4.1% | n/a | Near zero |
| Garmin Forerunner 245, highly trained | Exercise-based | ~10.4% | 10.4% | -6.3 ml/kg/min |
| Garmin Forerunner 920XT | Exercise-based | 7.3% | n/a | -2.1 ml/kg/min |
| Polar V800 | Resting-based | 13.2% | n/a | +3.0 ml/kg/min |
| Apple Watch Series 7 | Resting-based | 15.79% | 21.47% | -12.0 ml/kg/min |

Sources: Carrier 2023, Engel 2025, Passler 2019, Caserman 2024.

Garmin wins this comparison. That’s not the same as saying Garmin is accurate enough to replace a lab.

Case Study: Marcus, 41, Ironman Athlete

Marcus had a true lab VO2max of 63 ml/kg/min, confirmed at a sports medicine clinic before his Ironman build. His Garmin Forerunner 945 read 57, then 56, then 58 across three months of solid training. He was confused. His fitness score (CTL) in AthleteOS was climbing from 72 to 94. His pace at threshold was improving by 12 seconds per mile. But the VO2max number barely moved.

He wasn’t getting less fit. He was getting more fit. Garmin was just underestimating him by 5-7 points, consistent with what Engel’s research predicts for his fitness tier.

His mistake was anchoring on the watch number. When he switched to tracking pace-at-threshold and his fitness score trend, the picture became clear. He raced to a 4:52 Ironman bike split on a goal of 5:00. His VO2max number that day? The watch said 57.

The watch wasn’t tracking his fitness. His fitness was outrunning the watch’s ability to measure it.

The Noise Floor: What Change Is Real?

Given individual limits of agreement (LoA) of ±9.83 ml/kg/min, a meaningful change threshold needs to be higher than typical week-to-week fluctuation.

Conservative rule: a real change requires movement of at least 4-5 ml/kg/min sustained across multiple sessions over 6-8 weeks. Not a 1-point Tuesday-to-Sunday shift.

Meaningful Garmin VO2max change = sustained movement of 4-5 points over 6-8 weeks
Lab CPET meaningful change = 2.14 ml/kg/min (minimum detectable difference)

A lab can detect a 2-point improvement. Your watch needs a 4-5 point shift before you can trust it’s real. That’s a useful filter, not a reason to throw the watch away.
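One way to apply the conservative rule is to compare the average of your first few weekly readings with the average of your last few, and only call it a change if the gap clears the threshold. The sketch below is a minimal illustration, assuming one reading per week; the function name, the 3-week averaging window, and the exact 4.0-point default are assumptions chosen to match the rule of thumb, not a validated statistical test.

```python
from statistics import mean


def sustained_change(weekly_values, min_shift=4.0, window=3):
    """Compare the mean of the first and last `window` weeks of readings.

    Returns the shift in ml/kg/min if it meets `min_shift` over a span of
    at least six weeks, otherwise None.
    """
    if len(weekly_values) < max(6, 2 * window):
        return None  # not enough history to judge
    start = mean(weekly_values[:window])
    end = mean(weekly_values[-window:])
    shift = end - start
    return shift if abs(shift) >= min_shift else None


# Hypothetical weekly watch readings (one value per week).
noisy_plateau = [52, 53, 51, 52, 53, 52, 51, 53]
real_gain = [47, 48, 47, 49, 50, 51, 52, 52]
print(sustained_change(noisy_plateau))
print(sustained_change(real_gain))
```

Averaging over a window rather than comparing two single readings is the point of the design: it keeps a one-point bounce between sessions from being mistaken for progress.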

What your watch can do well: track a 12-month trajectory. A reader who was at 43 a year ago and is now reading 51 has almost certainly made real aerobic gains. The trend over months is more reliable than any single value.

Understanding the aerobic base building that drives VO2max improvements helps you interpret these trends correctly. So does tracking training load with CTL and ATL, which gives you parallel confirmation that real fitness is accumulating.

How AthleteOS Handles Noisy VO2max Data

AthleteOS treats your Garmin VO2max as a noisy estimator. It doesn’t report the raw number as ground truth.

Instead, AthleteOS’s session analysis cross-references three signals:

  1. Garmin VO2max trend over a rolling 6-week window
  2. Pace-at-threshold change across comparable workouts
  3. HRV baseline trend (rising HRV with rising load = positive adaptation)

When Garmin VO2max rises but pace-at-threshold isn’t moving, AthleteOS flags it as a likely estimator artifact. When Garmin VO2max is flat but pace-at-threshold is improving alongside a rising fitness score, AthleteOS surfaces the threshold trend as the real signal and notes that the watch is probably underestimating a fit athlete.
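The cross-referencing logic described above can be pictured as a small decision rule. The sketch below is purely illustrative and is not AthleteOS's implementation: the function name, the threshold values, and the output labels are all hypothetical, chosen only to mirror the two scenarios in the preceding paragraph.

```python
def interpret_trends(vo2max_delta, threshold_pace_delta_s, hrv_delta,
                     min_vo2max_shift=4.0, min_pace_shift_s=5.0):
    """Classify a 6-week block from three trend deltas.

    vo2max_delta: change in watch VO2max (ml/kg/min)
    threshold_pace_delta_s: change in threshold pace (s per mile, negative = faster)
    hrv_delta: change in HRV baseline (ms, positive = rising)
    """
    vo2max_up = vo2max_delta >= min_vo2max_shift
    pace_faster = threshold_pace_delta_s <= -min_pace_shift_s
    if vo2max_up and not pace_faster:
        return "likely estimator artifact"
    if not vo2max_up and pace_faster:
        return "threshold trend is the signal; watch may be underestimating"
    if vo2max_up and pace_faster:
        return "consistent gain" if hrv_delta >= 0 else "gain with falling HRV"
    return "no clear change"


# Watch nearly flat, threshold pace 12 s per mile faster, HRV rising.
print(interpret_trends(1.0, -12.0, 3.0))
# Watch up 5 points, threshold pace unchanged.
print(interpret_trends(5.0, 0.0, 0.0))
```

The first call resembles the flat-watch, faster-threshold pattern; the second resembles a watch-only rise with no performance change behind it.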

This is exactly the scenario that affects athletes at or above roughly 60 ml/kg/min, where Garmin’s MAPE reaches 9.4-10.4%.

The chest strap vs optical HR accuracy question matters here too. Garmin’s algorithm uses heart rate as its primary input. Optical HR errors from wrist sensors compound the VO2max error. Using a chest strap during the runs that update your VO2max estimate improves accuracy meaningfully, particularly at high intensities where optical sensors lose reliability.

Your VO2max estimate is only as good as your HR data. And your HR data is only as good as how you measured it.


The watch gives you a direction. The lab gives you coordinates. Know which one you’re holding before you navigate.


Start tracking the metrics that actually move with training at AthleteOS and see how your Garmin data compares to your real fitness trend.

Frequently Asked Questions

How accurate is Garmin VO2max compared to a lab test?

Overall MAPE is around 6.85% in independent studies. That sounds fine, but individual-level error spans ±9.83 ml/kg/min, meaning any single reading could be nearly 10 points off for you specifically.

Does Garmin VO2max underestimate or overestimate?

Both, depending on your fitness. Low-fitness athletes (VO2max under 45) tend to get overestimates. High-fitness athletes (VO2max over 55-60) get underestimates of around 6 ml/kg/min (6.3 in Engel 2025). The middle range (45-55) is where Garmin is most accurate.

What is a meaningful change in Garmin VO2max?

Given individual-level noise of ±9.83 ml/kg/min, a single-point week-to-week change is not meaningful. Look for sustained movement of 4-5 points or more across 6-8 weeks of consistent training.

Is Garmin VO2max more accurate than Apple Watch?

Yes, significantly. Apple Watch Series 7 shows 15.79% MAPE overall and 21.47% error for fit athletes. In the published data, Garmin's Firstbeat-based exercise algorithm outperforms Apple's resting-based approach both overall and among fit athletes.

Does max heart rate affect Garmin VO2max accuracy?

Yes. A 15 bpm error in your max HR setting inflates VO2max error by 7-9%. Set your max HR from a real maximal effort, not age-predicted formulas.

Can I use Garmin VO2max to qualify for races or predict marathon time?

Not directly. The individual error is too wide for precise predictions. It works well as a trend indicator over weeks and months, but pair it with pace-at-threshold and HRV data for a more reliable fitness picture.

