Blog post by Marco Altini
We've tried something new to bring VO2max estimates to cyclists and not only runners in HRV4Training [this feature will be available by the end of April, 2017] 🚴 🚴🏻♀️ 📱 🔬
In this post we provide an overview of VO2max, explain what the estimate is good for, and describe our data-driven approach to bringing the same feature to cyclists using the app.
Let's start with the basics.
What's a VO2max estimate?
Direct measurement of oxygen volume during maximal exercise, or VO2max, is the gold standard for cardiorespiratory fitness assessment.
VO2max testing has a series of practical limitations (plus others due to common misunderstandings around this variable; see the next section for more on this): the need for specialized personnel, expensive medical equipment, high motivational demands on the subject, health risks for subjects in non-optimal health conditions (which limits applicability), and so on. Even when testing conditions are not a problem, performing a maximal test until exhaustion just to monitor fitness level might interfere with your current training program.
For these reasons, scientists have been working on submaximal tests, that is, tests that do not require maximal effort and use easy-to-acquire parameters to determine VO2max. Any model that provides a VO2max value from parameters other than your measured oxygen uptake during a test to exhaustion produces an estimate.
Submaximal tests were first developed more than 60 years ago to estimate VO2max during specific protocols while monitoring HR at predefined workloads. Basically, these tests rely on the inverse relation between fitness and HR, with higher HR (for a given workload) typically associated with lower fitness level and vice versa. Contextualizing heart rate (HR), e.g. determining the HR during specific activities, was a good step forward in terms of practical applicability compared to maximal tests. However, some limitations still apply: the test needs to be repeated every time fitness needs to be assessed, a predefined protocol is still required, etc.
Ideally, we would like to keep track of VO2max or cardiorespiratory fitness without the need to perform a specific test. As technology improved and sensors able to acquire accelerometer, GPS and HR data in free-living became widely available, during my PhD I developed several machine learning models that do just that for the general population, without even including exercise data (basically using HR while walking at different intensities/locations as a predictor of fitness, see [4, 5, 6] for details).
My results, as well as attempts by others to estimate VO2max from rest data such as HR or HRV, clearly show that using only rest data is insufficient to estimate VO2max with good accuracy. This is why we did not introduce VO2max estimation models earlier, and also why the feature was enabled only for runners using Strava and a HR monitor during training. Including workout data, and more specifically heart rate data at a certain workload / effort, is key to providing accurate estimates.
What is this estimate good for?
I will leave it to others to discuss the limitations of VO2max as a measure of human performance across individuals (and even within individuals). See Magness, Noakes, and others, who do a great job explaining the complexities of oxygen consumption and running efficiency, and how the scientific community has been giving a bit too much credit to this variable in the past decades.
What I would like to do here is highlight how the estimate can be very informative at both the population and the individual level, because what it relies on is contextualized physiological data under submaximal effort, the real parameter of interest for us.
The idea is that tracking VO2max over time, as estimated from submaximal heart rate, can provide a proxy for performance / fitness and therefore help you understand whether you are getting in better shape and can potentially race faster, just by using available training data and therefore without putting additional stress on the body with specific tests.
For runners, cyclists or triathletes, for example, as training improves aerobic capacity and heart rate lowers at a given intensity, VO2max estimates track well with improvements in fitness and performance as determined in racing events. This is true for athletes of any level: you can easily find logs of Ironman champions going through a base phase that gradually lowers their heart rate at easy intensities, as well as recreational athletes improving their fitness in a similar way.
So if submaximal heart rate (e.g. your heart rate while running at a certain pace) is the real variable of interest, why do we use it to estimate VO2max instead of just providing it?
The reason is to make it easier to interpret. Outside of lab settings we can't get everyone to run at the same pace as you would in the lab, so submaximal heart rate has to be expressed as a feature computed as pace / heart rate, a number that represents fitness but is not 'meaningful to a human'. We introduced the pace to heart rate ratio in a recent publication to contextualize heart rate by effort (or workload) and showed that it is the best predictor of VO2max (compared to anthropometric data and resting physiological data).
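As a rough illustration of such a feature, the sketch below computes a speed to heart rate ratio for a single workout. The post does not spell out units or the exact formula, so the function name, the use of speed in metres per minute (rather than pace in minutes per kilometre), and the sample numbers are all assumptions made for this example.

```python
from statistics import mean


def speed_to_hr_ratio(distance_m, duration_s, heart_rates_bpm):
    """Metres covered per heartbeat for one workout.

    Assumed form of the ratio: average speed (m/min) divided by
    average heart rate (beats/min). Higher values suggest better fitness.
    """
    if duration_s <= 0 or not heart_rates_bpm:
        raise ValueError("workout needs a positive duration and heart rate samples")
    speed_m_per_min = distance_m / (duration_s / 60.0)
    return speed_m_per_min / mean(heart_rates_bpm)


# Two hypothetical 10 km runs by the same athlete, both completed in 50 minutes.
before = speed_to_hr_ratio(10_000, 3_000, [150, 152, 154])
after = speed_to_hr_ratio(10_000, 3_000, [140, 142, 144])  # same speed, lower HR

print(f"before: {before:.3f} m/beat, after: {after:.3f} m/beat")
```

The second run covers more ground per heartbeat, which is the kind of change the ratio is meant to capture even when the two workouts are not run under identical conditions.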
Using this predictor to estimate VO2max brings things back to numbers we are more accustomed to, and I believe a bit of standardization can help, regardless of all the flaws of VO2max. Once we understand why we estimate it, what it is based on, and what it can be used for (e.g. tracking progress over time), this estimate can be a nice feature to look at from time to time to track changes in fitness.
How do you build a VO2max estimation model?
To build a model able to predict VO2max from certain parameters, you need to collect a dataset, including the following:
These parameters, also called predictors, need to be such that we can acquire them in unsupervised free-living settings with minimal burden on the user, as we do not want users to have to do specific lab tests or protocols even in free-living. This is why we came up with the pace to heart rate ratio, so that each unsupervised free-living GPS workout collected via, for example, Strava could be used to estimate VO2max regardless of an athlete's ability.
Why couldn't we develop the same for cyclists before? For the simple reason that while we had VO2max reference data, we did not have heart rate and power data while cycling in our dataset, hence we could not build a model between these variables and deploy it to new users.
User-generated data to the rescue
How did we overcome this problem? We've now tried something new to bring VO2max estimation to cyclists, by using a data-driven approach and relying on our growing community of triathletes. In particular, we used triathletes' data, as it includes both running and cycling, to model the relation between estimated VO2max (from free-living running data) and cycling-related variables, such as power and heart rate while riding.
In this way we could build new models to estimate VO2max relying ONLY on cycling-related variables, and deploy such models to all cyclists using the app.
Here is an overview of this approach:
In particular, here is an overview of the anthropometrics data of the included triathletes:
This dataset includes 400 people who have used HRV4Training for several months, each with a minimum of 30 cycling workouts with power and heart rate data and 30 running workouts with GPS and heart rate data, plus resting physiological measurements.
As we estimate VO2max from running variables and then use this estimate as our new reference, we would first like to make sure that our new VO2max reference is a good proxy of human performance, as we've shown in our latest publication. Below is the relation between VO2max estimated from running variables and half marathon time on this dataset, showing once again a strong correlation and giving us confidence that the estimated VO2max can be used for this purpose:
Never forget that this is user-generated data acquired in uncontrolled free-living settings. We did not take 10 people and have them run a half marathon all out (the classic sport-science study). We did not do any intervention or ask anyone to follow any protocol. Users trained according to their training plans, and we extracted the best half marathon time over periods of 3 months to a year depending on available data, resulting in the 400 data points shown above. Hence the data are noisy: some users might never have run a half marathon at intense effort, others might have a noisy heart rate signal, maybe acquired via PPG with a watch. The amount of data is large enough that, despite the unsupervised settings, we can capture a strong relation between estimated VO2max and running performance, if such a relation exists, as shown above.
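The strength of a relation like this is commonly summarized with a correlation coefficient. The post does not say which statistic was used, so the sketch below assumes Pearson's r, and the five data points are made up purely to show the computation (they are not HRV4Training data).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical athletes: estimated VO2max (ml/kg/min) and best half marathon (minutes).
vo2max = [42.0, 47.5, 51.0, 55.5, 60.0]
half_marathon_min = [125.0, 112.0, 104.0, 96.0, 88.0]

r = pearson_r(vo2max, half_marathon_min)
print(f"r = {r:.2f}")  # negative: higher estimated VO2max, shorter race time
```

With noisy free-living data the coefficient would be further from -1 than in this toy example; what matters is that the sign and rough magnitude hold up across many users.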
At this point we built different models and eventually settled on the best performing one, including age, BMI, gender, average power, average heart rate and resting heart rate as predictors of VO2max for cyclists.
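To make the idea concrete, here is a minimal sketch of fitting a model on these six predictors. The post does not state which model family was chosen, so ordinary least squares linear regression is an assumption here, as are the column names, the 0/1 gender coding, and the synthetic data (generated from arbitrary coefficients only to check that the fit recovers them).

```python
import random

# Hypothetical column names for the six predictors listed in the post.
PREDICTORS = ["age", "bmi", "is_male", "avg_power_w", "avg_hr_bpm", "rest_hr_bpm"]


def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    X = [[1.0] + [float(v) for v in row] for row in rows]  # prepend intercept
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]  # intercept first


def predict(coefs, row):
    """Apply fitted coefficients (intercept first) to one row of predictors."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Synthetic triathletes; every number below is made up for illustration.
rng = random.Random(0)
rows = [
    [rng.uniform(20, 60), rng.uniform(19, 28), i % 2,
     rng.uniform(120, 280), rng.uniform(120, 160), rng.uniform(40, 65)]
    for i in range(20)
]
ARBITRARY_COEFS = [50.0, -0.2, -0.5, 3.0, 0.08, -0.05, -0.1]  # not the app's model
targets = [predict(ARBITRARY_COEFS, row) for row in rows]  # stand-in for running-based VO2max

coefs = fit_linear(rows, targets)
for name, value in zip(["intercept"] + PREDICTORS, coefs):
    print(f"{name:>12}: {value:.3f}")
```

In the real setting the targets would be the VO2max values estimated from each triathlete's running data, and the fitted coefficients would then be applied to cyclists who have no running data at all.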
Below is a cross-validation (leave one subject out) on triathletes' data, where we used as reference the VO2max estimated from running data, and we predicted VO2max using cycling-related variables, then compared the estimated VO2max from cycling parameters with the estimated VO2max from running parameters.
What we expected (or rather hoped for) here is that VO2max estimates coming from different sets of parameters (pace and heart rate for running, power and heart rate for cycling) would be very similar, as the goal of the estimate is to capture a person's fitness level. Looking at the plot above, this is indeed the case.
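The leave-one-subject-out procedure described above can be sketched as follows. To keep the example short it uses a single hypothetical cycling feature and a simple straight-line fit instead of the six-predictor model; the subject IDs, feature values and reference VO2max values are all made up.

```python
from math import sqrt


def fit_line(points):
    """Least-squares straight line through (x, y) points; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical; cannot fit a line")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return slope, mean_y - slope * mean_x


def leave_one_subject_out(data):
    """data maps subject id -> list of (feature, reference) rows.

    For each subject, fit on everyone else and predict that subject's rows.
    Returns a list of (reference, predicted) pairs.
    """
    pairs = []
    for held_out in data:
        train = [p for subject, rows in data.items() if subject != held_out for p in rows]
        slope, intercept = fit_line(train)
        pairs.extend((y, slope * x + intercept) for x, y in data[held_out])
    return pairs


def rmse(pairs):
    """Root mean squared error between reference and predicted values."""
    return sqrt(sum((ref - pred) ** 2 for ref, pred in pairs) / len(pairs))


# Hypothetical rows: (cycling feature, VO2max estimated from running data).
data = {
    "a": [(1.20, 44.0), (1.25, 45.0)],
    "b": [(1.40, 48.5)],
    "c": [(1.60, 52.0)],
    "d": [(1.80, 57.0)],
    "e": [(2.00, 60.5)],
}

pairs = leave_one_subject_out(data)
print(f"RMSE over {len(pairs)} held-out rows: {rmse(pairs):.2f} ml/kg/min")
```

Holding out all rows of a subject at once, rather than single rows, keeps a person's own data from leaking into the model that is evaluated on them, which matters when each user contributes many workouts.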
In this post we showed how we could rely on user-generated data from triathletes to develop new features from data acquired only under uncontrolled free-living settings. Personally, I think this is one of the most interesting aspects of deploying a validated technology in the hands of thousands of people, and then analyzing the data to build new models and features, going beyond what is possible in small-scale clinical studies.
From a practical point of view, what matters here is the ability to capture submaximal physiological responses to a certain effort (pace or power), which can be translated into a fitness level and used to track improvements in aerobic capacity over time. We do not care about the specificity of one test or the other (cycling or running), efficiency, muscle fatigue, economy, etc., as it is impossible anyway to take such differences into account while estimating VO2max from variables other than oxygen uptake during an actual maximal workout (and sometimes not even with such a test). The estimate should be used at the individual level to track changes over time, not as an absolute marker of fitness across individuals, as there is obviously much more to human performance than VO2max or submaximal heart rate. Even though the estimate itself correlates quite well with actual running performance at the population level, there is much individual variability.
It is important to understand what this estimate is about and what it can be used for. Hopefully this post provides some clarity around the controversial world of VO2max measurement and estimation, giving you some confidence that you can use the estimate to track progress.
We hope you'll enjoy the feature.
[1] L. Vanhees, J. Lefevre, R. Philippaerts, M. Martens, W. Huygens, T. Troosters, and G. Beunen, "How to assess physical activity? How to assess physical fitness?" European Journal of Cardiovascular Prevention & Rehabilitation, vol. 12, no. 2, pp. 102–114, 2005.
[2] V. Noonan and E. Dean, "Submaximal exercise testing: clinical application and interpretation," Physical Therapy, vol. 80, no. 8, pp. 782–807, 2000.
[3] P. O. Astrand and I. Ryhming, "A nomogram for calculation of aerobic capacity (physical fitness) from pulse rate during submaximal work," Journal of Applied Physiology, vol. 7, no. 2, pp. 218–221, 1954.
[4] M. Altini, P. Casale, J. Penders, and O. Amft, "Cardiorespiratory fitness estimation in free-living using wearable sensors," accepted for publication in Artificial Intelligence in Medicine, 2016.
[5] M. Altini, P. Casale, J. Penders, and O. Amft, "Cardiorespiratory fitness estimation using wearable sensors: laboratory and free-living analysis of context-specific submaximal heart rates," accepted for publication in the Journal of Applied Physiology, 2016.
[6] M. Altini, P. Casale, J. Penders, and O. Amft, "Personalized Cardiorespiratory Fitness and Energy Expenditure Estimation Using Hierarchical Bayesian models," accepted for publication in the Journal of Biomedical Informatics, 2015.
[7] M. R. Esco et al., "Cross-validation of the Polar Fitness Test™ via the Polar F11 heart rate monitor in predicting VO2 max," Journal of Exercise Physiology, vol. 14, pp. 31–37, 2011.
[8] M. Altini and O. Amft, "Relation Between Estimated Cardiorespiratory Fitness and Running Performance in Free-Living: an Analysis of HRV4Training Data," accepted for publication at BHI 2017, 2017.