Introduction
In sports, physical conditioning is crucial to optimizing teams’ and players’ performances [1]. In terms of performance, fewer injuries in a team often lead to a higher rank in the league [2]. On the other hand, a long-term injury places a financial burden on the team, which must support players who have no playtime [3]. Intense activities, such as sprints with acceleration and deceleration [4], changes of direction [5], kicks, and shots [6], are frequently performed. These activities involve eccentric muscle contractions, which cause physical damage [4], [7], [8], [9]. We designed movement feature data to account for the various activities that could injure, for example, muscles, joints, and ankles. When the metabolic cost exceeds recovery, fatigue accumulates [10]. Changes in acceleration result in changes in stride, which increase the metabolic cost [10]. In particular, jerk has been used as an indicator of the efficiency of the metabolic cost [11]. We therefore assumed that speed, acceleration, and jerk are indicators of metabolic cost and fatigue.
During training sessions or matches, the workload has been used as an essential indicator to optimize players’ and teams’ performances [12], [13], [14], [15]. The workload consists of external and internal workloads. The external workload represents the amount of physical activity that can be measured by electronic performance and tracking systems (EPTS) such as optical tracking systems or global positioning systems (GPS). Typically, it includes aggregated metrics such as total running distance, total playing time, and high-speed running distance. Therefore, if players perform the same activity, their external workloads are said to be the same. The internal workload, however, is the psycho-physiological response to the activity and depends on individual differences.
The rating of perceived exertion (RPE) has been increasingly used to measure the internal workload, as it is a simple and non-invasive questionnaire that does not require any specific equipment for measurement [16], [17]. In submaximal-intensity exercise, the RPE is highly correlated with other internal workload indicators such as VO2 max and heart rate [12], [18].
However, the RPE has the following limitations. First, since the RPE is a self-reported survey, there is an individual bias. Second, the RPE should be collected in independent environments to prevent the anchoring effect. Third, the RPE should be collected within a certain period after training or match to prevent memory decay or distortion. To overcome these limitations, we may choose to predict the internal workload such as the RPE of players based on the external workload instead of collecting internal workload directly from players. If one can develop such an accurate RPE predictor, then the predicted RPE would not be affected by the individual bias or anchoring effect as the prediction will be performed solely on objective data. Moreover, the predicted RPE is not influenced by the timing of the data collection.
Several previous works have tried to find a relationship between internal and external workloads in soccer [16], [19], [20]. They confirmed that external workload metrics, such as total distance or number of accelerations, correlated well with the RPE. Furthermore, many recent approaches have attempted to predict the RPE using external workload metrics [14], [21], [22], [23]. In particular, [14] and [21] suggested methods to forecast the RPE from external workload metrics using machine learning algorithms. Unfortunately, the models tend to overestimate when the RPE is low and underestimate when the RPE is high. In these studies, the authors rely on handcrafted features, such as distances covered in certain speed zones or during explosive actions, to predict the RPE. However, such features may lose vital information hidden in the raw data, as they are crafted from prior domain knowledge and simple aggregation methods.
This study proposes a deep learning algorithm to predict the RPE from the raw movement data instead of aggregated features. Inspired by several studies that applied Convolutional Neural Networks (CNN) to sequential data [24], [25], [26], we suggest a deep neural architecture called the FatigueNet, which consists of deep convolutional layers followed by three Gated Recurrent Units (GRU) and takes raw spatio-temporal time-series data as input instead of handcrafted features. Through extensive experiments, we demonstrated that the proposed FatigueNet accurately predicts the RPE using only the movement features.
We also applied gradient-weighted regression activation mapping (GRAD-RAM) to overcome the lack of explainability. By localizing discriminative regions of the input time series using GRAD-RAM, we could retrieve the time intervals that highly affect the predictions. This makes the fatigue monitoring automated, objective, and interpretable enough to be used by real-world coaches.
Preliminaries
A. Subjects
Thirty soccer players (Forwards = 7, Midfielders = 14, Defenders = 9, Age = 25.97 ± 4.39, Height = 181.07 ± 5.54 cm, Weight = 73.20 ± 6.39 kg) of a team in K League 1, which is the first division of the Korean professional soccer league, participated in this study by consenting to give their data during the 2019 season. This study selected them as subjects by inclusion and exclusion criteria where players should be at least 18 years of age, and both GPS data and RPE for each player should be simultaneously collected.
B. Data Acquisition
We attached 10 Hz wearable GPS devices (OhCoach Cell B, developed by Fitogether) to the players’ upper backs during the team’s physical activities. The devices successfully collected latitudes and longitudes from a total of 163 training sessions and 39 matches.
After each activity, the players were asked to record their RPE in a way that avoided the anchoring effect. They answered how strenuous the activity was on a scale of 1 to 10 in a blind survey, without the influence of other teammates.
C. Feature Engineering
In this section, we describe the detailed process of generating features and labels for the FatigueNet. First, the time series of latitudes and longitudes in the raw GPS data are transformed to local x and y coordinates.
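The coordinate transformation is not specified further in the text. As a minimal sketch, assuming an equirectangular projection around a reference point (adequate over the few hundred meters of a pitch), the conversion could look as follows. The function name `to_local_xy`, the choice of the first sample as origin, and the Earth-radius constant are illustrative assumptions, not the authors' implementation.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters (assumed constant)


def to_local_xy(lats, lons, origin=None):
    """Project latitudes/longitudes (degrees) to local x/y coordinates (meters).

    Uses an equirectangular approximation around `origin`. The projection is
    an assumption; the source does not state which one was used.
    """
    if origin is None:
        origin = (lats[0], lons[0])  # hypothetical choice: first GPS sample
    lat0 = math.radians(origin[0])
    lon0 = math.radians(origin[1])
    xs, ys = [], []
    for lat, lon in zip(lats, lons):
        phi, lam = math.radians(lat), math.radians(lon)
        xs.append(EARTH_RADIUS_M * (lam - lon0) * math.cos(lat0))  # east
        ys.append(EARTH_RADIUS_M * (phi - lat0))                   # north
    return xs, ys
```

With this approximation, a displacement of 0.001 degrees of latitude maps to roughly 111 m along the y axis.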
Many studies reported that high-speed sprints, explosive accelerations, and sudden changes in moving direction significantly affect fatigue [4], [7], [8], [9]. As such, we constructed a feature vector by generating three linear features and three angular features. First, we computed the 2D velocity, acceleration, and jerk vectors from the position vector and took their magnitudes as the linear features (Eqs. (1)-(3)); the angular features were obtained by the same successive differencing of the angle (Eq. (4)):
\begin{align*}
\vec{s}(t) &= (s_{x}(t),\ s_{y}(t)) \in \mathbb{R}^{2} \tag{1}\\
\vec{v}(t) &= \frac{\Delta \vec{s}(t)}{\Delta t},\quad \vec{a}(t) = \frac{\Delta \vec{v}(t)}{\Delta t},\quad \vec{j}(t) = \frac{\Delta \vec{a}(t)}{\Delta t} \tag{2}\\
v(t) &= \|\vec{v}(t)\|,\quad a(t) = \|\vec{a}(t)\|,\quad j(t) = \|\vec{j}(t)\| \tag{3}\\
\omega(t) &= \frac{\Delta \theta(t)}{\Delta t},\quad \alpha(t) = \frac{\Delta \omega(t)}{\Delta t},\quad \zeta(t) = \frac{\Delta \alpha(t)}{\Delta t} \tag{4}
\end{align*}
Finally, each feature sequence is paired with the corresponding RPE score between 1 and 10. These pairs are used as the inputs and labels, respectively, of the convolutional architecture introduced below.
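Equations (1)-(4) can be sketched with forward finite differences at the 10 Hz sampling rate. Two points in this sketch are assumptions rather than details from the text: the angle θ(t) is taken to be the heading of the velocity vector, and heading differences are wrapped into [-π, π) before differencing. The function names and the trimming of all sequences to a common length are likewise illustrative.

```python
import math


def diff(seq, dt):
    """Forward finite difference: (x[t+1] - x[t]) / dt."""
    return [(b - a) / dt for a, b in zip(seq, seq[1:])]


def wrap_angle(angle):
    """Wrap an angle difference into [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def magnitude(xs, ys):
    """Euclidean norm of each 2D vector (x, y)."""
    return [math.hypot(x, y) for x, y in zip(xs, ys)]


def movement_features(xs, ys, hz=10.0):
    """Return the three linear and three angular feature sequences."""
    dt = 1.0 / hz
    vx, vy = diff(xs, dt), diff(ys, dt)   # velocity, Eq. (2)
    ax, ay = diff(vx, dt), diff(vy, dt)   # acceleration, Eq. (2)
    jx, jy = diff(ax, dt), diff(ay, dt)   # jerk, Eq. (2)
    # Assumption: theta is the heading of the velocity vector.
    theta = [math.atan2(y, x) for x, y in zip(vx, vy)]
    omega = [wrap_angle(b - a) / dt for a, b in zip(theta, theta[1:])]
    alpha = diff(omega, dt)
    zeta = diff(alpha, dt)                # Eq. (4)
    features = {
        "speed": magnitude(vx, vy),       # Eq. (3)
        "acc": magnitude(ax, ay),
        "jerk": magnitude(jx, jy),
        "omega": omega,
        "alpha": alpha,
        "zeta": zeta,
    }
    # Each differencing step shortens a sequence by one sample, so trim
    # everything to the shortest sequence (zeta) to keep the features aligned.
    n = len(zeta)
    return {name: values[:n] for name, values in features.items()}
```

For a player running in a straight line at constant speed, this yields a constant speed and zero values for the remaining five features.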
Proposed Framework
A. Model Construction
The CNN is a deep learning architecture that embeds matrix-shaped features using weight-sharing convolution filters that slide along the input features. In general, the CNN consists of multiple blocks of convolutional layers followed by activation layers and pooling layers. Each convolutional layer has a receptive field that learns patterns of local pixels. Furthermore, each pooling layer compresses the spatial resolution of a feature map. We also used dropout layers to improve the generalization performance of the FatigueNet. The last part of the model consists of three Gated Recurrent Units (GRU) followed by a fully connected layer. In the proposed FatigueNet, the GRU layers capture the sequential information of the spatio-temporal data after it has been locally compressed by the CNN layers. Table 1 shows the architecture of the FatigueNet.
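To make the roles of the convolution and pooling layers concrete, the following toy sketch applies one weight-sharing filter to a single-channel time series and then compresses the result with max pooling. It is an illustration only: the kernel values and pooling size are hypothetical, and the actual FatigueNet uses learned multi-channel filters with the layer sizes listed in Table 1.

```python
def conv1d(signal, kernel):
    """Valid 1D cross-correlation: the same kernel weights slide along time."""
    k = len(kernel)
    return [
        sum(w * signal[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal) - k + 1)
    ]


def relu(seq):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in seq]


def max_pool1d(seq, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]


def conv_block(signal, kernel, pool):
    """Convolution, activation, then pooling, as in one CNN block."""
    return max_pool1d(relu(conv1d(signal, kernel)), pool)
```

Each block shortens the sequence, so the recurrent layers that follow operate on a much shorter, locally summarized series than the raw 10 Hz input.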
B. Model Validation
To validate the FatigueNet, we used Stratified Cross-validation (Stratified-CV). To deal with the imbalance of the data, Stratified-CV was performed by dividing the whole shuffled data into ten folds. For performance comparison, general machine learning models, namely LightGBM, random forest, gradient boosting, k-NN, linear regression, ridge regression, Bayesian ridge, AdaBoost, decision tree, elastic net, and lasso regression, were fitted to the same dataset. For these models, we used external workload metrics such as total duration, speed zone duration, total distance, speed zone distance, max speed, number of sprints, distance of sprints, number of accelerations, and distance of accelerations, as shown in Table 2. These metrics were selected by domain experts based on [4], [7], [8], [9], [14], [21], [22], and [23]. The hyperparameters of each machine learning algorithm were determined by grid search. For the implementation of the FatigueNet, we employed Python 3.9 and PyTorch 1.8. We chose the Adam optimizer to train the proposed FatigueNet, with MAE as the loss function. The performance of the FatigueNet and of the general machine learning models was evaluated with MAE and RMSE. Each evaluation metric is computed as follows:
\begin{align*}
MAE(y,\hat{y}) &= \frac{1}{N}\sum_{i=1}^{N}\left| y_{i} - \hat{y}_{i} \right| \tag{5}\\
RMSE(y,\hat{y}) &= \sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_{i} - \hat{y}_{i})^{2}} \tag{6}
\end{align*}
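Equations (5) and (6) and the fold construction can be sketched in plain Python. The metric functions follow the equations directly; the `stratified_folds` helper is a hypothetical stand-in for a library routine and assumes that stratification is done on the integer RPE label, which the text does not state explicitly.

```python
import math
import random
from collections import defaultdict


def mae(y_true, y_pred):
    """Mean absolute error, Eq. (5)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, Eq. (6)."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign sample indices to folds so each fold has a similar label mix.

    Assumption: the stratification variable is the integer RPE score.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    offset = 0
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Deal the samples of this label round-robin across the folds;
        # the running offset spreads leftovers evenly between labels.
        for i, idx in enumerate(indices):
            folds[(offset + i) % n_folds].append(idx)
        offset += len(indices)
    return folds
```

Each fold serves once as the validation set while the remaining nine are used for training, and the reported MAE and RMSE are the mean and standard deviation over the ten folds.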
C. Fatigue Visualization
To visualize the basis of the model's predictions, we utilized gradient-weighted regression activation mapping (GRAD-RAM), which is inspired by gradient-weighted class activation mapping (GRAD-CAM) [27] and RAM [28]. Our GRAD-RAM reconstructs the backward gradient flow. From the backward gradient flow in the GRAD-RAM, we can estimate the regions where fatigue accumulated. With the GRAD-RAM, we can infer why the player reported a given RPE score after the session.
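The exact GRAD-RAM formulation is not given in the text. As a hedged sketch of the general idea, the following follows the GRAD-CAM recipe adapted to a scalar regression output: each channel of a convolution block is weighted by the time-averaged gradient of the prediction with respect to that channel, and the weighted channels are summed over time steps. The rectification and the peak normalization are assumptions carried over from GRAD-CAM, and the inputs are assumed to have been extracted from the network beforehand.

```python
def grad_ram(activations, gradients):
    """Gradient-weighted activation map over time for one convolution block.

    activations[k][t]: output of channel k at time step t.
    gradients[k][t]:   d(predicted RPE) / d(activations[k][t]).
    Returns a heatmap over time steps, scaled to [0, 1] when possible.
    """
    n_steps = len(activations[0])
    # Channel importance: gradient averaged over the time axis.
    weights = [sum(g) / len(g) for g in gradients]
    heat = [
        sum(w * channel[t] for w, channel in zip(weights, activations))
        for t in range(n_steps)
    ]
    # Assumption (GRAD-CAM style): keep only positive contributions.
    heat = [max(0.0, h) for h in heat]
    peak = max(heat)
    return [h / peak for h in heat] if peak > 0 else heat
```

Time steps with values close to 1 mark the intervals that contributed most to the predicted score for that block.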
Experimental Results
A. Data Description
A total of 3,400 RPE and GPS pairs were analyzed (Mean ± SD; RPE = 5.28 ± 2.36). Fig. 2 shows the distribution of the RPE. Sessions with extremely high (9-10) or low (1-2) RPE were fewer than sessions with medium (3-8) RPE. The imbalance of training and match sessions caused a positively skewed distribution (skewness = 0.37). In general, match sessions recorded higher RPE scores than training sessions. Matches took place once a week and training sessions four or five times a week, so the ratio was about 1 to 4. In this study, we did not consider the type of session.
B. Comparison Results
Using ten-fold Stratified-CV, the FatigueNet recorded MAE = 0.8494 ± 0.0557 and RMSE = 1.2166 ± 0.0737. The general machine learning models, namely LightGBM, random forest, gradient boosting, k-NN, linear regression, ridge regression, Bayesian ridge, AdaBoost, decision tree, elastic net, and lasso regression, were compared to our FatigueNet. The machine learning models and the FatigueNet used the same dataset, and the FatigueNet outperformed the machine learning models overall, as shown in Table 3. Furthermore, we performed a series of ablation studies to verify the suitability of the movement features for predicting the RPE. Two models with selected feature subsets were tested. The FatigueNet-C was trained with players’ coordinates. The FatigueNet-F was trained with coordinates and movement features. The FatigueNet trained only with movement features outperformed both the FatigueNet-C and the FatigueNet-F. So, we could choose linear and angular movements,
C. Explainability of the FatigueNet
With the GRAD-RAM, we computed the basis of the model's predictions from the gradient flow. A total of six maps were computed since the FatigueNet consists of six blocks of convolutional layers. For instance, in Table 1, Block 1 is the first convolution block of the FatigueNet, and Block 6 is the last. With this hierarchical design, the model represents the latent patterns of the features hierarchically. Reference [29] showed that the lower layers represent low-level concrete characteristics and the higher layers capture high-level abstract characteristics. Similarly, the FatigueNet represents hierarchical feature characteristics. Fig 3 shows the heatmap of the GRAD-RAM computed from the FatigueNet. Additionally, linear and angular movements,
Discussion
The RPE is a psycho-physiological response that combines psychological and physiological factors [30]. For this reason, it is impossible to predict the RPE perfectly from physiological factors alone (here, the movement feature data), or from psychological factors alone. Fig 4 shows the error for each RPE score. The fold 2 performance of the FatigueNet in Stratified-CV, MAE = 0.7794, is shown as a dotted line. Fig 5 shows the confusion matrix of fold 2 in Stratified-CV. In Table 1, the last layer of Block 7 in our FatigueNet is a fully connected layer, so the model performs regression, for which a confusion matrix is not normally appropriate. However, considering the discreteness of the RPE, the confusion matrix was drawn after rounding the predicted values. In Fig 4 and 5, both extremes show relatively more errors than the medium range of the RPE. In particular, the FatigueNet makes the most errors in the higher range of the RPE, such as 8 and 9. This discrepancy can be induced by psychological factors. Reference [14] found that the model underestimated the RPE on the game date when the RPE was higher than 7. However, four days before the game, the model overestimated when the RPE was lower than 4 [21]. In other words, on the event day, players reported more fatigue than the model predicted, whereas four days before the game, they reported less fatigue than the model predicted. Game-induced psychological factors such as anxiety and tension would affect the RPE and fatigue. Using the integrated sessions, which are not labeled with a personal ID or a match or training session ID, the FatigueNet learns latent patterns and the relationship between physical activities and fatigue. By doing so, the FatigueNet predicts objectified fatigue rather than personal bias and daily mood. The objectified fatigue computed from spatio-temporal data using the FatigueNet can be used to manage team and individual physical and psychological conditioning.
If a player repeatedly reports a higher RPE than the value predicted by the FatigueNet, it could signify that the player's condition is not normal.
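The rounding step behind the confusion matrix in Fig 5 can be sketched as follows. The sketch assumes that predictions are rounded to the nearest integer (halves rounded up) and clipped to the 1 to 10 RPE scale; the exact rounding rule and the function name are assumptions.

```python
import math


def rpe_confusion_matrix(y_true, y_pred, low=1, high=10):
    """Count (reported RPE, rounded predicted RPE) pairs.

    Rows index the reported score and columns the rounded prediction.
    Assumption: round to the nearest integer, then clip to [low, high].
    """
    size = high - low + 1
    matrix = [[0] * size for _ in range(size)]
    for reported, predicted in zip(y_true, y_pred):
        rounded = int(math.floor(predicted + 0.5))
        rounded = min(high, max(low, rounded))
        matrix[reported - low][rounded - low] += 1
    return matrix
```

Entries far from the diagonal in the rows for high reported scores correspond to the underestimation at RPE 8 and 9 discussed above.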
The effect of individual differences on the RPE cannot be ignored, since each player has a different physical composition, recovery ability, and endurance capacity. However, the FatigueNet ignored personal characteristics and investigated the general relationship between the movement feature data and the RPE. In a future study, we plan to develop an individualized version of the FatigueNet. The individualized FatigueNet would predict the RPE using movement features, personal body profiles, playing style, and so on. Furthermore, the histories of players’ movements and the context of matches could be analyzed for more accurate predictions. We suppose that the RPE is a cumulative response to personal movement history rather than a spot response to a single session. An improved RPE prediction study used personal characteristics, contextual information about teams’ match results, and weekly or monthly players’ movement histories [21]. Its prediction results were slightly better than those of a study that analyzed a single session of external workload [14]. Finally, our FatigueNet could be expanded into an injury prediction model. To the best of our knowledge, few studies have been conducted to predict sports-related factors [15]. However, recent sports science has demonstrated that fatigue and injury are closely related [31], [32], [33], [34]. Based on the relationship between fatigue and injury, our FatigueNet has the potential to predict injury from players’ movement feature histories and individual properties. We could also apply a semi-supervised learning approach to previously collected spatio-temporal data, building on an effective fatigue prediction model, which we expect would improve prediction accuracy. In our future study, we envision building a semi-supervised learning pipeline to reinforce our FatigueNet.
Conclusion
In sports, fatigue monitoring is fundamental to maximizing performance and minimizing injury probability [31], [32]. In general, fatigue is measured by aggregated external workload and internal workload. The external workload is the amount of physical movement recorded using the EPTS. In contrast, the internal workload is related to the individual response to physical activity. Internal workload measurement based on heart rate or VO2 max is more complex than the self-reported RPE measurement. The RPE is a simple and low-cost method of collecting the internal workload, which makes it a common way to acquire it.
Despite its advantages, the RPE has limitations in the data collection procedure, such as individual bias and the need for an independent environment and a fixed collection time window. The critical weakness of the RPE is a psychological factor that could be latent. However, various papers demonstrated that the RPE is interrelated with external workload and other internal workload measurements. Several studies tried to predict the RPE from external workloads to minimize the effort of gathering the RPE data. In this way, an automatically generated RPE would be a more practical approach to monitoring players’ fatigue. These studies used aggregated features to predict the RPE. However, aggregated external workload loses details such as sequential order or low-level patterns. Therefore, instead of aggregated external workload, we suggested movement feature data with minimally processed time sequences. As a result, our FatigueNet effectively predicted the RPE with MAE = 0.8494 ± 0.0557 and RMSE = 1.2166 ± 0.0737. With the GRAD-RAM, we could also generate a fatigue report that visualizes the times of interest affecting fatigue accumulation and the discriminative regions that drive the output. With the GRAD-RAM, fatigue analysis could be extended to locomotion-level or event-level investigation of, for example, long-lasting sprints, frequent jogging, and rapid changes of direction. With the hierarchical design of the FatigueNet, the user can determine the essential characteristics the model should focus on.
Previous studies used aggregated external workload metrics to predict the RPE. Some of these handcrafted features, such as player load or fatigue index, were provided by a particular EPTS vendor; they were exclusive and could not be freely accessed [35], [36]. Additionally, handcrafted features lose detailed low-level information and sequential order. Our proposed FatigueNet, based on minimally processed GPS data, has the following advantages. First, the FatigueNet uses 2-dimensional data, which makes the feature data easy to generate and non-exclusive. Second, the FatigueNet predicts effectively with a small number of features. Third, the FatigueNet can retain the sequential order of the raw GPS data. Finally, the FatigueNet can maintain the detailed patterns of the low-level features. We conclude that the FatigueNet outperformed the other machine learning models.
ACKNOWLEDGMENT
The authors would like to thank Jaehong Lee, a physical coach at Korea National U-23 Soccer Team, for data acquisition and management. He let the authors know about the importance of the RPE for monitoring soccer players so that they were able to set a goal, the fatigue prediction, and start their work. Also, he encouraged players to respond to the survey while putting emphasis on predicting the RPE. They would also like to express appreciation to Junggi Hong, a Professor at CHA University, for his advice in sports medicine. He gave instructions for analyzing the relationships between fatigue and external and internal workloads, and also helped them to derive movement features. Thanks to his insights about various activities giving rise to players’ injuries, they were able to successfully propose a fatigue prediction model.