A Novel Neural Calculation Model for Evaluating the Performance of Sanda Athletes
  • Lei Huang1, Xinao Li2, and Guixian Wang1,*

Human-centric Computing and Information Sciences volume 12, Article number: 56 (2022)
https://doi.org/10.22967/HCIS.2022.12.056

Abstract

The application of machine learning to Sanda could have enormous benefits. This cross-disciplinary study applies machine learning to post-match scoring by combining comprehensive evaluation with machine learning. Building a scoring model is a critical step in the post-match scoring of athletes: a good scoring model improves the accuracy and reliability of scoring, and thus its practical value. This paper proposes a regression-based scoring model. The model is fitted to expert scores using athlete statistics extracted from a training set; after training, it is tested on the athlete statistics in the testing set and compared with the expert data. The paper also combines clustering and regression: athletes are first grouped by type, and a separate regression model is then trained for each group using only the data of that group. The combined use of clustering and regression improves performance. This work also uses the athlete scores to predict the outcome of a competition. The basic scoring model obtains a root mean square error (RMSE) of 1.45, 1.35, and 1.23 on three subsets, and is less time-consuming than other methods.


Keywords

Comprehensive Evaluation, Metric Learning, Athlete Scoring, Competition Prediction


Introduction

The use of machine learning to study the post-match scores of Sanda athletes is a cross-disciplinary study combining comprehensive evaluation and machine learning. Generally speaking, various evaluation and scoring problems are collectively referred to as comprehensive evaluation problems. In daily life, people often rely on their own experience or cognition to evaluate all manner of things, which in turn subtly influences their future choices. Comprehensive evaluation methods are becoming increasingly common, and their application scenarios are attracting ever more attention. Methods of evaluation have gradually developed from simple subjective evaluations and expert evaluations into objective evaluations based on objective data, and into combinations of subjective and objective evaluations. In terms of application scenarios, with the rise of the online-to-offline (O2O) model and big data, a complete and objective comprehensive evaluation system that guides users to compare and consume can be said to be the core competitiveness of this type of enterprise. Since many factors usually affect the evaluation of a thing, relying on a single index yields only a one-sided evaluation. Therefore, it is necessary to refine and aggregate the diverse factors that affect the evaluation in order to form a comprehensive index system. Such a method of obtaining a comprehensive evaluation is referred to as a multi-index comprehensive evaluation method, or simply a comprehensive evaluation [1-5]. Comprehensive evaluation has always received attention from various related fields. After decades of development, dozens of comprehensive evaluation methods suited to different scenarios are now available.
With the advance of computer technology and data processing capabilities, increasingly complex models and algorithms are being used in comprehensive evaluations, particularly machine learning algorithms [6-14]. Sanda (a martial art), as a mainstream sport, has always had a considerable influence on the global economy and on daily life. With the ongoing commercialization and professionalization of Sanda, its economic value has become increasingly important. With the increasing attention paid to Sanda in recent years, the popularity of the sport worldwide has rapidly increased, and it has also produced great economic value. In the field of Sanda, the post-match scoring of athletes has always been indispensable, and scoring plays a role in the following aspects. First, in terms of promotion, post-match scoring is very important in Sanda. Fans often refer to athletes' scores when discussing a match afterwards, and both the media and the general public evaluate the athletes in a game, often on the basis of media ratings. Second, in terms of athlete management, post-match scoring can be used to manage athletes, for example by taking athletes' scores as the measure of their performance in a reward and punishment system; in addition, management teams can use this index to evaluate athletes' status. Third, post-match scoring can guide management's decision-making, including targeted training, contract negotiations, etc. Fourth, in terms of competition prediction, athletes' post-match scores can be used to evaluate their current state, and can also serve as the main factor in predicting the outcome of a game or match. This work combines machine learning with athletes' post-match scoring, and is carried out from two aspects: construction of a scoring model and application of the scoring.
First, in terms of model construction, a fitting model based on a regression algorithm is proposed. It uses the statistical data in the training set to fit expert scores, and the statistics of the athletes in the test set are then used to test the model and compare its output with the expert data. The experimental results show that the scores generated by this model are closer to the expert scores than those of mainstream machine scoring models. In addition, this study optimizes the general regression model and proposes the idea of combining clustering and regression: athletes are first clustered according to their types, and a regression model is then trained for each category on the data of athletes of that type, which improves the effect of the model. In terms of scoring application, this study mainly introduces the use of Sanda athletes' post-match scores in match prediction, and suggests using athletes' scores to predict the outcome of a game or match. The contributions of this work are summarized as follows. (1) In view of the large gap between mainstream machine scoring models and expert scoring, this study proposes a scoring model that uses regression methods to fit expert scoring, which narrows the gap between the computer scoring model and the expert scoring. The common regression model is optimized, and the idea of fusing clustering and regression is designed. (2) In order to overcome the shortcomings of the win-loss relationship in predicting the outcome of a game, this study proposes the idea of using the score to predict the outcome, and includes the associated experiments. The experimental results are superior to the predictions obtained using the win-loss relationship.


Related Work

There currently exists a wide range of comprehensive evaluation methods, which can be combined or crossed to create new methods. Considered individually, these methods fall into the following five categories: evaluation methods based on gray theory; evaluation methods based on fuzzy mathematics and rough set theory; evaluation methods using data envelopment analysis; evaluation methods using structural equation models; and evaluation methods using machine learning [15-21]. With the advancement of computer technology, comprehensive evaluation based on machine learning has emerged. For large datasets with a large number of indicators, this method takes advantage of the computer's processing ability, which gives it a clear advantage over other approaches. Furthermore, the machine learning method does not require prior expertise. Data can be analyzed and mined more objectively, and the implicit connections in the data can be found by learning from the data. Sun et al. [22] used support vector machines to assess the overall competitiveness of power plants. Wang et al. [23] employed regression models for a systematic evaluation of land quality. Neural networks and deep learning have long played a leading role in comprehensive evaluation, and this trend has continued. Yan et al. [24] evaluated the effectiveness of business intelligence systems using backpropagation (BP) neural networks and compared them with traditional methods, which demonstrated the soundness and effectiveness of BP neural networks. Wang et al. [25] studied image aesthetics evaluation with the help of deep learning: they used a convolutional neural network to automate feature extraction, and verified their findings against the traditional manual feature extraction method. In recent years, as information technology has advanced, sports analysis and prediction research have both benefited from applications of machine learning.
The amount of data generated during training and management is enormous, especially in sports like football and basketball. The collection of various types of data has become richer due to the development of data collection technologies, and professional teams and institutions are also paying increasing attention to data. Several professional leagues and teams have chosen to work together to promote and analyze themselves using comprehensive sports data collected by professional sports data companies. Sports data can greatly assist a team in such areas as recruitment, training, and management. Data mining and analysis using machine learning methods has become increasingly popular and effective in recent years. Indeed, methods based on machine learning are faster, more objective, and better able to analyze large amounts of data. To help a sports team in a scientific way, data analysts use professional techniques to quantify intuitive technical details and shooting habits that are otherwise hard to measure. As a result, some NBA (National Basketball Association) teams have incorporated the findings of such analyses into their practices. Following AlphaGo's sensational results in the game of Go, more machine learning experts and academics have turned their attention to sports. With the goal of enhancing the impact of sports analysis research, SIGKDD, the top data mining conference, has held large-scale seminars on sports analysis with data mining and machine learning techniques for three consecutive years. In addition, Zimmermann et al. [26] used classifiers to predict NBA games, and concluded that features are often more important than models. Hamilton et al. [27] used common classification methods to analyze batting behavior in baseball and to help a team devise the corresponding batting strategies. Sinha et al. [28] studied the connection between social media output and NFL (National Football League) games through text mining, and combined it with game statistics to predict the outcomes of future games. Whereas traditional models rely only on game statistics, this work puts forward the idea of mining off-field factors, which is very enlightening regarding the prediction of game outcomes.


Realization and Improvement of the Post-match Scoring Model

Basic Scoring Model
The flow of the basic scoring network (SC) is illustrated in Fig. 1. After completing the data, it is necessary to select an appropriate method of optimizing the model in order to ensure its effectiveness.

Fig. 1. Basic scoring model flow chart.


The BP neural network's nonlinear mapping ability, self-learning capability, adaptability, generalization capability, and fault tolerance make it an excellent choice for machine learning, as it obtains good results across different scenarios and data. The BP neural network is therefore chosen first as the machine learning method for generating the scoring model, which is then tested and improved. The proposed BP neural network is illustrated in Fig. 2.

Fig. 2. Structure of the BP neural network.


The various characteristics of athletes are represented by the seventy nodes in the input layer, while the output layer has one node, and the hidden layer has an additional eight nodes, which can be formulated as follows:

(1)


The number of nodes in the input layer is denoted by m and the number in the output layer by n, while a is a constant. The learning rate α is set to 0.01, and the maximum number of training epochs is 1000. Owing to its multilayer perceptron structure and nonlinear optimization capability, the BP neural network can approximate any nonlinear function to high accuracy. In practical applications, however, it also has shortcomings that need to be addressed: (i) Many training iterations are required, and convergence is slow. Even a common problem takes a lot of training to converge, and more complex problems take longer still. (ii) Training easily falls into a local optimum, so the global optimum cannot be guaranteed. The BP learning algorithm updates the weights by gradient descent, and the error surface over the connection weight space is a hypersurface with multiple minima. Descending along the slope of the error function from different starting points may therefore lead to different local minima rather than to the optimal solution.
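The gradient-descent training described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the sigmoid hidden activation, linear output, squared-error loss, per-sample updates, and random initialization are all assumptions, and `train_bp` is a hypothetical name. The defaults follow the values stated in the text (8 hidden nodes, learning rate 0.01, 1000 epochs); the input size is taken from the data, so a 70-feature input would give the 70-8-1 layout.

```python
import math
import random

def train_bp(samples, n_hidden=8, lr=0.01, epochs=1000, seed=0):
    """samples: list of (features, target). Returns (predict, loss_history)."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Assumed initialization: small uniform random weights, zero biases
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0

    def forward(x):
        # Sigmoid hidden layer followed by a single linear output node
        h = [1 / (1 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
             for row, b in zip(w1, b1)]
        return h, sum(w * v for w, v in zip(w2, h)) + b2

    history = []
    for _ in range(epochs):
        total = 0.0
        for x, t in samples:
            h, y = forward(x)
            err = y - t
            total += 0.5 * err * err
            for j in range(n_hidden):
                # Hidden delta is computed with the pre-update output weight
                delta = err * w2[j] * h[j] * (1 - h[j])
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * delta * x[i]
                b1[j] -= lr * delta
            b2 -= lr * err
        history.append(total)  # summed squared-error loss for this epoch
    return (lambda x: forward(x)[1]), history
```

Plotting `history` for a real dataset would show the slow convergence mentioned in point (i); different `seed` values start the descent from different points of the error surface, which is the situation described in point (ii).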
To address these issues, this paper uses the Levenberg-Marquardt (LM) algorithm to optimize the network. In Newton's algorithm, when the Hessian matrix is not positive-definite, the Newton direction may lead toward a local maximum or a saddle point rather than a minimum. To make the Hessian matrix positive-definite, a positive-definite matrix can be added to it. The LM algorithm combines gradient descent and the Gauss-Newton method in this way. Compared with gradient descent, the LM algorithm converges much faster and is more stable. In the BP network, the loss is calculated as follows:

(2)


With backpropagation, the new updated weight is:

(3)


For Newton’s algorithm, the amount of change is calculated as follows:

(4)


To make all the Hessian matrices invertible, it is necessary to approximate them:

(5)


where J(x) is the Jacobian matrix.
The LM algorithm can improve the Gauss-Newton method, and the improved weight and threshold adjustment rule is as follows:

(6)
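As an illustration of the LM idea described in this subsection, the sketch below applies the textbook LM step to a small curve-fitting problem instead of a neural network. It assumes the usual form of the step, $(J^T J + \mu I)\,\Delta x = -J^T e$, where $e$ is the residual vector and $\mu$ is a damping factor; the symbols $e$ and $\mu$, the function name `lm_fit`, the toy model $y = a\,e^{bx}$, and the rule of dividing or multiplying $\mu$ by 10 are assumptions made for this example, not details taken from the paper.

```python
import math

def lm_fit(xs, ys, params, mu=1e-2, iters=100):
    """Fit y = a * exp(b * x) with Levenberg-Marquardt (two parameters)."""
    a, b = params

    def residuals(a, b):
        return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]

    def sse(r):
        return sum(e * e for e in r)

    r = residuals(a, b)
    for _ in range(iters):
        # Jacobian rows: (d e_i / d a, d e_i / d b)
        J = [(math.exp(b * x), a * x * math.exp(b * x)) for x in xs]
        # Entries of J^T J (2x2, symmetric) and of J^T e (2-vector)
        g00 = sum(ja * ja for ja, _ in J)
        g01 = sum(ja * jb for ja, jb in J)
        g11 = sum(jb * jb for _, jb in J)
        h0 = sum(ja * e for (ja, _), e in zip(J, r))
        h1 = sum(jb * e for (_, jb), e in zip(J, r))
        # Solve (J^T J + mu I) delta = -J^T e with the 2x2 closed-form inverse
        m00, m11 = g00 + mu, g11 + mu
        det = m00 * m11 - g01 * g01
        da = -(m11 * h0 - g01 * h1) / det
        db = -(-g01 * h0 + m00 * h1) / det
        r_new = residuals(a + da, b + db)
        if sse(r_new) < sse(r):
            # Step accepted: move and trust the Gauss-Newton model more
            a, b, r = a + da, b + db, r_new
            mu /= 10
        else:
            # Step rejected: increase damping, i.e. behave more like gradient descent
            mu *= 10
    return a, b
```

A small $\mu$ makes the step close to the Gauss-Newton step, while a large $\mu$ shrinks it toward a short gradient-descent step, which is the sense in which LM blends the two methods.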



Divide and Regression Model
Given that there are different categories of Sanda moves, such as boxing, legging, and defense, the different types of movements of the same Sanda athletes lead to a certain difference in the importance of their characteristics, especially those for whom there is a large gap between their offensive and defensive abilities. For example, Sanda athletes with a strong offensive ability should have stronger boxing and legging skills, whereas Sanda athletes with a strong defensive ability should be more accomplished in defense. In the ordinary BP neural network, however, all categories are treated equally, which easily leads to a large deviation in the scoring system. Therefore, this study tries to divide the training set according to the different action categories, and performs modeling training respectively, and then divides the test set according to the different action categories, and brings them into different models for testing. This improved method is called the “division and regression” (DR) model. The flowchart of the model comprising division and then regression is shown in Fig. 3.
First, the full dataset after matching is divided according to the action classification labels, so that each action label has its own dataset, such as the boxing, legging, defense, and other datasets. Then, a training set and a testing set are split off for each action, and the regression algorithm is used to build a model for each. Finally, the prediction scores of each model on its test set are combined to calculate the final score.

Fig. 3. DR model.
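The divide-then-regress flow can be sketched as below. This is a simplified illustration under stated assumptions: each sample carries a single numeric feature, the per-action model is an ordinary least-squares line, and the names `fit_line`, `train_dr`, and `predict_dr` are hypothetical. In the paper the per-action regressor is the BP-based scoring model, not a straight line.

```python
from collections import defaultdict

def fit_line(samples):
    """Least-squares fit of score = w * x + b for one feature x."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    w = sxy / sxx if sxx else 0.0
    return w, mean_y - w * mean_x

def train_dr(train_rows):
    """train_rows: iterable of (action_label, feature, expert_score)."""
    groups = defaultdict(list)
    for action, x, y in train_rows:
        groups[action].append((x, y))
    # One regression model per action category
    return {action: fit_line(samples) for action, samples in groups.items()}

def predict_dr(models, test_rows):
    """Route each (action_label, feature) row to the model of its action."""
    preds = []
    for action, x in test_rows:
        w, b = models[action]
        preds.append(w * x + b)
    return preds
```

The list returned by `predict_dr` corresponds to the combined predictions of all per-action models, from which an overall error such as the RMSE can then be computed.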


Divided Feature Selection and Regression Model
is actually inferior to the effect of not performing feature selection. After the data are divided by action, athletes performing the same action are more directly comparable, so the importance of each feature is relatively stable, which indicates that selecting features after the division by action is a reasonable approach. Through feature selection, each feature can be tested, the correlation between each feature and the response variable can be measured, and the poor features can be discarded according to their scores.
Considering that the input data and output data of athletes’ scores are close to a linear relationship, the linear Pearson correlation coefficient is used as the method of feature selection [29]. As a simple method of selecting features, Pearson correlation coefficient can reflect the connection from univariate features to response variables. It is used to measure the linear correlation. The range of values is from -1 to 1. Generally, the absolute value of Pearson coefficient is used to reflect the correlation between univariate characteristics and response variables. Absolute values near 1 have a higher correlation, whereas values near 0 have a lower correlation. The Pearson coefficient is fast and entails a small amount of calculation, and is generally executed before model training, as follows:

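$$\rho_{A,B} = \frac{E(AB) - E(A)E(B)}{\sqrt{E(A^{2}) - E^{2}(A)}\,\sqrt{E(B^{2}) - E^{2}(B)}} \qquad (7)$$

where A and B are the two variables for which the Pearson coefficient is calculated, and E denotes the mathematical expectation.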
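A Pearson-based feature filter of the kind described here can be sketched as follows. The expectation form of the coefficient is used, with sample means standing in for expectations. The cut-off value `threshold=0.3` and the names `pearson` and `select_features` are hypothetical; the text does not state which threshold the authors applied.

```python
import math

def pearson(a, b):
    """Pearson coefficient via sample means: (E[AB] - E[A]E[B]) / (std_A * std_B)."""
    n = len(a)
    e_a, e_b = sum(a) / n, sum(b) / n
    e_ab = sum(x * y for x, y in zip(a, b)) / n
    var_a = sum(x * x for x in a) / n - e_a ** 2
    var_b = sum(y * y for y in b) / n - e_b ** 2
    if var_a <= 0 or var_b <= 0:
        return 0.0  # constant column: treated as uncorrelated
    return (e_ab - e_a * e_b) / math.sqrt(var_a * var_b)

def select_features(columns, scores, threshold=0.3):
    """columns: dict mapping feature name -> list of values.
    Keeps the features whose |r| with the expert scores reaches the threshold."""
    return [name for name, col in columns.items()
            if abs(pearson(col, scores)) >= threshold]
```

Because the filter looks at one feature at a time, it runs before model training and costs only one pass over the data per feature.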
Based on the model of division followed by regression, the Pearson coefficient is used for feature selection on the training set of each action division, and ridge regression is then used for training. The result is called the "divided feature selection and regression" (DFSR) model. Ridge regression mitigates over-fitting by incorporating the squared L2 norm ‖W‖^2 into the loss function of least squares regression. By giving up the unbiasedness of least squares regression, it improves the model's generalization capability and reduces over-fitting. The loss function of ridge regression can be formulated as follows:

(8)


where the last term is the L2 norm, $θ_j$ is the $j$-th element of the parameter vector, and λ is the regularization coefficient. The calculation of the parameter vector is illustrated in the following formula:

(9)
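The ridge solution can be illustrated with a short sketch. It assumes the unnormalized loss $\sum_i (y_i - θ^T x_i)^2 + λ\sum_j θ_j^2$ with no intercept term (the scaling of the loss is an assumption); setting its gradient to zero gives the standard closed form $θ = (X^T X + λI)^{-1} X^T y$. The helper names `solve` and `ridge_fit` are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ridge_fit(X, y, lam):
    """Closed-form ridge solution theta = (X^T X + lam I)^{-1} X^T y."""
    d = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(d)]
    return solve(xtx, xty)
```

With `lam = 0` this reduces to ordinary least squares; increasing `lam` shrinks the coefficients toward zero, which is how the penalty trades a little bias for better generalization.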


The divided feature selection and regression model is illustrated in Fig. 4. Compared with the DR model in the previous section, a feature selection step is applied to the training set before regression.

Fig. 4. Divided feature selection and regression model.


Clustering and Regression Model
In the previous division by athletes' movements, two problems were encountered. The first is that athletes assigned to the same action can differ in their functions and characteristics. The second is that some athletes assigned to different actions have the same functions and characteristics in Sanda. As a result, the division step is replaced by clustering, and the resulting model uses clustering and regression together (CR). Clustering gathers similar athlete data into groups, and a model is trained on each group. In other words, instead of creating the groups through a division process, the CR model creates them through a clustering process. Fig. 5 depicts the clustering and regression model.

Fig. 5. Clustering and regression model.
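The cluster-then-regress flow can be sketched as follows. Several choices here are assumptions made only for illustration: k-means with Euclidean distance is used as the clustering algorithm, the first k points seed the centroids, and the default per-cluster model `fit_mean` is a placeholder that predicts the cluster's mean expert score. Any regressor with the same interface could be passed as `fit`; all function names are hypothetical.

```python
import math

def kmeans(points, k, iters=20):
    """Plain k-means; the first k points are used as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

def fit_mean(points, scores):
    """Placeholder regressor: always predicts the mean score of its cluster."""
    mean = sum(scores) / len(scores)
    return lambda point: mean

def train_cr(points, scores, k, fit=fit_mean):
    """Cluster the athletes, then fit one model per non-empty cluster."""
    centroids, labels = kmeans(points, k)
    models = {}
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if idx:
            models[c] = fit([points[i] for i in idx], [scores[i] for i in idx])
    return centroids, models

def predict_cr(centroids, models, point):
    """Score a new athlete with the model of the nearest trained cluster."""
    c = min(models, key=lambda i: math.dist(point, centroids[i]))
    return models[c](point)
```

Compared with the division-based variant, the group of each athlete is decided by the data itself rather than by a fixed action label.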


Although there is a variety of clustering algorithms, two of the most widely used and fundamental similarity metrics are Euclidean distance and cosine similarity. The Euclidean distance is the simplest and most widely used metric for determining how similar two things are: the similarity between two points is described by the distance between them in Euclidean space. Cosine similarity, from which the so-called "cosine distance" is derived, instead measures the angle between two vectors. To find the degree of similarity between two individuals in a given space, the cosine of the angle between their feature vectors is calculated, and that value indicates how much the directions of the vectors differ.
The formula for calculating the Euclidean distance and cosine similarity is as follows:

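$$d(x_i, x_j) = \sqrt{\sum_{k=1}^{n} \left(x_{ik} - x_{jk}\right)^{2}} \qquad (10)$$

$$\cos(x_i, x_j) = \frac{\sum_{k=1}^{n} x_{ik}\, x_{jk}}{\sqrt{\sum_{k=1}^{n} x_{ik}^{2}}\,\sqrt{\sum_{k=1}^{n} x_{jk}^{2}}} \qquad (11)$$

where $x_{ik}$ is the $k$-th component of the feature vector of individual $i$, and $n$ is the number of features.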
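The two metrics can be compared on a small example. The sketch below uses the standard definitions of Euclidean distance and cosine similarity; the function names are chosen for illustration.

```python
import math

def euclidean(u, v):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (both must be non-zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm
```

For `u = [1, 2]` and `v = [2, 4]`, the vectors point in the same direction but differ in magnitude: the cosine similarity is 1 while the Euclidean distance is the square root of 5. This is the sense in which cosine similarity ignores absolute values and Euclidean distance does not.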
The calculation and measurement characteristics of Euclidean distance and cosine similarity differ, making them suitable for different data analysis models. The Euclidean distance reflects the accumulated numerical differences between individuals, that is, the absolute differences in their feature values, so it is mainly used in analyses that need to reflect differences in the magnitude of feature values. Cosine similarity, by contrast, is not sensitive to absolute values, so it is well suited to analyses that distinguish individuals by the distribution of their features. To improve the clustering and avoid the influence of unimportant features when the feature dimension is too high, feature selection is necessary before clustering. Because the purpose of clustering is to group athletes of different types according to their prominent features, irrelevant features and features of low importance are removed first.


Prediction of Match Results Based on Athlete Rating

The scores of Sanda athletes are closely related to their performance in a game. After obtaining the accurate scores of athletes, they can be used in many areas of Sanda sports. For example, the score of an athlete reflects his or her ability and physical state to a certain extent, so an analysis of changes in an athlete’s score over a given period of time can reflect fluctuations in the athlete’s state or ability, which in turn affect the athlete’s worth. Therefore, this can be used to predict the value of athletes. In addition, based on the scores of athletes, the underlying trend of changes in athletes’ strength and status can also be analyzed in order to predict the results of a competition. This chapter mainly introduces the application of athlete scoring to the prediction of competition results. Using athlete scores to predict game results is more accurate than using historical records. Sometimes, an athlete’s victory or defeat does not depend on their strength. A player who performs well can nonetheless lose due to bad luck or refereeing, and victory or defeat in a game does not fully reflect an athlete’s strength.

SportsNetRank
In the field of sports, research on the prediction of competition results is in full swing in other countries. Using networks to evaluate the strength of athletes or predict the results of competitions has become a popular method in recent years. Park and Newman [30] proposed a network-based team ranking system to predict the rankings of American college football. Govan and Meyer [31] proposed the use of Google's PageRank algorithm to analyze the team rankings of the NFL. Chen et al. [32] proposed a method that uses context to analyze results and the performance of opponents in sports. At the 2016 KDD Large-Scale Sports Analytics conference, Pelechrinis et al. [33] proposed SportsNetRank, a network built with an improved PageRank method, and used it to rank NFL teams and predict match results. SportsNetRank adds the concept of a random walk to PageRank. Upsets in a game are consistent with the definition of a random walk, so a model built on random walks is more in line with the actual situation.
In this paper, the Sanda League is taken as an example to generate the SportsNetRank network. A segment of the network generated in this study is shown in Fig. 6. To predict the n-th round, all the results from the first round to the (n-1)-th round are used to establish the network. A directed weighted network is built from the result of each game between the Sanda players. If player A loses to player B in a certain game, a directed edge from A to B is added to the network, and its weight is the absolute value of the score difference. A loss by Athlete A to Athlete B means that Athlete A is weaker than Athlete B, so Athlete A contributes a certain number of points to Athlete B in the network.

Fig. 6. Example of the SportsNetRank network.


The results of the matches that have already taken place are used to establish the network, and the athlete's score is then calculated with the following formula, whose derivation is taken from [34]:

(12)


where π is the score vector, A is the adjacency matrix of the network, D is a diagonal matrix, α is the random probability, and β is the initial vector.
The generated athlete score vector is then used to predict the result of the game. For the upcoming n-th round of a competition, if the score of player A in the vector π is greater than that of player B, A is predicted to win; otherwise B is predicted to win.
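The ranking and prediction steps can be sketched as below. This sketch rests on several assumptions that are not confirmed by the text: the score vector is taken to be the fixed point of the PageRank-style update $π \leftarrow αAD^{-1}π + β$ (equivalently $π = D(D - αA)^{-1}β$), D holds each athlete's total outgoing edge weight, β is uniform, and α = 0.85. The function names are hypothetical.

```python
def sports_net_rank(results, alpha=0.85, iters=200):
    """results: list of (loser, winner, margin), with margin > 0 as edge weight.
    Returns a dict mapping each player to a score."""
    players = sorted({p for loser, winner, _ in results for p in (loser, winner)})
    # Total weight of the edges leaving each player (points "given away")
    out_strength = {p: 0.0 for p in players}
    for loser, _, margin in results:
        out_strength[loser] += margin
    beta = 1.0 / len(players)  # assumed uniform initial vector
    score = {p: beta for p in players}
    for _ in range(iters):
        new = {p: beta for p in players}
        for loser, winner, margin in results:
            # The loser passes a weighted share of its score to the winner
            new[winner] += alpha * score[loser] * margin / out_strength[loser]
        score = new
    return score

def predict_winner(score, a, b):
    """Predict a to win if its score is strictly greater; otherwise b."""
    return a if score[a] > score[b] else b
```

In this sketch an unbeaten athlete has no outgoing edges and therefore only receives score, which matches the idea that the weaker athlete contributes points to the stronger one.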

Predictive Model of Match Result based on Athlete Rating
Because luck plays a role in Sanda matches, and the decisions of referees often affect the results, the scores may not truly reflect the strength of the athletes. As using the differences in the scores of historical match results to predict the outcome of an upcoming match has certain deficiencies, a better method is needed to evaluate the athletes in order to predict the results of a competition based on an analysis of the strength of the athletes. Considering the evaluation of Sanda players in the previous chapter, this work combines the scores obtained from the divided feature selection and regression model with those obtained from the clustering and regression model, and then utilizes the deep convolutional neural network to extract the detailed information to construct the rank compare network (RCN), which can predict the match result. The structure of the RCN is shown in Fig. 7.

Fig. 7. Structure of the RCN.


The RCN takes the feature indexes of Sanda athletes as the input, and uses both the divided feature selection and regression model and the clustering and regression model to generate two kinds of score. The two scores are then expanded into vectors, and two convolution layers are embedded into the pipeline to extract shallow features. Next, a concat layer concatenates the two score features, and the concatenated feature is sent to the ResNet-50 deep convolutional neural network to mine deep features. Finally, on the assumption that there may be temporal continuity between these features, an LSTM module is added to generate the final athlete score.

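The models are evaluated with the root mean square error (RMSE):

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left(X_{t,i} - X_{p,i}\right)^{2}} \qquad (13)$$

where $X_{t,i}$ is the true value, $X_{p,i}$ is the predicted value, and $N$ is the number of samples.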
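The RMSE reported in the evaluation below can be computed with a few lines of code. This is the standard definition of the metric; the function name `rmse` is chosen for illustration.

```python
import math

def rmse(true_values, predicted_values):
    """Root mean square error between expert scores and model scores."""
    n = len(true_values)
    squared = sum((t - p) ** 2 for t, p in zip(true_values, predicted_values))
    return math.sqrt(squared / n)
```

A lower value means that the model scores lie closer to the expert scores; identical lists give an RMSE of 0.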

Evaluation of the Basic Scoring Model
To verify the effectiveness of the basic scoring network, the training loss and testing performance are illustrated in Fig. 8. On the three data subsets, the training loss gradually decreases, and finally the network reaches the convergence state after about 40 epochs. And the larger the dataset, the greater the overall loss of the network. When testing the performance of the network, the initial RMSE of the initial iterative training increases with the dataset’s expansion; however, as the training iteration progresses, and after the final network reaches convergence, the larger the amount of data in the dataset, then the smaller the corresponding RMSE that can be obtained, and the better the scoring performance of the network. This is really because a larger amount of data can induce the network to learn deeper features, so that it can better predict the scores of Sanda athletes when testing and fitting.

Fig. 8. Training loss and testing performance: (a) training loss and (b) testing RMSE.


As mentioned above, the LM algorithm was applied to the proposed network. To verify its effectiveness, a comparative experiment was conducted, the results of which are presented in Table 1. With the introduction of the LM algorithm, the RMSE dropped by 0.18, 0.19, and 0.14 on the small, medium, and large datasets, respectively, confirming that combining the LM algorithm with the BP algorithm improves performance.

Table 1. Evaluation of the LM algorithm (RMSE on each sub-dataset)
Method   Small   Medium   Large
BP       1.63    1.54     1.37
BP+LM    1.45    1.35     1.23


Evaluation of the Improved Regression Model
This study proposes three different improved regression models, namely, the DR, DFSR, and CR models. To find the most improved model, we conducted corresponding comparative experiments to compare the RMSE of these three models on three sub-datasets. In addition, time consumption needs to be considered. If the performance of the model is good, but the time consumption is large, the desirability of the model is also insufficient. The experimental results are presented in Fig. 9.

Fig. 9. Testing performance and time consumption: (a) testing RMSE and (b) testing time.


In each sub-dataset, the RMSE of the DR model was the largest, followed by that of the DFSR model, while the RMSE of the CR model was the smallest. Thus, performance improved step by step across the three improved models. Compared with the original basic network model, each improved method obtained a corresponding performance improvement, of which the CR model showed the largest. Furthermore, it can be seen from the time consumption graph that the amount of time consumed remained the same when testing the same model on different sub-datasets, whereas it differed when testing different models on the same sub-dataset. For every thousand pieces of data tested, the time consumption of the DR, DFSR, and CR models was 3.1 seconds, 3.3 seconds, and 3.4 seconds, respectively. This increase in time consumption is very small and basically negligible. Therefore, considering both the RMSE and the time consumption of the models, it can be concluded that the optimal network model is CR. This also shows that introducing clustering into neural networks can effectively improve network performance.

Comparison with Other Regression Methods
To highlight the advanced nature of the proposed network, the CR model is compared with other regression algorithms, including the ridge regression (Ridge), random forest regression (RFR) and linear support vector regression (LinearSVR) algorithms. The results of the comparative experiment are presented in Table 2.
As can be seen from Table 2, compared with the other three regression methods (Ridge, RFR, and LinearSVR), the RMSE of the CR model was lower by 0.28, 0.34, and 0.22, respectively, on the small dataset; by 0.28, 0.41, and 0.19, respectively, on the medium dataset; and by 0.36, 0.47, and 0.25, respectively, on the large dataset. From this, it can be concluded that the CR model proposed in this paper achieves the smallest RMSE, and thus the best scoring performance.

Table 2. Comparison with other regression methods (RMSE on each sub-dataset)
Method      Small   Medium   Large
Ridge       1.39    1.28     1.21
RFR         1.45    1.41     1.32
LinearSVR   1.33    1.19     1.10
CR          1.11    1.00     0.85
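The RMSE reductions quoted for Table 2 can be reproduced directly from the table values. The snippet below is a small arithmetic check; the dictionary `table2` copies the values from Table 2, and `rmse_reduction` is a hypothetical helper name.

```python
# RMSE per method on the (small, medium, large) sub-datasets, copied from Table 2
table2 = {
    "Ridge": (1.39, 1.28, 1.21),
    "RFR": (1.45, 1.41, 1.32),
    "LinearSVR": (1.33, 1.19, 1.10),
    "CR": (1.11, 1.00, 0.85),
}

def rmse_reduction(baseline, proposed="CR"):
    """RMSE of the baseline minus RMSE of the proposed model, per sub-dataset."""
    return tuple(round(b - p, 2)
                 for b, p in zip(table2[baseline], table2[proposed]))
```

For example, `rmse_reduction("Ridge")` gives the reductions relative to ridge regression on the small, medium, and large sub-datasets, in that order.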

Evaluation of the Rank Compare Network
This paper proposes an RCN based on the Sanda athlete rating network, which is used to predict the outcome of a Sanda match. This network model takes the predicted scores of the two regression models as the input, embeds a deep learning network to extract deep features, obtains the comprehensive score of each Sanda athlete, and then compares the comprehensive scores of the two athletes to predict the outcome of the game. To verify the effectiveness of the RCN, its accuracy and time consumption were compared with those of the mainstream SportsNetRank network, the results of which are presented in Fig. 10.

Fig. 10. Testing performance and time consumption: (a) testing accuracy and (b) testing time.


On the three sub-datasets, the proposed RCN method obtained accuracies of 91%, 93%, and 94%, respectively. Compared with the SportsNetRank method, the improvement is 6%, 6%, and 4%, respectively. In terms of time consumption, for every thousand pieces of data, the SportsNetRank method takes 4.7 seconds, while the RCN method takes 4.9 seconds. As the extra time for each piece of data (about 0.0002 seconds) is basically zero, it can be confirmed that the proposed RCN method improves accuracy without a meaningful loss of efficiency.
As mentioned above, when the RCN is used to predict the result of a Sanda match, a threshold t is set. To verify the impact of different t values on RCN performance and to search for the best t value, this study tested the accuracy for t from 0 to 0.1 (in steps of 0.01); the results are shown in Fig. 11.
When t takes different values, the performance of the model is also different. On the three different sub-datasets, the trend of the curve is basically the same. That is to say, when t gradually increases, the accuracy of the model gradually rises to the peak, and the t value corresponding to the peak is equal to 0.4; then, as the t value further increases, the accuracy of the model gradually decreases. This is because a suitable t value can effectively improve performance, whereas an excessively large t value will introduce additional errors, thereby degrading the performance of the model.
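A threshold sweep of this kind can be sketched as follows. The decision rule is an assumption made for illustration, since this section does not restate how t enters the RCN: here a match is called for athlete A or B only when the gap between the comprehensive scores exceeds t, and is otherwise called a draw. The match data are synthetic and all names are hypothetical.

```python
def predict_outcome(score_a, score_b, t):
    """Assumed rule: call a winner only if the score gap exceeds t."""
    diff = score_a - score_b
    if diff > t:
        return "A"
    if diff < -t:
        return "B"
    return "draw"

def accuracy(matches, t):
    """Fraction of matches whose predicted outcome equals the recorded one."""
    hits = sum(predict_outcome(a, b, t) == result for a, b, result in matches)
    return hits / len(matches)

def sweep_threshold(matches, thresholds):
    """Evaluate every threshold and return (best (t, accuracy), all results)."""
    results = [(t, accuracy(matches, t)) for t in thresholds]
    return max(results, key=lambda r: r[1]), results

# Synthetic matches: (comprehensive score of A, of B, recorded outcome).
matches = [
    (8.200, 7.100, "A"),
    (6.500, 7.900, "B"),
    (7.400, 7.375, "draw"),
    (7.055, 7.000, "A"),
    (6.900, 6.915, "draw"),
    (8.000, 8.500, "B"),
]

# t from 0 to 0.1 in steps of 0.01, built from integers to avoid float drift.
thresholds = [i / 100 for i in range(11)]
(best_t, best_acc), results = sweep_threshold(matches, thresholds)

for t, acc in results:
    print(f"t = {t:.2f}  accuracy = {acc:.3f}")
print(f"best t = {best_t:.2f}")
```

On this toy data the accuracy first rises and then falls as t grows, which mirrors the shape of the curve described above; the location of the peak here is a property of the synthetic matches only and says nothing about the paper's best t.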

Fig. 11. Testing performance and time consumption: (a) testing accuracy and (b) testing time.



Conclusion

This paper examines post-match scoring with machine learning algorithms in three areas: model construction, optimization, and scoring. First, it provides background information on the field's research history and current status. It then proposes a regression-based post-match scoring model for Sanda athletes that uses the athletes' statistics so that the predicted scores reflect their actual performance. The experimental evidence suggests that this model is more accurate than other widely used scoring models. This study also proposes several enhancements to the regression model; according to the results, the combined use of clustering and regression yielded the highest accuracy. Finally, it proposes using athletes' post-match scores to forecast competition outcomes, an approach that is superior to the use of historical records. Future research will attempt to combine deep learning with the Sanda athlete model. Combining neural networks and regression can aggregate their advantages and help to further improve the performance of the model.


Author’s Contributions

Conceptualization, HL, WG. Investigation and methodology, HL, LH, WG. Resources, HL. Supervision, HL, WG. Writing of the original draft, HL, LH, WG. Writing of review and editing, HL, LH. Validation, HL, WG. Data curation, HL, LH, WG.


Funding

None.


Competing Interests

The authors declare that they have no competing interests.


Author Biography

Author
Name : Lei Huang
Affiliation : China Wushu School, Beijing Sport University
Biography : Lei Huang is studying at the China Wushu School, Beijing Sport University. His research interests include Wushu and Sanda. He is a World Sanda Champion (international master grade) and a Sanda teacher at Beijing Sport University. He was the 90 kg champion of the 5th Wushu Sanda World Cup in 2010, of the 11th World Sanda Championship in 2011, and of the 6th Wushu Sanda World Cup in 2012. From 2006 to 2012, he won the National Wushu Sanda Championship six times.

Author
Name : Xinao Li
Affiliation : Sanda Team, Beijing Shichahai Sports School
Biography : Xinao Li is studying at Beijing Shichahai Sports School. His research interests include Wushu and Sanda.

Author
Name : Guixian Wang
Affiliation : China Wushu School, Beijing Sport University
Biography : Guixian Wang is studying at the China Wushu School, Beijing Sport University. Her research interests include Wushu and Sanda. Her honors include the women's 60 kg Sanda championship at the 7th Asian Wushu Championship; the 60 kg Sanda championship at the 11th National Games in 2009; the Sanda championship at the 10th World Wushu Championship in 2009; the Sanda championship at the 5th World Cup in 2010; and the 60 kg Sanda championship at the first World Martial Arts Expo in 2010. From 2007 to 2010, she won the national Sanda Championship 8 times.


References

[1] P. Li and G. Yu, “Survey on the multi-index comprehensive evaluation method,” Development & Innovation of Machinery & Electrical Products, vol. 22, no. 4, pp. 24-28, 2009.
[2] X. Q. Wang and J. Yin, “Application of machine learning in safety evaluation of athletes training based on physiological index monitoring,” Safety Science, vol. 120, pp. 833-837, 2019.
[3] M. T. Worsey, H. G. Espinosa, J. B. Shepherd, and D. V. Thiel, “One size doesn't fit all: supervised machine learning classification in athlete-monitoring,” IEEE Sensors Letters, vol. 5, no. 3, article no. 7000904, 2021. https://doi.org/10.1109/LSENS.2021.3060376
[4] S. A. Khowaja, B. N. Yahya, and S. L. Lee, “CAPHAR: context-aware personalized human activity recognition using associative learning in smart environments,” Human-centric Computing and Information Sciences, vol. 10, article no. 35, 2020. https://doi.org/10.1186/s13673-020-00240-y
[5] J. Pei, K. Zhong, J. Li, J. Xu, and X. Wang, “ECNN: evaluating a cluster-neural network model for city innovation capability,” Neural Computing and Applications, vol. 34, no. 15, pp. 12331-12343, 2022.
[6] J. D. Bartlett, F. O’Connor, N. Pitchford, L. Torres-Ronda, and S. J. Robertson, “Relationships between internal and external training load in team-sport athletes: evidence for an individualized approach,” International Journal of Sports Physiology and Performance, vol. 12, no. 2, pp. 230-234, 2017.
[7] K. Zhong, Y. Wang, J. Pei, S. Tang, and Z. Han, “Super efficiency SBM-DEA and neural network for performance evaluation,” Information Processing & Management, vol. 58, no. 6, article no. 102728, 2021. https://doi.org/10.1016/j.ipm.2021.102728
[8] C. Hoog Antink, A. K. Braczynski, and B. Ganse, “Learning from machine learning: prediction of age-related athletic performance decline trajectories,” GeroScience, vol. 43, no. 5, pp. 2547-2559, 2021.
[9] J. Wei, “Research on athlete's action recognition in sports video based on supervised learning algorithm,” Information Technology, vol. 2018, no. 8, pp. 111-114, 2018.
[10] Y. Wang, “Real-time collection method of athletes’ abnormal training data based on machine learning,” Mobile Information Systems, vol. 2021, article no. 9938605, 2021. https://doi.org/10.1155/2021/9938605
[11] D. Cao, Z. Chen, and L. Gao, “An improved object detection algorithm based on multi-scaled and deformable convolutional neural networks,” Human-centric Computing and Information Sciences, vol. 10, article no. 14, 2020. https://doi.org/10.1186/s13673-020-00219-9
[12] D. D. Liu, “Performance evaluation model of Wushu Sanda athletes based on visual signal processing,” in Multimedia Technology and Enhanced Learning. Cham, Switzerland: Springer, 2021, pp. 103-116.
[13] M. J. J. Ghrabat, G. Ma, I. Y. Maolood, S. S. Alresheedi, and Z. A. Abduljabbar, “An effective image retrieval based on optimized genetic algorithm utilized a novel SVM-based convolutional neural network classifier,” Human-centric Computing and Information Sciences, vol. 9, article no. 31, 2019. https://doi.org/10.1186/s13673-019-0191-8
[14] A. Souri, M. Y. Ghafour, A. M. Ahmed, F. Safara, A. Yamini, and M. Hoseyninezhad, “A new machine learning-based healthcare monitoring model for student’s condition diagnosis in Internet of Things environment,” Soft Computing, vol. 24, no. 22, pp. 17111-17121, 2020.
[15] G. Xu, Y. P. Yang, S. Y. Lu, L. Li, and X. Song, “Comprehensive evaluation of coal-fired power plants based on grey relational analysis and analytic hierarchy process,” Energy Policy, vol. 39, no. 5, pp. 2343-2351, 2011.
[16] Y. Icaga, “Fuzzy evaluation of water quality classification,” Ecological Indicators, vol. 7, no. 3, pp. 710-718, 2007.
[17] C. Lee, H. Lee, H. Seol, and Y. Park, “Evaluation of new service concepts using rough set theory and group analytic hierarchy process,” Expert Systems with Applications, vol. 39, no. 3, pp. 3404-3412, 2012.
[18] P. Guo and H. Tanaka, “Fuzzy DEA: a perceptual evaluation method,” Fuzzy Sets and Systems, vol. 119, no. 1, pp. 149-160, 2001.
[19] K. Wang, S. Yu, and W. Zhang, “China’s regional energy and environmental efficiency: a DEA window analysis based dynamic evaluation,” Mathematical and Computer Modelling, vol. 58, no. 5-6, pp. 1117-1127, 2013.
[20] A. P. Balcerzak and M. B. Pietrzak, “Structural equation modeling in evaluation of technological potential of European Union countries in the years 2008-2012,” in Proceedings of the 10th Professor Aleksander Zelias International Conference on Modelling and Forecasting of Socio-Economic Phenomena, Zakopane, Poland, 2016.
[21] I. Chenini and S. Khemiri, “Evaluation of ground water quality using multiple linear regression and structural equation modeling,” International Journal of Environmental Science & Technology, vol. 6, no. 3, pp. 509-519, 2009.
[22] W. Sun, H. Y. Shen, and C. G. Yang, “Comprehensive evaluation of power plants' competition ability with SVM method,” in Proceedings of 2006 International Conference on Machine Learning and Cybernetics, Dalian, China, 2006, pp. 3568-3572.
[23] M. Wang, R. Beelen, M. Eeftens, K. Meliefste, G. Hoek, and B. Brunekreef, “Systematic evaluation of land use regression models for NO2,” Environmental Science & Technology, vol. 46, no. 8, pp. 4481-4489, 2012.
[24] S. L. Yan, Y. Wang, and J. C. Liu, “Research on the comprehensive evaluation of business intelligence system based on BP neural network,” Systems Engineering Procedia, vol. 4, pp. 275-281, 2012.
[25] W. Wang, M. Zhao, L. Wang, J. Huang, C. Cai, and X. Xu, “A multi-scene deep learning model for image aesthetic evaluation,” Signal Processing: Image Communication, vol. 47, pp. 511-518, 2016.
[26] A. Zimmermann, S. Moorthy, and Z. Shi, “Predicting college basketball match outcomes using machine learning techniques: some results and lessons learned,” 2013 [Online]. Available: https://arxiv.org/abs/1310.3607.
[27] M. Hamilton, P. Hoang, L. Layne, J. Murray, D. Padget, C. Stafford, and H. Tran, “Applying machine learning techniques to baseball pitch prediction,” in Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods (ICPRAM), Angers, France, 2014, pp. 520-527.
[28] S. Sinha, C. Dyer, K. Gimpel, and N. A. Smith, “Predicting the NFL using Twitter,” 2013 [Online]. Available: https://arxiv.org/abs/1310.6998.
[29] F. Coelho, A. P. Braga, and M. Verleysen, “Multi-objective semi-supervised feature selection and model selection based on Pearson’s correlation coefficient,” in Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. Heidelberg, Germany: Springer, 2010, pp. 509-516.
[30] J. Park and M. E. Newman, “A network-based ranking system for US college football,” Journal of Statistical Mechanics: Theory and Experiment, vol. 2005, article no. P10014, 2005. https://doi.org/10.1088/1742-5468/2005/10/P10014
[31] A. Govan and C. D. Meyer, “Ranking national football league teams using Google's Pagerank,” 2006 [Online]. Available: https://repository.lib.ncsu.edu/bitstream/handle/1840.4/385/crsc-tr06-19.pdf?sequence=1.
[32] S. Chen and T. Joachims, “Predicting matchups and preferences in context,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, 2016, pp. 775-784.
[33] K. Pelechrinis, E. Papalexakis, and C. Faloutsos, “SportsNetRank: network-based sports team ranking,” 2016 [Online]. Available: http://d-scholarship.pitt.edu/28337/.
[34] F. Fouss, K. Francoisse, L. Yen, A. Pirotte, and M. Saerens, “An experimental investigation of kernels on graphs for collaborative recommendation and semisupervised classification,” Neural Networks, vol. 31, pp. 53-72, 2012.

About this article
  • Received: 8 October 2021
  • Accepted: 3 December 2021
  • Published: 15 December 2022

Keywords