A Gesture Recognition Method Based on MIC-Attention-LSTM
- Ming-Jia Hu, Yu-Lin Gong*, Xiao-Juan Chen, and Bo Han
Human-centric Computing and Information Sciences volume 13, Article number: 21 (2023)
https://doi.org/10.22967/HCIS.2023.13.021
Abstract
A gesture recognition method based on the maximal information coefficient attention-based long short-term memory (MIC-Attention-LSTM) algorithm was proposed to increase the accuracy of gesture recognition from high-density surface electromyography (HD-sEMG) and to decrease the redundancy between HD-sEMG channels. First, the correlation coefficient was used to reduce 10 time-domain features, and five features were retained to form the optimal feature set. Next, MIC was employed to measure the correlation between signal channels and, by setting various reduction thresholds, to produce various channel combinations. The best channel combination was chosen based on the classification accuracy of the final models, which were built with LSTM and Attention-LSTM. The classification results showed that, after channel reduction, the LSTM model achieved its highest classification accuracies of 87.27% and 89.91%, which were 1.41% and 1.71% higher than those without channel reduction, demonstrating the efficiency of the channel reduction method. After feature and channel reduction of the sEMG, the classification accuracy of the Attention-LSTM model was 9.47% higher than that of the LSTM model without reduction. This finding showed that the attention mechanism can efficiently highlight the weight of key signal sequences and enhance the classification accuracy of LSTM.
Keywords
Gesture Recognition, Correlation Coefficient, MIC, Attention Mechanism, LSTM
Introduction
Surface electromyography (sEMG) is a bioelectrical signal that can reflect the degree of muscle excitement. It is widely used in applied research on human-machine interfaces [1], including human-machine interaction [2], prosthesis control [3], rehabilitation therapy [4], and other fields, and it is an effective signal control source. The key to realizing a human-machine interface based on sEMG is to automatically decode and recognize motion intention during the execution of different gestures using signal processing and pattern recognition technology [5].
The gesture recognition model based on sEMG can be divided into four modules: signal acquisition, signal preprocessing, feature extraction, and classification recognition. In the signal acquisition stage, sparse [6] or dense [7] sEMG can be obtained using different acquisition equipment. The former usually collects signals with a small number of electrodes and obtains sparse sEMG data, while the latter uses an array of electrodes to capture finer sEMG. Signal preprocessing is meant to improve the signal-to-noise ratio. Feature extraction is indispensable in gesture recognition, and commonly used features include time-domain [8], frequency-domain [9], time-frequency-domain [10], and nonlinear features [5]. However, as the feature dimension and the number of signal acquisition channels increase, information redundancy arises, increasing the computational burden [11]. Commonly used dimension reduction algorithms include minimum redundancy maximum relevance (mRMR) [12], principal component analysis (PCA) [13], and the Fisher score [14], among others.
A gesture recognition model can be built with machine learning or deep learning methods covering feature extraction and classification [15]. To recognize gestures, machine learning approaches extract feature parameters and feed them into classifiers such as support vector machines (SVMs) [16], random forests [17], or linear discriminant classifiers [18]. Machine learning models have also been improved with the help of the genetic algorithm (GA) and other optimization algorithms [19]. When the number of gestures is small, machine learning methods can achieve good recognition results, but the recognition rate inevitably decreases as the number of gestures increases. To improve the recognition rate of multi-gesture recognition from sEMG, it is essential to develop a model that can automatically extract and classify sEMG features.
With the advancement of science, massive data have emerged, and deep learning can process massive data effectively [20]. Through a multi-layer network structure, it automatically learns high-level semantic features from data step by step, realizing an end-to-end identification framework, and its powerful data-fitting ability surpasses machine learning. Zhong et al. [21] designed a multi-feature fusion network to improve feature extraction capability and introduced a deep belief network as the classification model to classify 12 types of rehabilitation actions, achieving a highest recognition accuracy of 86.1%. Convolutional neural networks (CNNs) are widely used in image processing [22]. Liu and Wang [23] converted 32 features of sEMG in eight channels into grayscale feature maps, which were then used to train a convolutional neural network, achieving a classification accuracy of 98.1% for five gestures. However, converting sEMG into two-dimensional grayscale images loses signal timing information [24]. Therefore, Zhang et al. [25] utilized recurrent neural networks (RNN) to predict 21 gestures of 13 test subjects, with an accuracy of 89.6%. Sohane and Agarwal [26] utilized particle swarm optimization (PSO) to optimize a long short-term memory (LSTM) network and classify the three movements of flexion, extension, and slope walking, with an accuracy of 98.58%. However, the aforementioned deep learning methods failed to highlight the contribution of key data to classification accuracy and to account for the sequential dependencies between earlier and later sEMG samples.
In this paper, based on the current state of the art, we first extracted 10 types of time-domain features from sEMG and, taking into account the varying degrees of redundancy and the mutual relationships among them, reduced the 10 time-domain features. We retained five time-domain features, mean absolute value (MAV), zero crossing (ZC), slope sign change (SSC), average amplitude change (AAC), and difference absolute standard deviation value (DASDV), and used them to extract the time-domain feature set. Since the sEMG data used in this study consisted of 128 channels of HD-sEMG data, adjacent channels contained redundant information. The maximal information coefficient (MIC) was used to calculate the correlation between different channels, and by setting different thresholds, various signal channel sets could be obtained. LSTM and Attention-LSTM were used to build classification models for the gestures. A comparison of the classification accuracy of different models revealed that the LSTM gesture recognition model had accuracies of 87.27% and 89.91% when the channel reduction threshold was set to 0.69, which were 1.41% and 1.71% higher, respectively, than without channel reduction, demonstrating the effectiveness of the MIC channel reduction scheme. At the same time, after feature and channel reduction, Attention-LSTM achieved 95.33% classification accuracy, 9.47% higher than the LSTM classification accuracy without feature and channel reduction, proving that the attention mechanism can effectively highlight key data and improve LSTM classification accuracy.
Data and Methods
Database Introduction
The data used in this paper were from the CapgMyo database, an HD-sEMG database published by Professor Geng Weidong's research group at Zhejiang University [27]. In this database, a differential electrode array [28] with 128 channels (wet silver electrodes arranged in a 16×8 grid) was used to collect sEMG from 23 healthy subjects at a sampling frequency of 1,000 Hz. The CapgMyo database contains three subsets: DB-a, DB-b, and DB-c. DB-a includes eight finger gestures of 18 subjects. DB-b has eight gestures collected from 10 subjects in two different time periods, with the same gesture set as DB-a. The DB-c dataset contains 12 basic finger gestures of 12 subjects. CapgMyo contains 20 gestures, which form a subset of the NinaPro database [29] and include the most common finger gestures in daily life. The DB-a and DB-c datasets were selected as experimental data sources in this paper, and the gesture diagram is shown in Fig. 1.

Fig. 1. Schematic diagram of gestures: (a) 8 gestures and (b) 12 gestures.
Fig. 1 shows 20 gestures in the CapgMyo database, in which Fig. 1(a) corresponds to #13–20 gestures in the NinaPro database, and Fig. 1(b) corresponds to #1–12 gestures in the NinaPro database. In this paper, 20 gestures were selected to complete the comparison with the NinaPro database to verify the advantages of HD-sEMG in gesture recognition tasks.
Signal Filtering
sEMG is highly sensitive to noise [30], and noise such as motion artifacts and cable artifacts is inevitably introduced during signal acquisition [31]. In this paper, a 20–250 Hz Butterworth band-pass filter was used to filter the sEMG. As 50 Hz power-frequency interference was found in the sEMG, a 50-Hz band-stop filter was also applied [32].
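As a sketch, the filtering stage above can be implemented with SciPy. The band-pass order (4) and the notch quality factor (Q = 30) are assumptions, since the paper does not state them:

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

FS = 1000  # CapgMyo sampling frequency (Hz)

def preprocess(emg, fs=FS):
    """20-250 Hz Butterworth band-pass plus a 50 Hz band-stop, zero-phase."""
    b, a = butter(4, [20, 250], btype="bandpass", fs=fs)
    emg = filtfilt(b, a, emg)          # forward-backward: no phase distortion
    b, a = iirnotch(50, Q=30, fs=fs)   # narrow 50 Hz power-frequency notch
    return filtfilt(b, a, emg)
```

`filtfilt` runs the filter forward and backward, so the band-pass and notch stages do not shift the timing of the muscle-activation onsets used later for active-segment detection.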
Active Section Detection
The purpose of active segment detection is to extract the sEMG recorded during action execution from the continuous data, discard the non-action segments, reduce the amount of signal data, and improve the classification rate of the model. Existing extraction methods for the sEMG active segment include the moving average method [33], the standard deviation method [34], and the filter method [35].
Because the calculation steps of these methods are cumbersome and require precise selection of an action threshold, this paper used the root mean square (RMS) curve combined with a threshold to extract action-segment data. RMS is a time-domain feature of sEMG, as shown in Equation (1):

$$\mathrm{RMS}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}x_i^{2}} \qquad (1)$$

where $x_i$ is the $i$-th sample and $N$ is the data segment length.
sEMG is a non-stationary signal with strong time variability; this can be addressed by windowed segmentation [36]. This paper used a sliding window to complete the extraction of active-segment data. Through the sliding window, a complete sEMG recording can be divided into several signal segments of the same size. Each segment contains only a small amount of signal data, so these segments can be regarded as stationary signals. The parameters of the sliding window are the window length and the sliding step [37]. Hudgins et al. [38] pointed out that, to ensure the real-time performance of the system, gesture classification needs to be completed within 300 ms. Meanwhile, to increase the stability of the signal data output, this paper used overlapping sliding windows [39], setting the window length to 128 and the sliding step to 64. The specific sliding window segmentation scheme is shown in Fig. 2.
Fig. 2. Data interception by sliding window.
For a data segment of length L, with window length W and sliding step S, the number of windows N can be calculated with Equation (2):

$$N=\mathrm{floor}\!\left(\frac{L-W}{S}\right)+1 \qquad (2)$$

where floor(·) denotes rounding down.
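A minimal sketch of the windowing scheme (window length 128, sliding step 64) together with the window count of Equation (2) and the RMS of Equation (1):

```python
import numpy as np

def sliding_windows(x, win=128, step=64):
    """Split a 1-D signal into overlapping windows; the number of
    windows follows Equation (2): floor((L - win) / step) + 1."""
    n = (len(x) - win) // step + 1
    return np.stack([x[i * step : i * step + win] for i in range(n)])

def rms(windows):
    """Per-window RMS, Equation (1)."""
    return np.sqrt(np.mean(windows ** 2, axis=-1))
```

With a 1,000 Hz sampling rate, a 128-sample window with a 64-sample step emits a feature vector every 64 ms, comfortably inside the 300 ms real-time budget cited above.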
Time Domain Features
There is a one-to-one correspondence between gestures and sEMG, and sEMG has the following characteristics: the sEMG generated by the same movement in the same muscle is similar, and the sEMG generated by different movements in the same muscle is different. In order to further expand the difference between different signals, feature extraction of sEMG is required. The studies of Atzori et al. [29] and Pizzolato et al. [40] showed that time-domain feature sets and time-frequency-domain feature sets achieve roughly equal classification accuracy with SVM, random forests, and other basic machine learning classifiers, while time-domain features are simpler to calculate; consequently, this paper selected 10 types of time-domain features to build the feature sets. The specific calculation formulas are shown in Table 1.
Table 1. Time-domain features

Feature name | Calculation formula
Simple square integral (SSI) | $\mathrm{SSI}=\sum_{i=1}^{N}x_i^{2}$
Integrated EMG (IEMG) | $\mathrm{IEMG}=\sum_{i=1}^{N}|x_i|$
Mean absolute value (MAV) | $\mathrm{MAV}=\frac{1}{N}\sum_{i=1}^{N}|x_i|$
Variance (VAR) | $\mathrm{VAR}=\frac{1}{N-1}\sum_{i=1}^{N}x_i^{2}$
Root mean square (RMS) | $\mathrm{RMS}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}x_i^{2}}$
Waveform length (WL) | $\mathrm{WL}=\sum_{i=1}^{N-1}|x_{i+1}-x_i|$
Average amplitude change (AAC) | $\mathrm{AAC}=\frac{1}{N}\sum_{i=1}^{N-1}|x_{i+1}-x_i|$
Difference absolute standard deviation value (DASDV) | $\mathrm{DASDV}=\sqrt{\frac{1}{N-1}\sum_{i=1}^{N-1}(x_{i+1}-x_i)^{2}}$
Zero crossing (ZC) | number of sign changes of the signal: count of $i$ with $x_i x_{i+1}<0$
Slope sign change (SSC) | number of slope sign changes: count of $i$ with $(x_i-x_{i-1})(x_i-x_{i+1})>0$
In this paper, the feature extraction of active-segment signals in each channel was completed by combining the sliding window with the 10 time-domain features. Considering the varying degrees of redundancy among the features, the 10 features are then reduced, and only feature parameters with low mutual similarity are retained to avoid using redundant features [41].
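The five features ultimately retained can be sketched directly from the standard definitions in Table 1 (threshold-free ZC/SSC variants assumed, as the paper does not give a deadband threshold):

```python
import numpy as np

def mav(x):
    return np.mean(np.abs(x))                  # mean absolute value

def aac(x):
    return np.mean(np.abs(np.diff(x)))         # average amplitude change

def dasdv(x):
    return np.sqrt(np.mean(np.diff(x) ** 2))   # diff. abs. std. dev. value

def zc(x):
    return int(np.sum(x[:-1] * x[1:] < 0))     # zero crossings

def ssc(x):
    d = np.diff(x)
    return int(np.sum(d[:-1] * d[1:] < 0))     # slope sign changes
```

Applying these five functions to every 128-sample window of every channel yields the time-domain feature set fed to the classifier.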
Feature Reduction
The correlation coefficient describes the correlation between variables [42] and can clearly depict fuzzy processes [43]. The sample correlation coefficient is represented by r, and the population correlation coefficient is represented by ρ.
Equation (3) gives the population form of the correlation coefficient:

$$\rho=\frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} \qquad (3)$$

where Cov(X,Y) is the covariance of X and Y, Var(X) is the variance of X, and Var(Y) is the variance of Y.
Calculating the covariance and variance between samples yields the sample correlation coefficient, as shown in Table 2. The correlation degree between samples can be divided into five grades [44].
The above feature reduction process mainly compares the correlation between various features by calculating the correlation coefficients of each feature parameter and then comparing the correlation intensity of the sample correlation coefficients in Table 2 to complete the reduction of the feature parameters in the time domain.
Table 2. Correlation degree of Pearson correlation coefficient

Scope of |r| | Related degree
0.0≤|r|<0.2 | Uncorrelated
0.2≤|r|<0.4 | Weak correlation
0.4≤|r|<0.6 | Moderate correlation
0.6≤|r|<0.8 | Strong correlation
0.8≤|r|≤1.0 | High correlation
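A hypothetical greedy sketch of the reduction step: compute pairwise |r| over feature columns and keep a feature only if it is at most weakly correlated (|r| < TH1 = 0.4, per Table 2) with every feature already kept. The exact selection rule is not spelled out in the paper, so this is one plausible reading:

```python
import numpy as np

def reduce_features(F, names, th=0.4):
    """F: (samples, features) matrix. Keep a feature only if its |r|
    with every previously kept feature stays below the threshold."""
    R = np.abs(np.corrcoef(F, rowvar=False))   # pairwise |r| matrix
    kept = []
    for j in range(F.shape[1]):
        if all(R[j, k] < th for k in kept):
            kept.append(j)
    return [names[j] for j in kept]
```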
Channel Reduction
The dataset selected in this paper was collected through a high-density array of 128 channels. There is often a high degree of information redundancy between signal channels, so a subset of channels needs to be selected to reduce unnecessary signal data. Channel reduction is essentially a combinatorial selection problem [45]. The maximal information coefficient is used to calculate the correlation between signal channels, and channels with high correlation are deleted to complete the reduction.
The MIC algorithm was proposed by Reshef et al. [46] in 2011. Its principle is to grid the scatter plot of the joint samples of two variables X and Y at a particular scale and then calculate the mutual information of the two variables from the marginal and joint probability density functions estimated on the grid, i.e., numerical values are used to obtain approximate solutions under given boundary conditions [47]. MIC overcomes the difficulty of computing mutual information directly for continuous variables [48], and the mutual information is calculated as shown in Equation (4):
$$I(X;Y)=\sum_{x\in X}\sum_{y\in Y}p(x,y)\log_2\frac{p(x,y)}{p(x)\,p(y)} \qquad (4)$$

where p(x) and p(y) are the marginal probability density functions and p(x,y) is the joint probability density function.
With the number of grid cells fixed, the maximum mutual information value over all grid partitioning methods is found and normalized to $M(D)_{x,y}$, as shown in Equation (5):

$$M(D)_{x,y}=\frac{\max I(D|_G)}{\log_2\min\{x,y\}} \qquad (5)$$

where the maximum is taken over all grids G with x columns and y rows.
Given a dataset D with sample size n, the MIC of two variables X and Y in the set is defined under the premise of limiting the grid size x⋅y, as shown in Equation (6):

$$\mathrm{MIC}(D)=\max_{x\cdot y<B(n)}M(D)_{x,y},\qquad B(n)=n^{a} \qquad (6)$$
where a is set to 0.6 as a rule of thumb to ensure the universality of the algorithm.
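Equations (4)–(6) can be sketched with an equispaced-grid approximation. Note the true MIC algorithm searches over all grid partitions, not only equispaced histogram bins, so this is a simplification:

```python
import numpy as np

def mic_approx(x, y, alpha=0.6):
    """Approximate MIC per Equations (4)-(6): for each resolution with
    x_bins * y_bins within B(n) = n**alpha, compute the mutual
    information on an equispaced grid, normalize by log2 of the
    smaller bin count, and keep the maximum."""
    n = len(x)
    budget = int(n ** alpha)                      # B(n) = n^0.6
    best = 0.0
    for xb in range(2, budget + 1):
        for yb in range(2, budget // xb + 1):
            pxy, _, _ = np.histogram2d(x, y, bins=(xb, yb))
            pxy /= n
            px = pxy.sum(axis=1, keepdims=True)   # marginal p(x)
            py = pxy.sum(axis=0, keepdims=True)   # marginal p(y)
            nz = pxy > 0
            mi = np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz]))
            best = max(best, mi / np.log2(min(xb, yb)))
    return best
```

Channel reduction then drops one channel of every pair whose MIC value exceeds the chosen reduction threshold.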
Attention-LSTM Model
LSTM is a variant of RNN that can make full use of historical information and has stronger adaptability in temporal data analysis [49].
Let $x_t$ and $h_t$ represent the input vector and the output state value at time t, respectively. The calculation formulas of LSTM are as follows:

$$\begin{aligned}
f_t&=\sigma(W_f\cdot[h_{t-1},x_t]+b_f)\\
i_t&=\sigma(W_i\cdot[h_{t-1},x_t]+b_i)\\
g_t&=\tanh(W_g\cdot[h_{t-1},x_t]+b_g)\\
C_t&=f_t\odot C_{t-1}+i_t\odot g_t\\
o_t&=\sigma(W_o\cdot[h_{t-1},x_t]+b_o)\\
h_t&=o_t\odot\tanh(C_t)
\end{aligned} \qquad (7)$$

where f, i, g, C, and o represent the forget gate, input gate, candidate cell state, updated cell state, and output gate, respectively; W and b represent the weight coefficient matrices and bias terms; and σ and tanh represent the sigmoid and hyperbolic tangent activation functions.
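A single step of Equation (7), sketched in NumPy with one stacked weight matrix (a common implementation choice; the paper's formulation keeps per-gate matrices, which is mathematically equivalent):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """W maps the concatenated [h_{t-1}, x_t] to the four stacked
    gate pre-activations (f, i, g, o); H is the hidden size."""
    H = len(h_prev)
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[0:H])              # forget gate
    i = sigmoid(z[H:2 * H])          # input gate
    g = np.tanh(z[2 * H:3 * H])      # candidate cell state
    o = sigmoid(z[3 * H:4 * H])      # output gate
    c_t = f * c_prev + i * g         # updated cell state C_t
    h_t = o * np.tanh(c_t)           # output state h_t
    return h_t, c_t
```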
The attention mechanism [50] simulates human attention: it highlights the contribution of key inputs to the output and can better optimize the LSTM model.

Let $x_1,x_2,\cdots,x_k$ represent the input sequence, $h_1,h_2,\cdots,h_k$ the hidden-layer states corresponding to the input sequence, $\alpha_{ki}$ the attention weight of the hidden state of a historical input with respect to the current input, and $h_{k'}$ the hidden state of the last output node. The attention weights are computed as follows:

$$\alpha_{ki}=\frac{\exp(e_{ki})}{\sum_{j=1}^{k}\exp(e_{kj})} \qquad (8)$$

where $e_{ki}$ is the alignment score between the current and historical hidden states.
The final feature vector $h_{k'}$ is obtained by weighting the hidden states with the attention weights:

$$h_{k'}=\sum_{i=1}^{k}\alpha_{ki}h_i \qquad (9)$$
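Equations (8) and (9) amount to softmax pooling over the hidden states. In this sketch the alignment score is a plain dot product with the last hidden state, an assumption, since the paper does not specify the scoring function:

```python
import numpy as np

def attention_pool(H_states, h_last):
    """H_states: (k, hidden) LSTM outputs; h_last: last hidden state.
    Returns the attention-weighted feature vector of Equation (9)
    together with the weights of Equation (8)."""
    e = H_states @ h_last            # e_{ki}: dot-product alignment scores
    alpha = np.exp(e - e.max())      # subtract max for numerical stability
    alpha /= alpha.sum()             # Equation (8): softmax weights
    return alpha @ H_states, alpha   # Equation (9): weighted sum
```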
The Attention-LSTM model is shown in Fig. 3, including the input vector, two LSTM hidden layers, the Attention layer, the full connection layer, and the output layer. Through the function of the Attention layer, the weights of different input vectors were calculated, and the new vectors were combined to obtain the predicted values.
In this paper, the feature set after feature and channel reduction was taken as the input parameters, and each gesture category was taken as the output parameters. The number of neural units in the LSTM of the first layer was set to 128, and the number of neural units in the LSTM of the second layer was set to 64, so as to compress the input vector parameters of the fully connected layer.
Fig. 3. Attention-LSTM model.
Experiment Platform Parameter Setting
In this experiment, the MATLAB 2016a platform was used for sEMG preprocessing, and the Attention-LSTM gesture recognition model was built in Python with the TensorFlow deep learning framework in a GPU-accelerated environment. The computer was configured with a Windows 10 system, an Intel Core i7-9700K CPU, an RTX 2080 SUPER GPU, and 16 GB of memory.
To verify the effectiveness of the Attention-LSTM model, 80% of the CapgMyo database was selected as the training set to train the network model and 20% as the verification set to verify the performance of the model. The initialization parameters of the LSTM network model are shown in Table 3.
Table 3. Network model parameter settings

Model parameter | Indicator
Maximum iterations | 30,000
Learning rate | 0.003
Bias | 0
Number of hidden layer units | 24
Optimizer | Adam
LSTM steps | 4
Experimental Results and Discussion
As shown in Fig. 4(a), the interval range of signals in the action segment is drawn through the RMS curve. Different RMS curve thresholds TH are set for each experimenter; the RMS value lower than TH is set to zero, while the RMS value higher than TH is retained to complete the accurate division of the signal interval in the action segment. The extraction of signal data in the action segment is completed by searching for the starting and ending points of the signal data in the action segment, as shown in Fig. 4(b).
In this paper, 10 time-domain features were extracted from each signal channel data using the sliding window method. Fig. 5 shows a schematic diagram of 10 time-domain feature parameters. The X-axis represents the extracted 10 time-domain feature categories, the Y-axis represents part of the intercepted time-domain feature parameters, and the Z-axis represents the amplitude of feature parameters.

Fig. 4. (a) RMS curve and (b) RMS curve extraction of signal data in the action section.
Fig. 5. Schematic diagram of ten feature parameters in the time domain.
It can be seen from Fig. 6 that the similarity between MAV, WL, RMS, and IEMG is above 0.9, and the similarity between VAR, AAC, SSI, and DASDV is also above 0.9. According to Table 2, the feature reduction threshold was set to TH1 = 0.4 to complete the reduction of feature parameters. Finally, the five time-domain features MAV, ZC, SSC, AAC, and DASDV were retained and input into the gesture recognition model as the optimal feature set.
In the experimental process, under the condition of extracting five and 10 kinds of time-domain features, each signal channel was reduced through MIC. The MIC of each channel was calculated, and the appropriate threshold was set for the channel to complete the reduction of channels. Figs. 7 and 8 show the MIC values of each signal channel of 10 randomly selected testers when extracting five and 10 time-domain features, including five testers in the DB-a dataset and five testers in the DB-c dataset.
Figs. 7 and 8 clearly reflect the varying degrees of correlation among signal channels, with correlation coefficients mostly concentrated around 0.6. However, because the MIC values differ greatly between testers, the range of the channel reduction threshold TH2 was set to [0.5, 0.9] in this paper, and an appropriate reduction threshold TH3 was then selected. Table 4 shows the channel reduction for different values of TH2 when five and 10 features are extracted.
Fig. 6. Correlation values of 10 time-domain features.
Combining the results from Table 4, Fig. 7, and Fig. 8, this paper selects an appropriate reduction threshold TH3 in [0.6, 0.7]. Table 5 shows the number of signal channels reduced when five and 10 feature parameters are extracted and different reduction thresholds TH3 are selected.
In this paper, the gesture recognition models were built with LSTM and Attention-LSTM, and the channel reduction threshold TH3 was selected according to the classification accuracy of the model. Using the 10 time-domain features described in Section 2.3 and the five time-domain features retained in Section 2.4, we extracted five or 10 time-domain features from each channel of the sEMG data of the 23 testers. Of the obtained time-domain feature parameters, 80% were used as the training set and the remaining 20% as the test set to verify the accuracy of the LSTM and Attention-LSTM gesture recognition models. Fig. 9 shows the LSTM and Attention-LSTM classification accuracy with and without feature and channel reduction.
Table 4. Number of reduced channels corresponding to different TH2

TH2 | Channels reduced (5 features) | Channels reduced (10 features)
0.5 | 105 | 110
0.6 | 71 | 75
0.7 | 17 | 23
0.8 | 1 | 1
0.9 | 0 | 0
Fig. 7. MIC values of each tester's signal channel under five features.
Fig. 8. MIC values of each tester's signal channel under 10 features.
Table 5. Number of reduced channels corresponding to different TH3

TH3 | Channels reduced (5 features) | Channels reduced (10 features)
0.61 | 69 | 71
0.62 | 59 | 65
0.63 | 52 | 59
0.64 | 45 | 54
0.65 | 42 | 48
0.66 | 35 | 44
0.67 | 28 | 39
0.68 | 29 | 31
0.69 | 21 | 26
0.7 | 17 | 23
As shown in Fig. 9, when 10 types of time-domain features were extracted from the sEMG, the classification accuracy obtained by the LSTM and Attention-LSTM models was higher than when five types were extracted, with a maximum difference of 3.26%. However, feature reduction halved the number of feature parameters: with classification accuracy essentially preserved, fewer feature parameters had to be computed, computing resources were used more sparingly, and the model ran more quickly. The results demonstrate that classification accuracy can, in some cases, increase with the number of extracted feature parameters; however, the greater the dimension of the feature parameters, the more demanding the classification model and the experimental platform become. Therefore, on the premise of guaranteeing accuracy, the signal data were finally processed to yield five time-domain features.

Fig. 9. Classification accuracy of LSTM and Attention-LSTM (Label 1 corresponds to LSTM classification accuracy when channel reduction is completed; Labels 2–12 correspond to the LSTM classification accuracy when the channel reduction threshold is [0.6–0.7]; Label 13 corresponds to the classification accuracy of Attention-LSTM when the channel reduction threshold is 0.69).
Comparing Labels 2–12, it can be seen that the LSTM classification model achieves 87.27% and 89.91% classification accuracy, respectively. These results are 1.41% and 1.71% higher than those without channel reduction, proving the effectiveness of the MIC channel reduction scheme. Evidently, the reduced signal channels carried substantial redundant information superimposed on the useful information, which negatively affected the model's classification accuracy. The drop in classification accuracy at Label 12 can be attributed to the electrode architecture: the signal acquisition device used in this database comprises a 16×8 differential electrode array with small spacing between electrodes, so some channels carry a higher degree of redundant information while others carry a higher degree of useful information, and some useful channels are unintentionally removed during channel reduction. However, within a suitable threshold range, the classification accuracy of the model remains essentially unchanged after channel reduction, demonstrating the efficiency of the channel reduction scheme.
Comparing the classification accuracy of Label 13 and Label 1, the classification accuracy of Attention-LSTM after feature and channel reduction is 9.47% higher than that of LSTM without feature and channel reduction. Even while using less experimental data, Attention-LSTM still achieves higher classification accuracy than the LSTM models of Labels 1–12. The classification outcomes show that, although the LSTM model can produce good classification results for gesture recognition, adding an attention mechanism highlights the significance of key information and increases the precision of gesture recognition.
There are numerous ways to build a gesture recognition model based on HD-sEMG at present, but the most popular are deep learning techniques such as RNN [51] and CNN [52], along with machine learning techniques such as SVM and K-nearest neighbor (KNN) [53]. Fig. 10 displays the gesture classification accuracy of each classifier on the CapgMyo database and its subsets.
Fig. 10. Comparison of gesture recognition accuracy.
Compared to the classification accuracy reported in [44, 45], the classification accuracy of CNN-RNN-Attention and 3D CNN-MV is 4.37% and 3.27% higher than that of MIC-Attention-LSTM, respectively. However, CNN-RNN-Attention and 3D CNN-MV classify only the eight gestures of the CapgMyo-DB-a dataset, whereas this paper classifies 20 gestures in the CapgMyo database with a reduced set of sEMG acquisition channels and still classifies gestures accurately. A comparison between the LSTM used in this paper and the machine learning methods in [46] shows that the classification accuracy obtained by LSTM on 20 gestures is 14.1% and 16.8% higher than that of SVM and KNN, respectively, even though the latter classify only 12 gestures in CapgMyo-DB-b. This demonstrates that LSTM can effectively capture the temporal features hidden in sEMG.
Conclusion
This paper proposed a gesture recognition method based on MIC-Attention-LSTM to improve the accuracy of gesture recognition from HD-sEMG and reduce the redundancy between HD-sEMG channels. The experimental analysis was completed on the CapgMyo database, and the linear correlation between the 10 time-domain feature parameters was calculated using the correlation coefficient. The threshold TH1 was set to complete the reduction of the 10 time-domain feature parameters, and MAV, ZC, SSC, AAC, and DASDV were chosen as the optimal time-domain features for the feature set. MIC was able to effectively calculate the similarity of nonlinear data, and the correlation between the 128 signal acquisition channels was computed. Different reduction thresholds TH2 were set to obtain distinct signal channel combinations. LSTM and Attention-LSTM were used to build classification models for the gestures, and the final channel reduction combination was determined. Meanwhile, the experimental results demonstrated that the attention mechanism can effectively emphasize key information and enhance the classification accuracy of LSTM. In subsequent work, motion recognition for patients with upper limb disabilities will be added, the online recognition rate of the system will be improved further while ensuring real-time performance, and the gesture recognition model will be applied to real-world production and life.
Author’s Contributions
Conceptualization, MJH. Investigation and methodology, YLG. Formal analysis, BH. Writing of the original draft, MJH. Writing of the review and editing, YLG, XJC.
Funding
This paper was supported by the Science and Technology Development Project of Changchun (Grant No. 21ZGM25).
Competing Interests
The authors declare that they have no competing interests.
Author Biography

Name: Ming-Jia Hu
Affiliation: School of Electronic Information Engineering, Changchun University of Science and Technology, Changchun, China
Biography Ming-Jia Hu is a doctoral candidate at Changchun University of Science and Technology, Changchun, China. His research interests cover bioelectrical signal processing and pattern recognition.

Name: Yu-Lin Gong
Affiliation: School of Electronic Information Engineering, Changchun University of Science and Technology, Changchun, China
Biography Yu-Lin Gong received his Ph.D. in mechatronic engineering from Changchun University of Science and Technology, Changchun, China. His research interests cover biomedical signal processing and intelligent control.

Name: Xiao-Juan Chen
Affiliation: School of Electronic Information Engineering, Changchun University of Science and Technology, Changchun, China
Biography Xiao-Juan Chen received her Ph.D. in information and communication systems from Jilin University, Changchun, China. Her research interests cover weak-signal intelligent detection, optical fiber intelligent detection and sensing, and biomedical signal processing.

Name: Bo Han
Affiliation: School of Electronic Information Engineering, Changchun University of Science and Technology, Changchun, China
Biography Bo Han is a student at Changchun University of Science and Technology. His research interests cover sEMG processing and pattern recognition.
References
[1]
S. Chen and Z. Luo, “Research on gesture EMG recognition based on long short-term memory and convolutional neural network,” Chinese Journal of Scientific Instrument, vol. 42, no. 2, pp. 162-170, 2021.
[2]
J. Qi, G. Jiang, G. Li, Y. Sun, and B. Tao, “Intelligent human-computer interaction based on surface EMG gesture recognition,” IEEE Access, vol. 7, pp. 61378-61387, 2019.
[3]
N. Parajuli, N. Sreenivasan, P. Bifulco, M. Cesarelli, S. Savino, V. Niola, et al., “Real-time EMG based pattern recognition control for hand prostheses: a review on existing methods, challenges and future implementation,” Sensors, vol. 19, no. 20, article no. 4596, 2019.
https://doi.org/10.3390/s19204596
[4]
C. Fang, B. He, Y. Wang, J. Cao, and S. Gao, “EMG-centered multisensory based technologies for pattern recognition in rehabilitation: state of the art and challenges,” Biosensors, vol. 10, no. 8, article no. 85, 2020.
https://doi.org/10.3390/bios10080085
[5]
S. Zhao, X. Wu, X. Zhang, B. Li, J. Mao, and J. Xu, “Automatic gesture recognition with surface electromyography signal,” Journal of Xi'an Jiaotong University, vol. 54, no. 9, pp. 149-156, 2020.
[6]
S. H. Roy, M. S. Cheng, S. S. Chang, J. Moore, G. De Luca, S. H. Nawab, and C. J. De Luca, “A combined sEMG and accelerometer system for monitoring functional activity in stroke,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 17, no. 6, pp. 585-594, 2009.
[7]
N. Malesevic, A. Olsson, P. Sager, E. Andersson, C. Cipriani, M. Controzzi, A. Bjorkman, and C. Antfolk, “A database of high-density surface electromyogram signals comprising 65 isometric hand gestures,” Scientific Data, vol. 8, no. 1, article no. 63, 2021.
https://doi.org/10.1038/s41597-021-00843-9
[8]
D. Tkach, H. Huang, and T. A. Kuiken, “Study of stability of time-domain features for electromyographic pattern recognition,” Journal of Neuroengineering and Rehabilitation, vol. 7, article no. 21, 2010.
https://doi.org/10.1186/1743-0003-7-21
[9]
A. Abdelouahad, A. Belkhou, A. Jbari, and L. Bellarbi, “Time and frequency parameters of sEMG signal-force relationship,” in Proceedings of 2018 4th International Conference on Optimization and Applications (ICOA), Mohammedia, Morocco, 2018, pp. 1-5.
[10]
K. Englehart, B. Hudgin, and P. A. Parker, “A wavelet-based continuous classification scheme for multifunction myoelectric control,” IEEE Transactions on Biomedical Engineering, vol. 48, no. 3, pp. 302-311, 2001.
[11]
P. Phukpattaranont, S. Thongpanja, K. Anam, A. Al-Jumaily, and C. Limsakul, “Evaluation of feature extraction techniques and classifiers for finger movement recognition using surface electromyography signal,” Medical & Biological Engineering & Computing, vol. 56, pp. 2259-2271, 2018.
[12]
H. Peng, F. Long, and C. Ding, “Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1226-1238, 2005.
[13]
K. Kiatpanichagij and N. Afzulpurkar, “Use of supervised discretization with PCA in wavelet packet transformation-based surface electromyogram classification,” Biomedical Signal Processing and Control, vol. 4, no. 2, pp. 127-138, 2009.
[14]
Q. Gu, Z. Li, and J. Han, “Generalized fisher score for feature selection,” in Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence, Arlington, VA, 2011, pp. 266-273.
[15]
R. Shailendra, A. Jayapalan, S. Velayutham, A. Baladhandapani, A. Srivastava, S. Kumar Gupta, and M. Kumar, “An IoT and machine learning based intelligent system for the classification of therapeutic plants,” Neural Processing Letters, vol. 54, no. 5, pp. 4465-4493, 2022.
[16]
S. Adil, T. Anwar, and A. Al Jumaily, “Extreme learning machine based sEMG for drop-foot after stroke detection,” in Proceedings of 2016 6th International Conference on Information Science and Technology (ICIST), Dalian, China, 2016, pp. 18-22.
[17]
T. Khan, L. E. Lundgren, E. Jarpe, M. C. Olsson, and P. Viberg, “A novel method for classification of running fatigue using change-point segmentation,” Sensors, vol. 19, no. 21, article no. 4729, 2019.
https://doi.org/10.3390/s19214729
[18]
X. M. Pei, J. Q. Song, J. T. Cao, and H. H. Liu, “Surface EMG signal hand motion recognition based on MEMD and TK energy operators,” Journal of Electronic Measurement and Instrumentation, vol. 35, no. 1, pp. 82-87, 2021.
[19]
O. Abu Arqub, Z. Abo-Hammour, S. Momani, and N. Shawagfeh, “Solving singular two-point boundary value problems using continuous genetic algorithm,” Abstract and Applied Analysis, vol. 2012, article no. 205391, 2012.
https://doi.org/10.1155/2012/205391
[20]
M. Kumar, J. Aggarwal, A. Rani, T. Stephan, A. Shankar, and S. Mirjalili, “Secure video communication using firefly optimization and visual cryptography,” Artificial Intelligence Review, vol. 55, pp. 2997-3017, 2022.
https://doi.org/10.1007/s10462-021-10070-8
[21]
T. Zhong, D. Li, J. Wang, J. Xu, Z. An, and Y. Zhu, “Fusion learning for sEMG recognition of multiple upper-limb rehabilitation movements,” Sensors, vol. 21, no. 16, article no. 5385, 2021.
https://doi.org/10.3390/s21165385
[22]
H. Kohli, J. Agarwal, and M. Kumar, “An improved method for text detection using Adam optimization algorithm,” Global Transitions Proceedings, vol. 3, no. 1, pp. 230-234, 2022.
[23]
W. Liu and C. Wang, “Gesture recognition and recovery glove control based on CNN and sEMG,” Journal of Jilin University (Information Science Edition), vol. 38, no. 4, pp. 419-427, 2020.
[24]
Y. Li, X. Jiang, K. Zou, and X. Yuan, “Multi-stream convolutional myoelectric gesture recognition networks fusing attentional mechanisms,” Application Research of Computers, vol. 38, no. 11, pp. 3258-3263, 2021.
[25]
Z. Zhang, C. He, and K. Yang, “A novel surface electromyographic signal-based hand gesture prediction using a recurrent neural network,” Sensors, vol. 20, no. 14, article no. 3994, 2020.
https://doi.org/10.3390/s20143994
[26]
A. Sohane and R. Agarwal, “A single platform for classification and prediction using a hybrid bioinspired and deep neural network (PSO-LSTM),” MAPAN, vol. 37, no. 1, pp. 47-58, 2022.
[27]
W. Geng, Y. Du, W. Jin, W. Wei, Y. Hu, and J. Li, “Gesture recognition by instantaneous surface EMG images,” Scientific Reports, vol. 6, no. 1, article no. 36571, 2016.
https://doi.org/10.1038/srep36571
[28]
W. Jin, Y. Li, and S. Lin, “Design of a novel non-invasive wearable device for array surface electromyogram,” International Journal of Information and Electronics Engineering, vol. 6, no. 2, pp. 139-142, 2016.
[29]
M. Atzori, A. Gijsberts, C. Castellini, B. Caputo, A. G. M. Hager, S. Elsig, G. Giatsidis, F. Bassetto, and H. Muller, “Electromyography data for non-invasive naturally-controlled robotic hand prostheses,” Scientific Data, vol. 1, article no. 140053, 2014.
https://doi.org/10.1038/sdata.2014.53
[30]
Y. Lu, H. Wang, F. Hu, B. Zhou, and H. Xi, “Effective recognition of human lower limb jump locomotion phases based on multi-sensor information fusion and machine learning,” Medical & Biological Engineering & Computing, vol. 59, pp. 883-899, 2021.
[32]
G. Liu, L. Wang, and J. Wang, “A novel energy-motion model for continuous sEMG decoding: from muscle energy to motor pattern,” Journal of Neural Engineering, vol. 18, no. 1, article no. 016019, 2021.
https://doi.org/10.1088/1741-2552/abbece
[33]
X. Zhang, X. Chen, Y. Li, V. Lantz, K. Wang, and J. Yang, “A framework for hand gesture recognition based on accelerometer and EMG sensors,” IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, vol. 41, no. 6, pp. 1064-1076, 2011.
[34]
Z. Zhang, C. He, and K. Yang, “A novel surface electromyographic signal-based hand gesture prediction using a recurrent neural network,” Sensors, vol. 20, no. 14, article no. 3994, 2020.
https://doi.org/10.3390/s20143994
[35]
Z. Wang, Y. Xun, G. Bao, F. Gao, Q. Yang, and L. Zhang, “Classification and identification of multi-pattern of hand actions,” China Mechanical Engineering, vol. 30, no. 12, pp. 1474-1479, 2019.
[37]
J. Song, A. Zhu, Y. Tu, H. Huang, M. A. Arif, Z. Shen, X. Zhang, and G. Cao, “Effects of different feature parameters of sEMG on human motion pattern recognition using multilayer perceptrons and LSTM neural networks,” Applied Sciences, vol. 10, no. 10, article no. 3358, 2020.
https://doi.org/10.3390/app10103358
[38]
B. Hudgins, P. Parker, and R. N. Scott, “A new strategy for multifunction myoelectric control,” IEEE Transactions on Biomedical Engineering, vol. 40, no. 1, pp. 82-94, 1993.
[39]
S. Yu, Y. Chen, Q. Cai, K. Ma, H. Zheng, and L. Xie, “A novel quantitative spasticity evaluation method based on surface electromyogram signals and adaptive neuro fuzzy inference system,” Frontiers in Neuroscience, vol. 14, article no. 462, 2020.
https://doi.org/10.3389/fnins.2020.00462
[40]
S. Pizzolato, L. Tagliapietra, M. Cognolato, M. Reggiani, H. Muller, and M. Atzori, “Comparison of six electromyography acquisition setups on hand movement classification tasks,” PLoS One, vol. 12, no. 10, article no. e0186132, 2017.
https://doi.org/10.1371/journal.pone.0186132
[41]
A. Phinyomark, P. Phukpattaranont, and C. Limsakul, “Feature reduction and selection for EMG signal classification,” Expert Systems with Applications, vol. 39, no. 8, pp. 7420-7431, 2012.
[42]
Z. Li, H. Zhao, J. Mei, and H. Shen, “Method of vibration signal eigenvalue selection by correlation coefficient diagram,” Modern Electronics Technique, vol. 43, no. 15, pp. 29-32+36, 2020.
[43]
O. A. Arqub, “Adaptation of reproducing kernel algorithm for solving fuzzy Fredholm–Volterra integrodifferential equations,” Neural Computing and Applications, vol. 28, pp. 1591-1610, 2017.
[44]
S. Zhang, J. N. Lv, Z. Jiang, and L. Zhang, “Study of the correlation coefficients in mathematical statistics,” Mathematics in Practice and Theory, vol. 39, no. 19, pp. 102-107, 2009.
[45]
Y. Z. Chen, “Motion pattern recognition of sEMG signals based upper limb self-rehabilitation training,” Shandong University, Jinan, China, 2015.
[46]
D. N. Reshef, Y. A. Reshef, H. K. Finucane, S. R. Grossman, G. McVean, P. J. Turnbaugh, E. S. Lander, M. Mitzenmacher, and P. C. Sabeti, “Detecting novel associations in large data sets,” Science, vol. 334, no. 6062, pp. 1518-1524, 2011.
[47]
O. A. Arqub and Z. Abo-Hammour, “Numerical solution of systems of second-order boundary value problems using continuous genetic algorithm,” Information Sciences, vol. 279, pp. 396-415, 2014.
[48]
Z. Wang, H. Song, S. Li, and X. Zhou, “Process monitoring based on logarithmic transformation and maximal information coefficient-PCA,” Science Technology and Engineering, vol. 17, no. 16, pp. 259-265, 2017.
[49]
W. Peng, J. Wang, and S. Yin, “Short-term load forecasting model based on attention-LSTM in electricity market,” Power System Technology, vol. 43, no. 5, pp. 1745-1751, 2019.
[50]
M. Zhu and X. Lu, “Human action recognition algorithm based on Bi-LSTM Attention model,” Laser & Optoelectronics Progress, vol. 56, no. 15, pp. 153-161, 2019.
[51]
Y. Hu, Y. Wong, W. Wei, Y. Du, M. Kankanhalli, and W. Geng, “A novel attention-based hybrid CNN-RNN architecture for sEMG-based gesture recognition,” PLoS One, vol. 13, no. 10, article no. e0206049, 2018.
https://doi.org/10.1371/journal.pone.0206049
[52]
J. Chen, S. Bi, G. Zhang, and G. Cao, “High-density surface EMG-based gesture recognition using a 3D convolutional neural network,” Sensors, vol. 20, no. 4, article no. 1201, 2020.
https://doi.org/10.3390/s20041201
[53]
S. Padhy, “A tensor-based approach using multilinear SVD for hand gesture recognition from SEMG signals,” IEEE Sensors Journal, vol. 21, no. 5, pp. 6634-6642, 2020.
About this article
- Received: 29 April 2022
- Accepted: 3 July 2022
- Published: 15 May 2023