Determination of optimum threshold values for EMG time domain features; a multi-dataset investigation


Objective. For over two decades, Hudgins' set of time domain features has been extensively applied to the classification of hand motions. The calculation of the slope sign change and zero crossing features uses a threshold to attenuate the effect of background noise. However, there is no consensus on the optimum threshold value. In this study, we investigate for the first time the effect of threshold selection on the feature space and classification accuracy using multiple datasets. Approach. In the first part, four datasets were used, and classification error (CE), separability index, scatter matrix separability criterion, and cardinality of the features were used as performance measures. In the second part, data from eight classes were collected from eight able-bodied subjects on two separate days with two days in between. The threshold for each feature was computed as a factor R (from 0 to 4 in steps of 0.01, i.e. R = 0:0.01:4) times the average root mean square of the data during rest. For each day, we quantified the CE for R = 0 (CEr0) and the minimum error (CEbest). Moreover, a cross-day threshold validation was applied where, for example, the CE of day two (CEodt) is computed using the optimum threshold from day one, and vice versa. Finally, we quantified the effect of the threshold when using training data from one day and test data from the other. Main results. All performance metrics generally degraded with increasing threshold values. On average, CEbest (5.26 ± 2.42%) was significantly better than CEr0 (7.51 ± 2.41%, P = 0.018) and CEodt (7.50 ± 2.50%, P = 0.021). During the two-fold validation between days, CEbest performed similarly to CEr0. Interestingly, when the threshold values optimized per subject on day one and day two, respectively, were used for cross-day classification, the performance decreased. Significance. We have demonstrated that the threshold value has a strong impact on the feature space and that an optimum threshold can be quantified. However, this optimum threshold is highly data and subject driven and thus does not generalize well. There is strong evidence that R = 0 provides a good trade-off between system performance and generalization. These findings are important for the practical use of pattern recognition based myoelectric control.
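
As a minimal sketch of the thresholding described under Approach, the Python below computes a threshold as R times the average rest-period root mean square and applies it to thresholded zero crossing and slope sign change counts. The feature definitions follow a commonly used formulation of Hudgins' features, which the abstract does not spell out, so they are an assumption; the function names and the toy signal are hypothetical and are not the authors' implementation or data.

```python
import math


def rms(window):
    """Root mean square of one window of samples."""
    return math.sqrt(sum(v * v for v in window) / len(window))


def threshold_from_rest(rest_windows, r):
    """Threshold = R times the average RMS over the rest windows."""
    mean_rms = sum(rms(w) for w in rest_windows) / len(rest_windows)
    return r * mean_rms


def zero_crossings(x, thr):
    """Count sign changes whose amplitude step is at least thr (assumed definition)."""
    count = 0
    for a, b in zip(x, x[1:]):
        if a * b < 0 and abs(a - b) >= thr:
            count += 1
    return count


def slope_sign_changes(x, thr):
    """Count local extrema where at least one adjacent step is >= thr (assumed definition)."""
    count = 0
    for prev, cur, nxt in zip(x, x[1:], x[2:]):
        d1 = cur - prev
        d2 = cur - nxt
        if d1 * d2 > 0 and (abs(d1) >= thr or abs(d2) >= thr):
            count += 1
    return count


# Toy data, purely illustrative (not from any of the datasets in the study).
rest_windows = [[0.1, -0.1, 0.1, -0.1], [0.3, -0.3, 0.3, -0.3]]
signal = [0.05, -0.05, 1.0, -1.0, 0.5, 0.6]

# The sweep R = 0:0.01:4 corresponds to 401 candidate factors.
r_values = [i / 100 for i in range(401)]
thr = threshold_from_rest(rest_windows, r=1.0)

zc_r0 = zero_crossings(signal, 0.0)   # R = 0: every sign change counts
zc_r1 = zero_crossings(signal, thr)   # R = 1: small crossings are ignored
ssc_r1 = slope_sign_changes(signal, thr)
```

With R = 0 the small crossing between the first two samples is counted, whereas with R = 1 it is discarded. Sweeping `r_values` and recomputing the features and the classification error for each factor is one way the per-day CEbest could be located.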