PT Unknown
AU Lei Kang
   Lichao Zhang
   Dazhi Jiang
TI Learning Robust Self-Attention Features for Speech Emotion Recognition with Label-Adaptive Mixup
BT IEEE International Conference on Acoustics, Speech and Signal Processing
PY 2023
DI 10.1109/ICASSP49357.2023.10095611
AB Speech Emotion Recognition (SER) aims to recognize human emotions in natural verbal interaction with machines, a problem considered challenging because human emotions are ambiguous. Despite recent progress in SER, state-of-the-art models still struggle to achieve satisfactory performance. We propose a self-attention-based method that combines label-adaptive mixup with center loss. By adapting label probabilities in mixup and fitting center loss to the mixup training scheme, the proposed method achieves performance superior to state-of-the-art methods.
ER