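Which approach is better for building a speech emotion detection and recognition system: Hidden Markov Models or a deep learning (RNN-LSTM) approach? I have to build an SER system and I am torn between the two. If there is a model better than both, please let me know.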
Answer 0 (score: 1)
HMM-based and RNN-LSTM-based solutions are not considered highly accurate for SER. I believe the top-ranking algorithm to date is one based on Deep Retinal Convolution Neural Networks (DRCNNs). See "Speech emotion recognition using Deep Retinal Convolution Neural Networks" by Niu, Yafeng; Zou, Dongsheng; Niu, Yadong; He, Zhongshi; Tan, Hua, published in July 2017. The authors report an average accuracy of over 99% on the IEMOCAP, EMO-DB, and SAVEE databases.