dc.description.abstract | In recent years, speech emotion recognition has become an interesting and challenging area of research in human behavior analysis. The goal of this field is to classify people's emotional states based on their speech. Current research focuses on improving the effectiveness of automatic speech emotion classifiers for practical applications, e.g., in telecommunication services, where identifying positive emotions (e.g., happiness, surprise) and negative emotions (e.g., sadness, anger, disgust, and fear) can provide a large amount of valid data for platform users and customers.
In this paper, the task of identifying positive and negative emotions in human voice data is investigated using deep learning techniques. Five open emotional speech datasets and four self-generated speech datasets are used to train multi-level models for positive and negative emotion recognition, which yield good results on both positive and negative emotional speech data. In addition, a pre-trained model (a seven-class emotion recognition model) was used to initialize the network parameters, and its performance was compared with that of a network whose parameters were randomly initialized and trained from scratch, on classification across three groups of speech data. According to the experimental results, the best model for both tasks is the pre-trained model, which significantly outperforms the model trained from scratch. | en_US |