
Title page for etd-0902114-131251


URN etd-0902114-131251 Statistics This thesis has been viewed 1333 times and downloaded 639 times.
Author Chung-yu Kuo
Author's Email Address Not public.
Department Computer Science and Engineering
Year 2013 Semester 2
Degree Master Type of Document Master's Thesis
Language zh-TW.Big5 Chinese Page Count 37
Title A Cloud Speech Emotion Recognition System with Handheld Device as the Front End
Keyword
  • speech emotion recognition
  • GMM
  • MFCC
  • handheld device
  • Cloud computing
Abstract It is unavoidable to interact with other people in our daily life. In these interactions, people exchange not only verbal content but also emotions. Emotion may be expressed through facial expressions, body gestures, or the speech signal itself. A speech emotion recognition system recognizes the emotion of a speaker from the speech signal he or she produces. The results of research done in recent years show that the recognition rate of such systems is high enough for practical use. However, a high recognition rate comes with high computational complexity: if we want to implement an emotion recognition system on a handheld device, the resources of the device are generally not sufficient for high-recognition-rate algorithms.
    In this research, we propose a cloud-based speech emotion recognition system. The voice signal captured by the handheld device is transmitted to the cloud server over the Internet. The speech emotion recognition engine on the server then recognizes the emotion embedded in the speech signal and sends the recognition result back to the device, which displays it on its screen.
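    A minimal sketch of this device-to-server round trip, assuming a hypothetical HTTP endpoint and a JSON response field named "emotion" (the thesis does not specify the transport protocol):

        import requests

        SERVER_URL = "http://example.org/emotion/recognize"  # hypothetical endpoint

        def recognize_emotion(wav_path: str) -> str:
            """Upload a captured voice clip to the cloud server and return
            the emotion label from its response."""
            with open(wav_path, "rb") as f:
                # The server runs the recognition engine on the uploaded audio.
                resp = requests.post(SERVER_URL, files={"speech": f}, timeout=10)
            resp.raise_for_status()
            return resp.json()["emotion"]  # assumed response schema

        if __name__ == "__main__":
            # On the handheld device, this label would be shown on the screen.
            print(recognize_emotion("utterance.wav"))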
    In the proposed system, we implement a cloud speech emotion recognition system with the handheld device as the front end. The emotion recognition engine on the cloud server uses Mel-Frequency Cepstral Coefficients (MFCC) as the speech feature and a Gaussian Mixture Model (GMM) as the classifier. The handheld device captures the voice signal, transmits it to the server, receives the result from the server, and displays the result on the screen, so the resources required of the front-end device are modest. The experimental results show that six-emotion recognition using the mean MFCC as the feature parameter achieves a recognition rate of 48.8%.
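    A minimal sketch of such an engine on the server side, using librosa for MFCC extraction and scikit-learn for the GMMs; the emotion label set, MFCC dimension, and mixture component count below are assumptions, not values taken from the thesis:

        import numpy as np
        import librosa
        from sklearn.mixture import GaussianMixture

        # Assumed label set; the thesis states only that six emotions are recognized.
        EMOTIONS = ["anger", "happiness", "sadness", "fear", "surprise", "neutral"]

        def mean_mfcc(wav_path: str, n_mfcc: int = 13) -> np.ndarray:
            """Compute the mean MFCC vector of an utterance (the feature parameter)."""
            signal, sr = librosa.load(wav_path, sr=None)
            mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
            return mfcc.mean(axis=1)

        def train_models(clips_by_emotion: dict) -> dict:
            """Fit one GMM per emotion on the mean-MFCC vectors of its training clips."""
            models = {}
            for emotion, paths in clips_by_emotion.items():
                feats = np.stack([mean_mfcc(p) for p in paths])
                models[emotion] = GaussianMixture(n_components=4).fit(feats)  # count assumed
            return models

        def classify(models: dict, wav_path: str) -> str:
            """Pick the emotion whose GMM assigns the utterance the highest log-likelihood."""
            feat = mean_mfcc(wav_path).reshape(1, -1)
            return max(models, key=lambda e: models[e].score(feat))

    Scoring a single mean-MFCC vector per utterance keeps the per-request cost on the server small, which matches the thin-client design described above.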
Advisor Committee
  • Tsang-Long Pao - advisor
  • Ching-Kuen Lee - co-chair
  • Yu Tsao - co-chair
Files indicate access worldwide
Date of Defense 2014-07-29 Date of Submission 2014-09-02

