Home > Online Resources > Tatung University Electronic Theses and Dissertations System

Title page for etd-0719108-211656


URN etd-0719108-211656 Statistics This thesis has been viewed 4508 times and downloaded 1581 times.
Author Yu-te Chen
Author's Email Address Not public.
Department Computer Science and Engineering
Year 2007 Semester 2
Degree Ph.D. Type of Document Doctoral Dissertation
Language English Page Count 152
Title A STUDY OF EMOTION RECOGNITION ON MANDARIN SPEECH AND ITS PERFORMANCE EVALUATION
Keyword
  • performance evaluation
  • Mandarin emotional speech recognition
  • WD-KNN
Abstract It is said that technology comes from humanity. What is humanity? The very definition of humanity is emotion. Emotion is the basis for all human expression and the underlying theme behind everything that is done, said, thought, or imagined. If computers can perceive and respond to human emotion, human-computer interaction will become more natural.
    In the past, classifiers were adopted independently and tested on emotional speech corpora that differed in language, size, number of emotional states, and recording method. This makes it difficult to compare and evaluate the performance of those classifiers. In this thesis, we propose a weighted discrete k-nearest neighbor (WD-KNN) classification algorithm and compare it with several other classification methods by applying all of them to the same Mandarin emotional speech corpus.
    We first implemented a baseline system to determine the parameter k in the KNN-based classifiers and to select the best feature set. Experiments with different values of k in the KNN classifier showed that the best accuracy, 70.7%, is achieved when k is set to 10. To keep the comparison fair, k is set to 10 in all KNN-based classifiers throughout this thesis. The best feature set includes LPC, LPCC, and MFCC. Compared to the performance before feature selection, the accuracy improves by 2.1% when the number of feature types is reduced from 13 to 3.
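    The search for k described above can be illustrated with a small sketch. It is not the thesis's code: the abstract does not say how accuracy was estimated, so leave-one-out evaluation, Euclidean distance, the toy data, and the names `knn_predict`, `leave_one_out_accuracy`, and `best_k` are all assumptions made for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training points closest to `query`."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def leave_one_out_accuracy(data, k):
    """Classify each sample using all the other samples (assumed protocol)."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, features, k) == label:
            correct += 1
    return correct / len(data)

def best_k(data, candidates):
    """Return the candidate k with the highest accuracy, plus all scores."""
    scores = {k: leave_one_out_accuracy(data, k) for k in candidates}
    return max(scores, key=scores.get), scores

# Hypothetical 2-D feature vectors standing in for LPC/LPCC/MFCC features.
toy = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"), ((0.3, 0.3), "a"),
    ((1.0, 1.0), "b"), ((1.1, 0.9), "b"), ((0.9, 1.2), "b"), ((1.2, 1.1), "b"),
]
chosen_k, scores = best_k(toy, [1, 3, 7])
print(chosen_k, scores)
```

    On this toy set a very large k fails because every vote is dominated by the other class once a sample is held out, which is the kind of trade-off a sweep over k is meant to expose.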
    Next, we compared different weighting schemes for KNN-based classifiers, including the traditional k-nearest neighbor (KNN) classifier, weighted KNN (WKNN), KNN classification using Categorical Average Patterns (WCAP), and WD-KNN. Compared to the baseline performance, the largest accuracy improvements of 4.9%, 2.8%, and 12.3% can be achieved with these classifiers. The highest recognition rate, 81.4%, is obtained with the WD-KNN classifier weighted by a Fibonacci sequence.
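    The abstract does not define WD-KNN, so the following is only a sketch of one plausible reading: for each class, take the k training samples nearest to the query, sum their distances with decreasing weights, and choose the class with the smallest weighted sum. The Fibonacci weights (largest weight on the nearest sample), the Euclidean distance, and the names `fibonacci_weights` and `wd_knn_predict` are assumptions for illustration and may differ from the thesis.

```python
import math
from collections import defaultdict

def fibonacci_weights(k):
    """Return k Fibonacci numbers in decreasing order, e.g. k=5 -> [5, 3, 2, 1, 1]."""
    weights = [1, 1]
    while len(weights) < k:
        weights.append(weights[-1] + weights[-2])
    return weights[:k][::-1]

def wd_knn_predict(train, query, k):
    """Pick the class whose k nearest samples have the smallest weighted distance sum.

    Assumed reading of WD-KNN, not a confirmed reproduction of the thesis.
    """
    distances_by_class = defaultdict(list)
    for features, label in train:
        distances_by_class[label].append(math.dist(features, query))

    weights = fibonacci_weights(k)
    scores = {}
    for label, distances in distances_by_class.items():
        if len(distances) < k:
            raise ValueError(f"class {label!r} has fewer than k={k} samples")
        nearest = sorted(distances)[:k]
        # The nearest sample receives the largest weight.
        scores[label] = sum(w * d for w, d in zip(weights, nearest))
    return min(scores, key=scores.get), scores

# Hypothetical 2-D features for two classes.
train = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"),
]
label, scores = wd_knn_predict(train, (0.5, 0.5), k=3)
print(label, scores)
```

    Because every class contributes exactly k distances, the decision does not depend on how many training samples each class has, unlike a plain majority vote.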
    Then we evaluated the performance of several classifiers, including KNN, MKNN, WKNN, LDA, QDA, GMM, HMM, SVM, BPNN, and the proposed WD-KNN, for detecting emotion from Mandarin speech. The experimental results and McNemar’s test show that the proposed WD-KNN classifier achieves the best accuracy for 5-class emotion recognition and outperforms the other classification techniques. To verify the advantage of the proposed method, we also applied these classifiers to another Mandarin expressive speech corpus consisting of two emotions and 2000 utterances. The experimental results again show that the proposed WD-KNN outperforms the others.
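    McNemar's test compares two classifiers on the same test samples by looking only at the samples on which they disagree. The sketch below uses the common chi-square form with continuity correction; whether the thesis used this variant or an exact binomial form is not stated in the abstract, and the function name `mcnemar` and the example labels are hypothetical.

```python
import math

def mcnemar(y_true, pred_a, pred_b):
    """Chi-square McNemar test with continuity correction (1 degree of freedom).

    b = samples classifier A gets right and B gets wrong,
    c = samples classifier B gets right and A gets wrong.
    Returns (statistic, p_value).
    """
    b = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a == t and o != t)
    c = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a != t and o == t)
    if b + c == 0:
        return 0.0, 1.0  # the classifiers never disagree
    statistic = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical outcome: A alone is right on 12 samples, B alone on 2.
y_true = [0] * 20
pred_a = [0] * 12 + [1] * 2 + [0] * 6
pred_b = [1] * 12 + [0] * 2 + [0] * 6
statistic, p_value = mcnemar(y_true, pred_a, pred_b)
print(statistic, p_value)
```

    A small p-value indicates that the difference in error patterns between the two classifiers is unlikely to be due to chance alone.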
    Finally, we implemented an emotion radar chart in our emotion recognition system. The chart is based on WD-KNN and presents the intensity of each emotion component in the speech. Such a system can be further used in speech training, especially to help hearing-impaired people learn to express emotions in speech more naturally.
    Advisory Committee
  • Tsang-long Pao - advisor
  • Chung-chun Kung - co-chair
  • Hsuan-shih Lee - co-chair
  • Kun-huang Huarng - co-chair
  • Liang-teh Lee - co-chair
  • Yo-ping Huang - co-chair
    Files Available for access worldwide
    Date of Defense 2008-07-04 Date of Submission 2008-07-19

