NCU Institutional Repository (中大機構典藏): Item 987654321/90827


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/90827


    Title: Deep Neural Networks for Audio, Speech, and Image Applications (深度神經網路於音訊、語音和影像之研究)
    Authors: An, Dang Thi Thuy (鄧氏陲殷)
    Contributors: Department of Computer Science and Information Engineering
    Keywords: EMix; Speech Emotion Recognition; Acoustic Scene Classification; MixStyleFreq; image retrieval; content based image retrieval; beauty product image retrieval
    Date: 2023-02-23
    Issue Date: 2023-05-09 18:07:20 (UTC+8)
    Publisher: National Central University (國立中央大學)
    Abstract: This work aims to advance several problems in artificial intelligence: speech emotion recognition (SER), acoustic scene classification (ASC), and content-based image retrieval (CBIR). These problems come from different domains and have many practical applications. For example, SER can be used in human-machine interaction and mental healthcare, while ASC helps a system understand its surrounding environment, which is useful for robot navigation, context awareness, and surveillance. CBIR identifies relevant images in a database given a query image and can be used in various types of image search. In this thesis, we propose approaches based on deep neural networks (DNNs) to address these problems.
    Specifically, we develop a simple yet effective data augmentation (DA) method for the SER problem. SER is difficult because data are scarce and labels are ambiguous, so DNN models are prone to overfitting, which leads to poor generalization on test data. Our DA method creates new data samples that may be noisier or less ambiguous than the original ones, and in our experiments on two public datasets it outperforms other DA methods. In ASC, we focus on the performance degradation that occurs when DNN models are used in a cross-device setting, where the training and test data are recorded with different devices. We propose an ASC system with two DA methods: MixStyleFreq, which reduces domain gaps, and spectrum correction, which mitigates the bias of DNNs toward dominant devices. Compared with other DA methods, these methods significantly improve generalization performance and achieve competitive results. Finally, we develop a fully end-to-end DNN model for the beauty product image retrieval problem in CBIR. The model requires no manual feature aggregation or post-processing, and experimental results on the Perfect-500K dataset show that it is effective and achieves high retrieval accuracy.
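    For readers unfamiliar with CBIR, the retrieval step described in the abstract (ranking database images against a query image) can be illustrated with a minimal sketch. This is a generic nearest-neighbor ranking by cosine similarity, not the thesis's end-to-end model. The feature vectors, the image identifiers, and the function names `cosine_similarity` and `retrieve` are all hypothetical and chosen only for illustration; in practice the vectors would come from a trained DNN.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve(query_vec, database, top_k=3):
    """Return the top_k (image_id, score) pairs, most similar first."""
    scored = [
        (image_id, cosine_similarity(query_vec, vec))
        for image_id, vec in database.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy database: hypothetical 3-dimensional embeddings standing in for
# features that a trained network would produce.
database = {
    "lipstick_01": [0.9, 0.1, 0.0],
    "lipstick_02": [0.8, 0.2, 0.1],
    "shampoo_01": [0.0, 0.2, 0.9],
}

results = retrieve([1.0, 0.0, 0.0], database, top_k=2)
for image_id, score in results:
    print(image_id, round(score, 3))
```

    In this toy example the two lipstick images rank above the shampoo image because their vectors point in nearly the same direction as the query. A linear scan like this is only practical for small collections; a dataset on the scale of Perfect-500K would normally call for an approximate nearest-neighbor index.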
    Appears in Collections:[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation

    Files in This Item:

    File          Description    Size    Format
    index.html                   0Kb     HTML      67


    All items in NCUIR are protected by copyright, with all rights reserved.
