A Study on Enhancing the Performance of Infant Cry Classification Models Using Generative AI

Please use this permanent URL to cite or link to this document: https://ir.lib.ncu.edu.tw/handle/987654321/94880

Title:	A Study on Enhancing the Performance of Infant Cry Classification Models Using Generative AI (original title: 透過生成式 AI 增強嬰兒哭聲分類模型效能之研究)
Author:	Yang, Chien-Yu (楊千郁)
Contributor:	Department of Business Administration
Keywords:	Infant cry; deep learning; Generative Adversarial Networks; Long Short-Term Memory
Date:	2024-07-15
Upload time:	2024-10-09 15:35:24 (UTC+8)
Publisher:	National Central University
Abstract:	Infant cries, like adult speech, are how infants express their needs and feelings, allowing caregivers to receive those cues and respond with appropriate care. However, no reasonably complete dataset of infant cries currently exists, so deep learning models that predict an infant's need from its cry perform poorly. This study explores whether Generative Adversarial Networks (GANs) can improve infant cry classification: we propose using a GAN to generate additional cry samples that augment the training dataset and thereby raise the classifier's predictive performance. We first collected real infant cry samples for five needs (anger, hunger, insecurity, urination or defecation, and sleepiness), then used the WaveGAN generative model to generate additional samples for each need category in turn. The generated samples were added to the original dataset to form a new augmented dataset, and Long Short-Term Memory (LSTM) deep learning models were trained separately on the augmented dataset and on the original dataset to classify the need behind each cry. The experimental results show that, on the test set, the model trained on the augmented dataset significantly outperforms the model trained only on the original dataset. This indicates that GANs can effectively augment the training dataset and improve the model's generalization ability and accuracy. We conclude that generating additional infant cry samples with GANs is an effective way to improve the predictive performance of infant cry classification models, which contributes to raising the standard and efficiency of infant healthcare.
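The comparison described in the abstract (a baseline training set versus the same set augmented with per-class generated samples) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: `fake_generator` is a hypothetical stand-in for a per-class trained WaveGAN generator, the label strings, sample counts, and 80/20 split are invented, and holding out a test set of real recordings only is a common practice that the abstract does not spell out. The LSTM training step is omitted.

```python
import random
from collections import Counter

# Label names are illustrative renderings of the five needs in the abstract.
LABELS = ["anger", "hunger", "insecurity", "diaper", "sleepy"]

def fake_generator(label, n, length=16, seed=0):
    """Hypothetical stand-in for a WaveGAN generator trained on one class.

    Returns n random waveforms (lists of floats in [-1, 1]).
    """
    rng = random.Random(f"{label}-{seed}")
    return [[rng.uniform(-1.0, 1.0) for _ in range(length)] for _ in range(n)]

def split_real(samples, test_fraction=0.2, seed=0):
    """Shuffle real (waveform, label) pairs and hold out a test set."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def augment(train, per_class):
    """Add per_class generated samples for every label to the training set only."""
    extra = [(wave, label)
             for label in LABELS
             for wave in fake_generator(label, per_class)]
    return train + extra

# Toy "real" dataset: 10 clips per class (assumed numbers, not the thesis's).
real = [(wave, label)
        for label in LABELS
        for wave in fake_generator(label, 10, seed=1)]

train, test = split_real(real)                 # baseline training set and test set
augmented_train = augment(train, per_class=5)  # training set for the augmented model

# Number of generated samples added per class.
added = (Counter(label for _, label in augmented_train)
         - Counter(label for _, label in train))
```

Under these assumptions, both classifiers would be evaluated on the same `test` list, so any difference in accuracy can be attributed to the generated samples added to the training data.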
Appears in collections:	[Graduate Institute of Business Administration] Theses and Dissertations

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	275

All items in NCUIR are protected by the original copyright.
