Abstract: | Infant cries, like adult speech, are how infants express their needs and feelings, allowing caregivers to receive these cues and respond with appropriate care. However, no reasonably comprehensive dataset of infant cries currently exists, so deep learning models that predict an infant's needs from cry sounds perform poorly. This study explores the use of Generative Adversarial Networks (GANs) to improve the performance of infant cry classification models. To address the shortage of data, we propose using a GAN to generate additional infant cry samples that augment the training dataset and thereby improve the classifier's predictive performance. We first collected real infant cry samples for five needs (anger, hunger, insecurity, urination or defecation, and sleepiness). We then used WaveGAN to generate additional cry samples for each need category in turn, and combined the generated samples with the original dataset to form an augmented dataset. Models were trained separately on the augmented dataset and on the original dataset alone, and their performance was compared.
We employed a Long Short-Term Memory (LSTM) deep learning model to classify the needs expressed by infant cries. The experimental results show that, on the test set, the model trained on the augmented dataset, which incorporates the generated data, significantly outperforms the model trained on the original dataset alone. This indicates that GANs can effectively augment the training dataset and improve the model's generalization ability and accuracy. We therefore conclude that generating additional infant cry samples with a GAN is an effective way to improve the predictive performance of infant cry classification models, which is an important contribution toward raising the standard and efficiency of infant health care. |