NCU Institutional Repository — theses, past exams, journal articles, and research projects: Item 987654321/98178


    Please use this identifier to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98178


    Title: A Study on Predicting Erhu Ornamentation Using Deep Learning and Original Recording Audio Data
    Authors: ZHANG, XIANG-ZHOU (張祥洲)
    Contributors: Department of Computer Science and Information Engineering
    Keywords: Pulse-code modulation (PCM); Erhu ornamentation; TimesFM
    Date: 2025-06-27
    Issue Date: 2025-10-17 12:27:30 (UTC+8)
    Publisher: National Central University
    Abstract: This study investigates the audio prediction capabilities of the deep learning model TimesFM in the context of ornamentation techniques in Erhu performance. Current research in music generation focuses primarily on melody generation, accompaniment synthesis, style transfer, music structure modeling, and vocoder technologies, while detailed modeling of performance-level ornamentations, such as glissando, vibrato, and appoggiatura, has received comparatively little attention. To address this gap, this study constructed a dataset covering five common types of Erhu ornamentation. The recorded audio was converted into PCM data, compressed into time series of suitable length via different sampling rates, and then segmented into monophonic units to form structured input-output pairs. The model was trained to predict and generate the ornamented audio expression corresponding to a specified base tone given as input. The TimesFM model was employed for both training and inference, using the median quantile (quantile = 0.5) as the primary output and Huber loss as the loss function. Experimental results indicate that the generated audio exhibits recognizable timbral coherence and stylistic features of the ornamentations. This study demonstrates the feasibility of modeling ornamentations of traditional instruments with time series built from raw PCM audio data, and extends the application potential of music generation technologies to capturing performance-level details.
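    The preprocessing pipeline described in the abstract (recorded audio → PCM samples → downsampled time series → fixed-length single-note windows, evaluated with a Huber loss) can be sketched as follows. This is a minimal illustration, not the thesis code: the decimation factor, the 1024-sample window length, and the synthetic sine-tone input are assumptions standing in for the actual recordings and hyperparameters.

    ```python
    import numpy as np

    def downsample(pcm: np.ndarray, factor: int) -> np.ndarray:
        """Reduce the sample rate by keeping every `factor`-th sample (plain decimation)."""
        return pcm[::factor]

    def segment_notes(series: np.ndarray, note_len: int) -> np.ndarray:
        """Split a time series into fixed-length single-note windows, dropping the tail."""
        n = len(series) // note_len
        return series[: n * note_len].reshape(n, note_len)

    def huber_loss(pred: np.ndarray, target: np.ndarray, delta: float = 1.0) -> float:
        """Huber loss: quadratic for small errors, linear beyond `delta`."""
        err = pred - target
        small = np.abs(err) <= delta
        return float(np.mean(np.where(small, 0.5 * err**2,
                                      delta * (np.abs(err) - 0.5 * delta))))

    # Synthetic stand-in for one recorded Erhu tone: 1 s of a 440 Hz sine
    # at 44.1 kHz, quantized to the 16-bit PCM range.
    sr = 44100
    t = np.arange(sr) / sr
    pcm = (np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)

    # Normalize, downsample to a shorter series, and cut into note windows
    # (input-output pairs for the forecaster would be drawn from such windows).
    series = downsample(pcm.astype(np.float32) / 32768.0, factor=4)
    notes = segment_notes(series, note_len=1024)  # hypothetical window length
    loss = huber_loss(notes[0], notes[1])
    print(notes.shape, loss >= 0.0)
    ```

    In the study itself, TimesFM consumes such windows as univariate time series and its median-quantile (0.5) forecast is taken as the predicted ornamented waveform; the sketch above only mirrors the data shaping and the loss, not the model call.
    
    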
    Appears in Collections:[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation

    Files in This Item:

    File: index.html (0 KB, HTML) — View/Open


    All items in NCUIR are protected by copyright, with all rights reserved.

