Combining Deep Supervised Learning and Reinforcement Learning for Music Melody Generation


Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/93043

Title:	Combining Deep Supervised Learning and Reinforcement Learning for Music Melody Generation
Author:	Huang, Chien-Hao (黃千豪)
Contributor:	Department of Computer Science and Information Engineering
Keywords:	Artificial Intelligence; Deep Learning; Music Generation
Date:	2023-07-11
Upload time:	2024-09-19 16:39:15 (UTC+8)
Publisher:	National Central University
Abstract:	In this work, we present a symbolic music melody generation method that combines deep supervised learning and reinforcement learning. When deep learning is applied to symbolic music modeling, music clips can be processed as sequences of symbols over time, so sequence models capable of modeling temporal information are usually used, as in other sequence modeling tasks such as text modeling or natural language processing. In this kind of supervised approach, a deep neural network can automatically capture musical features from an existing dataset. However, compositions by human composers usually follow well-defined structures and conventional rules of music theory that make them pleasing to listeners. These constraints can be imposed on the neural network using reinforcement learning, which is difficult to achieve with supervised learning techniques alone. By combining these two major training frameworks in deep learning, we can make the model mimic the style of the existing dataset while also controlling specific behaviors of the generated melody. We also investigate the design of the input representation and architecture so that the model can capture the structural features of music more easily. In the experiments, we focus on monophonic melody generation of Chinese Jiangnan-style music, and we validate the quality and certain characteristics of the generated results, as well as the effectiveness of the different modules in the architecture.
Appears in collections:	[Graduate Institute of Computer Science and Information Engineering] Master's and Doctoral Theses
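
The abstract describes imposing music-theory constraints through reinforcement learning while a supervised model supplies the style of the dataset. One common way to express that idea is a reward that adds a rule-based score to the supervised model's log-likelihood. The sketch below is only an illustration under stated assumptions, not the thesis's actual reward: the pentatonic rule, the leap limit, the weight, and every function name are hypothetical.

```python
# Illustrative sketch only. All names and constants are hypothetical
# and are not taken from the thesis.

PENTATONIC_PITCH_CLASSES = {0, 2, 4, 7, 9}  # C, D, E, G, A (assumed rule)
MAX_LEAP = 7  # semitones; assumed limit on the size of a melodic leap


def rule_reward(melody):
    """Score a list of MIDI pitches from -1 to 1 against two toy rules."""
    if not melody:
        return 0.0
    # Fraction of notes whose pitch class lies in the assumed scale.
    in_scale = sum(p % 12 in PENTATONIC_PITCH_CLASSES for p in melody) / len(melody)
    # Fraction of consecutive intervals that exceed the assumed leap limit.
    leaps = sum(abs(b - a) > MAX_LEAP for a, b in zip(melody, melody[1:]))
    leap_rate = leaps / max(len(melody) - 1, 1)
    return in_scale - leap_rate


def combined_reward(log_prob, melody, rule_weight=1.0):
    """Add the weighted rule score to a supervised model's log-likelihood."""
    return log_prob + rule_weight * rule_reward(melody)
```

Here `log_prob` stands for the log-probability that a pretrained sequence model assigns to the melody, and `rule_weight` trades off imitating the dataset against satisfying the rules. A policy-gradient or Q-learning update could then use `combined_reward` as its training signal.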

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	5

All items in NCUIR are protected by their original copyright.
