    Please use this persistent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/93063


    Title: Using Generative Adversarial Network for Music Transformation (使用生成對抗網路進行音樂風格轉換)
    Authors: Ting, Wan-Chin (丁婉芩)
    Contributors: Department of Computer Science and Information Engineering
    Keywords: Music style transfer; Generative Adversarial Network (GAN); Automatic music generation; Pianoroll images; Anacrusis and Coda
    Date: 2023-07-13
    Upload Time: 2024-09-19 16:40:23 (UTC+8)
    Publisher: National Central University
    Abstract: Automatic music generation refers to the process of creating music with computer algorithms and artificial intelligence techniques. It has a long history that can be traced back to the mid-20th century. Over the years researchers have employed a range of methods, from rule-based systems and evolutionary algorithms to machine learning and neural networks, and the field has made significant progress, enabling the generation of more creative and diverse music compositions. Furthermore, advances in digital audio processing have made music analysis and transformation easier and more accurate.
    Automatic music generation has a wide range of applications. It not only provides new possibilities and sources of inspiration for music composition but also helps save time and resources by quickly generating music compositions that meet specific requirements.
    In the context of neural networks, Gatys et al. [1] introduced the term "style transfer," which typically refers to preserving the explicit content features of an image and applying the salient style features of another image to it. In the case of music, style transfer involves separating and recombining the musical content and musical style of different music compositions to generate novel music with creative and synthesized characteristics. Music style can refer to different levels of musical features, and the boundary between content and style is highly dynamic, depending on factors such as timbre, performance style, or compositional objectives, which are associated with different style transfer problems [2].
    This study proposes a method for music style transfer based on the Generative Adversarial Network (GAN) framework [3]. During preprocessing we convert the music into pianoroll images, treating the music as pictures, and use the CycleGAN model [4] to transfer between anacrusis-and-coda segments (the opening and closing notes of a piece) and complete compositions. Users therefore only need to provide the opening and closing notes to obtain a corresponding complete composition. In implementing the method we employ not only deep-learning frameworks but also domain-specific musical knowledge in the data processing, so as to further analyze and refine the transformation results and improve the quality and practicality of the conversion. We also compare how different generator and discriminator architectures perform on our dataset.
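    The paragraph above describes converting music into pianoroll images before feeding it to the network. A minimal sketch of that kind of preprocessing step follows; the choice of library (pretty_midi, Pillow), the sampling rate fs, the binarization threshold, and the file names are illustrative assumptions rather than the thesis's exact pipeline.

        import numpy as np
        import pretty_midi
        from PIL import Image

        def midi_to_pianoroll_image(midi_path: str, out_path: str, fs: int = 24) -> None:
            """Render a MIDI file as a binary pianoroll image (128 pitches x time)."""
            midi = pretty_midi.PrettyMIDI(midi_path)
            # get_piano_roll returns a (128, T) velocity matrix sampled at fs frames per second.
            roll = midi.get_piano_roll(fs=fs)
            # Binarize: treat any non-zero velocity as a "note on" pixel.
            binary = (roll > 0).astype(np.uint8) * 255
            # Flip vertically so low pitches sit at the bottom of the image.
            Image.fromarray(np.flipud(binary).copy()).save(out_path)

        if __name__ == "__main__":
            midi_to_pianoroll_image("example.mid", "example_pianoroll.png")

    Each resulting image can then be cropped into fixed-size segments (for example, the opening and closing bars versus the full piece) before training.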
    The advantage of this method lies in its ability to automatically generate corresponding complete compositions, thus enhancing the practicality of music style transfer. This research provides new ideas and methods for the field of music style transfer and automatic music generation. Future research directions can further explore transformations between different music styles and apply them to music composition, music education, and other domains, enriching and expanding the possibilities of music creation.
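    The abstract also mentions the CycleGAN model [4] and a comparison of generator and discriminator architectures. The following PyTorch sketch shows one plausible generator/discriminator pair for single-channel pianoroll images; the layer counts, channel widths, and the PatchGAN-style discriminator are assumptions for illustration and do not reproduce the architectures evaluated in the thesis.

        import torch
        import torch.nn as nn

        class ResnetBlock(nn.Module):
            """Residual block used inside the generator."""
            def __init__(self, dim: int):
                super().__init__()
                self.block = nn.Sequential(
                    nn.Conv2d(dim, dim, kernel_size=3, padding=1),
                    nn.InstanceNorm2d(dim),
                    nn.ReLU(inplace=True),
                    nn.Conv2d(dim, dim, kernel_size=3, padding=1),
                    nn.InstanceNorm2d(dim),
                )

            def forward(self, x):
                return x + self.block(x)

        class Generator(nn.Module):
            """Downsample, apply residual blocks, upsample: pianoroll in, pianoroll out."""
            def __init__(self, channels: int = 1, base: int = 64, n_blocks: int = 3):
                super().__init__()
                layers = [
                    nn.Conv2d(channels, base, kernel_size=7, padding=3),
                    nn.InstanceNorm2d(base),
                    nn.ReLU(inplace=True),
                    nn.Conv2d(base, base * 2, kernel_size=3, stride=2, padding=1),
                    nn.InstanceNorm2d(base * 2),
                    nn.ReLU(inplace=True),
                ]
                layers += [ResnetBlock(base * 2) for _ in range(n_blocks)]
                layers += [
                    nn.ConvTranspose2d(base * 2, base, kernel_size=3, stride=2,
                                       padding=1, output_padding=1),
                    nn.InstanceNorm2d(base),
                    nn.ReLU(inplace=True),
                    nn.Conv2d(base, channels, kernel_size=7, padding=3),
                    nn.Tanh(),  # outputs in [-1, 1], matching normalized pianoroll pixels
                ]
                self.model = nn.Sequential(*layers)

            def forward(self, x):
                return self.model(x)

        class Discriminator(nn.Module):
            """PatchGAN-style discriminator: scores overlapping patches as real or fake."""
            def __init__(self, channels: int = 1, base: int = 64):
                super().__init__()
                self.model = nn.Sequential(
                    nn.Conv2d(channels, base, kernel_size=4, stride=2, padding=1),
                    nn.LeakyReLU(0.2, inplace=True),
                    nn.Conv2d(base, base * 2, kernel_size=4, stride=2, padding=1),
                    nn.InstanceNorm2d(base * 2),
                    nn.LeakyReLU(0.2, inplace=True),
                    nn.Conv2d(base * 2, 1, kernel_size=4, padding=1),  # per-patch scores
                )

            def forward(self, x):
                return self.model(x)

        if __name__ == "__main__":
            x = torch.randn(2, 1, 128, 128)  # two single-channel 128x128 pianoroll crops
            print(Generator()(x).shape, Discriminator()(x).shape)

    A full CycleGAN setup would train two such generators (phrase-to-piece and piece-to-phrase) together with their discriminators, using adversarial and cycle-consistency losses.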
    Appears in Collections: [Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations

    Files in This Item:

    File: index.html | Size: 0Kb | Format: HTML | Views: 9


    All items in NCUIR are protected by copyright, with all rights reserved.
