Abstract:

Automatic music generation refers to the process of creating music with computer algorithms and artificial-intelligence techniques. It has a long history, traceable to the mid-20th century. Over the years researchers have employed a range of methods, from rule-based systems and evolutionary algorithms to the more recent machine learning and neural networks; with these advances, automatic music generation can now produce more creative and diverse compositions. In addition, digital audio processing technology has made music analysis and transformation easier and more accurate. Automatic music generation has a wide range of applications: it offers new possibilities and sources of inspiration for music composition, and it saves time and resources by quickly generating music that meets specific requirements.

In the context of neural networks, Gatys et al. [1] introduced the term "style transfer," which typically refers to preserving the explicit content features of one image while applying the salient style features of another. Music style transfer separates and recombines the musical content and musical style of different compositions to generate novel music with creative, synthesized characteristics. Music style can refer to musical features at any level, and the boundary between content and style is highly dynamic, depending on factors such as timbre, performance style, or compositional objectives, each associated with a different style-transfer problem [2].

This study proposes a method for music style transfer based on the Generative Adversarial Network (GAN) framework [3]. In preprocessing we convert music into pianoroll images, treating music as pictures, and use the CycleGAN model [4] to transfer style between musical phrases and complete compositions. This allows a user to provide only a musical phrase and obtain the corresponding complete composition. In implementing the method we employ not only deep-learning frameworks but also domain-specific musical knowledge for data processing, to further analyze and optimize the transformation results and improve the quality and practicality of the conversion. We also compare the performance of different generator and discriminator architectures on our dataset.

The advantage of this method lies in its ability to automatically generate the corresponding complete composition, which improves the practicality of music style transfer. This research offers new ideas and methods for the fields of music style transfer and automatic music generation. Future work can further explore transformations between different music styles and apply them to music composition, music education, and other domains, enriching and expanding the possibilities of music creation.
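As a minimal illustration of the pianoroll representation mentioned above, the sketch below rasterizes symbolic note events into a binary 128×T matrix (pitches × time steps), which can then be treated as an image. This is not the thesis's actual preprocessing pipeline; the function `notes_to_pianoroll`, its `(pitch, start, end)` note-tuple format, and the sampling rate `fs` are illustrative assumptions.

```python
import numpy as np

def notes_to_pianoroll(notes, fs=4, n_pitches=128):
    """Render a list of (pitch, start, end) note events, with times in
    seconds, into a binary pianoroll of shape (n_pitches, T).
    Each column covers 1/fs seconds."""
    if not notes:
        return np.zeros((n_pitches, 0), dtype=np.uint8)
    total = max(end for _, _, end in notes)
    n_steps = int(np.ceil(total * fs))
    roll = np.zeros((n_pitches, n_steps), dtype=np.uint8)
    for pitch, start, end in notes:
        on = int(round(start * fs))
        off = max(on + 1, int(round(end * fs)))  # every note fills >= 1 step
        roll[pitch, on:off] = 1
    return roll

# A C-major triad (C4-E4-G4) held for one second, then C4 alone for half a second.
notes = [(60, 0.0, 1.0), (64, 0.0, 1.0), (67, 0.0, 1.0), (60, 1.0, 1.5)]
roll = notes_to_pianoroll(notes, fs=4)
print(roll.shape)  # (128, 6)
```

In practice, libraries such as pretty_midi or pypianoroll derive this matrix directly from MIDI files; once music is in this image-like form, image-to-image models such as CycleGAN can be applied to it unchanged.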