In this work, we present a symbolic melody generation method that combines supervised deep learning with reinforcement learning. When deep learning is applied to symbolic music modeling, a music clip can be treated as a sequence of symbols over time, so models capable of capturing temporal information are typically used, as in other sequence modeling tasks such as text modeling and natural language processing. In this kind of supervised approach, a deep neural network can automatically capture musical features from an existing dataset. However, compositions by human composers usually follow well-defined structures and conventional rules of music theory that make them pleasing to listeners. Such constraints can be imposed on the neural network through reinforcement learning, which is difficult to achieve with supervised learning techniques alone. By combining these two major training paradigms of deep learning, we can make the model mimic the style of the existing dataset while also controlling specific behaviors of the generated melody. We also investigate the design of the input representation and the architecture so that the model can capture structural features of music more easily. In the experiments, we focus on monophonic melody generation for Chinese Jiangnan-style music, and we evaluate the quality and characteristics of the generated results as well as the effectiveness of the different modules in the architecture.