NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, research projects, and more): Item 987654321/98163


    Please use this identifier to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98163


    Title: 具結構控制與表演情緒之音樂生成研究: 基於預訓練 LLaMA 模型;Music Generation with Structure Control and Performance Expression via Pretrained LLaMA Model
    Authors: 屈俊丞;Chu, Chun-Cheng
    Contributors: Department of Computer Science and Information Engineering
    Keywords: Music Generation;Large Language Model;Symbolic Music Representation;Controllable Generation;Logits Masking;Parameter-Efficient Fine-Tuning
    Date: 2025-06-20
    Issue Date: 2025-10-17 12:27:00 (UTC+8)
    Publisher: National Central University
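    Abstract: With the rapid development of large language models and generative models, music generation has gradually moved from audio synthesis toward forms of music creation in which musical elements are controllable. Existing text-conditioned generation techniques achieve a reasonable level of audio synthesis quality, but there is still room for improvement in controlling musical structure and conveying realistic performance expression.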
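    This study proposes a language-model-based symbolic music generation system that treats music as a language with grammatical rules and structural logic, and models musical elements such as sections, rhythm, pitch, dynamics, and tempo in a structured way. We design a dedicated music token vocabulary and a staged training strategy that splits the music generation task into five sub-tasks (chord generation, main melody generation, secondary melody generation, dynamics control, and tempo change generation), so that the model progressively learns and strengthens its grasp of musical structure and performance detail.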
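    The system is based on the LLaMA 3.1 8B-Instruct model. It first undergoes full-parameter fine-tuning so that the model learns the grammar and combination patterns of the newly added music tokens and builds basic associations among them. Each sub-task is then fine-tuned separately with LoRA, a parameter-efficient fine-tuning technique, to further optimize task-specific performance, improve training efficiency, and retain the model's original knowledge.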
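    To improve structural consistency and grammatical correctness during generation, the system introduces a structure-aware logits masking mechanism that restricts the model to selecting only grammar-conforming tokens at each generation step, thereby strengthening the consistency of section order, bar logic, and performance markings.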
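    Experimental results show that, with the dedicated tokens and multi-stage training pipeline designed in this study, the model can generate pieces with complete section planning, rhythmic coherence, and control over performance expression. The results demonstrate that large language models, after vocabulary extension and structure-aware training, can be effectively applied to symbolic music generation.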
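    The logits masking idea described in the abstract can be pictured with a minimal sketch. This is not the thesis's implementation: the toy vocabulary `VOCAB`, the transition table `ALLOWED_NEXT`, and the example logits are all hypothetical, chosen only to show how setting the logits of grammar-violating tokens to negative infinity removes them from the next-token distribution.

```python
import math

# Hypothetical toy vocabulary; the thesis's actual token set is not given here.
VOCAB = ["<section>", "<bar>", "<pitch>", "<duration>", "<end>"]

# Hypothetical grammar: the set of tokens allowed to follow each token.
ALLOWED_NEXT = {
    "<section>": {"<bar>"},
    "<bar>": {"<pitch>"},
    "<pitch>": {"<duration>"},
    "<duration>": {"<pitch>", "<bar>", "<section>", "<end>"},
}

def mask_logits(logits, prev_token):
    """Set logits of tokens that violate the grammar to -inf."""
    allowed = ALLOWED_NEXT[prev_token]
    return [x if tok in allowed else -math.inf for tok, x in zip(VOCAB, logits)]

def softmax(logits):
    """Numerically stable softmax; masked entries get probability 0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Pretend model output after "<pitch>": unmasked, "<pitch>" would win again.
raw = [2.0, 0.5, 3.0, 1.0, 0.1]
masked = mask_logits(raw, "<pitch>")
probs = softmax(masked)
best = VOCAB[probs.index(max(probs))]
print(best)
```

    In this toy grammar only a duration may follow a pitch, so the mask forces that choice even though the raw logits favored another pitch token.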
    With the rapid advancement of large language models and generative models, music generation has gradually evolved from raw audio synthesis to a more controllable form of music creation involving structured musical elements. Although existing studies have demonstrated the ability to generate audio conditioned on textual input, significant challenges remain in controlling musical structure and expressive performance.
    This study proposes a symbolic music generation system based on a language model framework, treating music as a language with grammatical rules. We design custom tokens and training strategies to enable the model to learn the compositional logic and structural relationships among musical elements such as sections, rhythm, pitch, dynamics, and tempo.
    Our system builds upon the LLaMA 3.1-8B Instruct model and adopts a multi-stage training strategy. First, full fine-tuning is applied to help the model acquire the grammar of the custom music tokens. Then, parameter-efficient fine-tuning using LoRA is performed across five music generation sub-tasks: chord progression, melody, secondary melody, dynamics, and tempo.
    To enhance structural consistency and output quality, we introduce a structure-aware logits masking mechanism during training, which improves the model’s ability to predict segment transitions, bar continuity, and performance expression tokens.
    Experimental results on a structured symbolic music dataset demonstrate the potential of our model to generate compositions with coherent musical sections and expressive intent. Furthermore, our findings suggest that vocabulary extension and structure-aligned training of large language models can be effectively applied to symbolic music generation tasks.
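    The parameter-efficient fine-tuning step mentioned in the abstract relies on LoRA, which keeps a pretrained weight matrix frozen and learns a low-rank update. The sketch below illustrates the general LoRA formulation only, not the thesis's training setup: the dimensions `d`, `k`, the rank `r`, the scale `alpha`, and all matrix values are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, b, a, alpha, r):
    """Return W + (alpha / r) * B @ A, the weight used in the forward pass."""
    delta = matmul(b, a)          # low-rank update with the same shape as W
    scale = alpha / r
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Hypothetical sizes: a 4 x 3 frozen weight and a rank-1 adapter.
d, k, r = 4, 3, 1
W = [[1.0] * k for _ in range(d)]   # frozen pretrained weight (d x k)
B = [[0.0] for _ in range(d)]       # d x r, zero-initialized
A = [[0.5, -0.5, 0.25]]             # r x k, trainable

W_eff = lora_effective_weight(W, B, A, alpha=2.0, r=r)

full_params = d * k                 # trained under full fine-tuning
lora_params = r * (d + k)           # trained under LoRA
print(full_params, lora_params)
```

    Because `B` starts at zero, the adapted weight initially equals the frozen weight, so training begins from the behavior of the base model; only the `r * (d + k)` adapter values are updated.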
    Appears in Collections:[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation



    All items in NCUIR are protected by copyright, with all rights reserved.
