Abstract (English) |
This study proposes a melody skeleton generation method based on the simulated
annealing algorithm, aiming to randomly generate musically coherent and meaningful
melodic structures while maintaining computational efficiency. Melody skeletons are a key
foundation of the composition process: they provide the primary melodic framework and
rhythmic structure, laying the groundwork for developing more detailed melodic material.
The method exploits the strengths of simulated annealing in exploring a vast search space
and escaping local optima by identifying the musical elements that matter most in a melody
and encoding them as criteria for evaluating the quality of generated melodies.
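To make the search procedure concrete, the sketch below shows a minimal simulated annealing loop of this kind in Python. It is illustrative only: the pitch range, skeleton length, energy function, and neighbor move are hypothetical stand-ins for the criteria designed in this study, not the thesis's actual implementation.

```python
# A minimal, illustrative simulated annealing loop for melody skeletons.
# All names (skeleton_energy, random_neighbor, NOTE_RANGE, ...) are
# hypothetical placeholders, not the thesis's actual objective or move set.
import math
import random

NOTE_RANGE = range(60, 73)   # assumed candidate pitches: C4..C5 (MIDI numbers)
SKELETON_LEN = 8             # assumed number of skeleton notes per phrase

def skeleton_energy(skeleton):
    """Lower is better. A stand-in objective: penalize large melodic leaps
    and reward ending on the tonic (pitch class C)."""
    leap_penalty = sum(abs(b - a) for a, b in zip(skeleton, skeleton[1:]))
    cadence_penalty = 0 if skeleton[-1] % 12 == 0 else 5
    return leap_penalty + cadence_penalty

def random_neighbor(skeleton):
    """Perturb one randomly chosen note to a nearby pitch, clamped to range."""
    s = list(skeleton)
    i = random.randrange(len(s))
    s[i] = min(max(s[i] + random.choice([-2, -1, 1, 2]), NOTE_RANGE.start),
               NOTE_RANGE.stop - 1)
    return s

def anneal(t0=10.0, cooling=0.995, steps=5000):
    """Start from a random skeleton and anneal toward a low-energy one."""
    current = [random.choice(NOTE_RANGE) for _ in range(SKELETON_LEN)]
    energy, t = skeleton_energy(current), t0
    for _ in range(steps):
        cand = random_neighbor(current)
        delta = skeleton_energy(cand) - energy
        # Metropolis criterion: always accept improvements, sometimes accept
        # worse candidates so the search can escape local optima.
        if delta <= 0 or random.random() < math.exp(-delta / t):
            current, energy = cand, energy + delta
        t *= cooling  # geometric cooling schedule
    return current, energy

if __name__ == "__main__":
    skeleton, e = anneal()
    print(skeleton, e)
```

The Metropolis acceptance step is what allows uphill moves while the temperature is high and makes the search settle into a good skeleton as the temperature decays.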
We give a detailed account of how the simulated annealing algorithm is applied to this
problem, including the representation of the solution space, the design of the objective
function, and the annealing process. To validate the proposed method, we conducted
similarity comparison experiments against original compositions; the results are reported in
Appendix 1. They show that the simulated annealing-based method can randomly generate
diverse, high-quality melody skeletons that are consistent with musical prior knowledge.
We also explore potential applications of the method in automatic accompaniment
generation, style transformation, dataset creation, music theory education tools, and
interactive music generation systems.
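As one illustration of how such a similarity comparison could be carried out, the sketch below computes a dynamic time warping (DTW) distance between a generated skeleton and an original melody, in the spirit of Sakoe and Chiba [21]. The use of MIDI pitch sequences and absolute pitch difference as the local cost are assumptions for illustration; the actual experimental protocol and metric are those described in Appendix 1.

```python
# Illustrative DTW comparison of a generated skeleton against an original
# melody; sequences are assumed to be MIDI note numbers.

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) DTW with absolute pitch difference as the
    local cost; smaller values mean the two melodies are more similar."""
    n, m = len(a), len(b)
    INF = float("inf")
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # insertion
                                 d[i][j - 1],      # deletion
                                 d[i - 1][j - 1])  # match
    return d[n][m]

# Hypothetical example: a generated skeleton vs. an original phrase.
generated = [60, 64, 67, 72, 67, 64, 62, 60]
original = [60, 64, 67, 71, 72, 67, 64, 60]
print(dtw_distance(generated, original))
```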
In conclusion, this study introduces a robust and innovative method for generating
melody skeletons by combining the randomness of the simulated annealing algorithm with
prior knowledge of music theory. The method has broad application prospects in automatic
accompaniment generation, style transformation, dataset creation, music theory education
tools, and interactive music generation systems. |
References |
[1] D. Eck and J. Schmidhuber, "A first look at music composition using LSTM recurrent neural networks," Technical Report IDSIA-07-02, IDSIA/USI-SUPSI, 2002. [Online]. Available: https://people.idsia.ch/~juergen/blues/IDSIA-07-02.pdf
[2] A. Vaswani et al., "Attention is all you need," in Adv. Neural Inf. Process. Syst., vol. 30, 2017, pp. 5999–6009. [Online]. Available: http://arxiv.org/abs/1706.03762
[3] C.-Z. A. Huang et al., "Music Transformer: Generating music with long-term structure," in Proc. 7th Int. Conf. Learn. Represent. (ICLR), 2019, pp. 1–15.
[4] Google, "Magenta." [Online]. Available: https://magenta.tensorflow.org/
[5] Y.-S. Huang and Y.-H. Yang, "Pop Music Transformer: Beat-based modeling and generation of expressive pop piano compositions," in Proc. 28th ACM Int. Conf. Multimedia (MM), Oct. 2020, pp. 1180–1188, doi: 10.1145/3394171.3413671.
[6] Y. Ren, J. He, X. Tan, T. Qin, Z. Zhao, and T.-Y. Liu, "PopMAG: Pop music accompaniment generation," in Proc. 28th ACM Int. Conf. Multimedia (MM), Oct. 2020, pp. 1198–1206, doi: 10.1145/3394171.3413721.
[7] Z. Dai, Z. Yang, Y. Yang, J. Carbonell, Q. V. Le, and R. Salakhutdinov, "Transformer-XL: Attentive language models beyond a fixed-length context," in Proc. 57th Annu. Meet. Assoc. Comput. Linguist. (ACL), 2019, pp. 2978–2988, doi: 10.18653/v1/P19-1285.
[8] B. Yu et al., "Museformer: Transformer with fine- and coarse-grained attention for music generation," in Adv. Neural Inf. Process. Syst., vol. 35, 2022. [Online]. Available: http://arxiv.org/abs/2210.10349
[9] G. Brunner, Y. Wang, R. Wattenhofer, and S. Zhao, "Symbolic music genre transfer with CycleGAN," in Proc. IEEE Int. Conf. Tools with Artif. Intell. (ICTAI), Nov. 2018, pp. 786–793, doi: 10.1109/ICTAI.2018.00123.
[10] Z. Hu, Y. Liu, G. Chen, and Y. Liu, "Can machines generate personalized music? A hybrid favorite-aware method for user preference music transfer," IEEE Trans. Multimedia, vol. 25, pp. 2296–2308, 2023, doi: 10.1109/TMM.2022.3146002.
[11] C. Hernandez-Olivan and J. R. Beltrán, "Music composition with deep learning: A review," in Signals and Communication Technology, Springer, 2023, pp. 25–50, doi: 10.1007/978-3-031-18444-4_2.
[12] E. R. Hammer and M. S. Cole, Guided Listening: A Textbook for Music Appreciation. Wm. C. Brown, 1992.
[13] G. W. Cooper and L. B. Meyer, The Rhythmic Structure of Music. Chicago, IL: University of Chicago Press, 1963.
[14] J. Perricone, Melody in Songwriting: Tools and Techniques for Writing Hit Songs. Boston, MA: Berklee Press, 2000.
[15] F. Lerdahl and R. Jackendoff, A Generative Theory of Tonal Music. Cambridge, MA: MIT Press, 1983.
[16] H. C. Longuet-Higgins and C. S. Lee, "The rhythmic interpretation of monophonic music," Music Perception, vol. 1, no. 4, pp. 424–441, 1984, doi: 10.2307/40285271.
[17] H. Schenker, Neue musikalische Theorien und Phantasien, III: Der freie Satz. Vienna: Universal Edition, 1935; 2nd ed., 1956.
[18] E. Chew, "Mathematical and computational modelling of tonality: Theory and applications," in International Series in Operations Research and Management Science, vol. 204. New York: Springer, 2014, pp. 41–60, doi: 10.1007/978-1-4614-9475-1_3.
[19] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by simulated annealing," Science, vol. 220, no. 4598, pp. 671–680, 1983, doi: 10.1126/science.220.4598.671.
[20] R. A. Rutenbar, "Simulated annealing algorithms: An overview," IEEE Circuits Devices Mag., vol. 5, no. 1, pp. 19–26, 1989, doi: 10.1109/101.17235.
[21] H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for spoken word recognition," IEEE Trans. Acoust., Speech, Signal Process., vol. 26, no. 1, pp. 43–49, 1978, doi: 10.1109/TASSP.1978.1163055.
[22] K. Kilgour, M. Zuluaga, D. Roblek, and M. Sharifi, "Fréchet audio distance: A reference-free metric for evaluating music enhancement algorithms," in Proc. INTERSPEECH, 2019, pp. 2350–2354, doi: 10.21437/Interspeech.2019-2219. |