Thesis 109522022: Detailed Record




Author: Ji-Jie Lin (林季劼)    Department: Computer Science and Information Engineering
Thesis Title: A Deep Learning-based Approach for a Black-box Real-time Guitar Amplifier Emulation and a VR-based Air Guitar System
Related Theses
★ A Q-learning-based Swarm Intelligence Algorithm and Its Applications
★ Development of a Rehabilitation System for Children with Developmental Delays
★ Comparing Teacher Assessment and Peer Assessment from the Perspective of Cognitive Style: From English Writing to Game Development
★ A Diabetic Nephropathy Prediction Model Based on Laboratory Test Values
★ Design of a Fuzzy Neural Network-based Remote Sensing Image Classifier
★ A Hybrid Clustering Algorithm
★ Development of Assistive Devices for People with Disabilities
★ A Study on Fingerprint Classifiers
★ A Study on Backlit Image Compensation and Color Quantization
★ Application of Neural Networks to Business Income Tax Audit Case Selection
★ A New Online Learning System and Its Application to Tax Audit Case Selection
★ An Eye-Tracking System and Its Applications to Human-Computer Interfaces
★ Data Visualization Combining Swarm Intelligence and Self-Organizing Maps
★ Development of a Pupil-Tracking System for Human-Computer Interfaces for People with Disabilities
★ An Artificial Immune System-based Online Learning Neuro-Fuzzy System and Its Applications
★ Application of Genetic Algorithms to Speech Descrambling
  1. This electronic thesis has been approved for immediate open access.
  2. The open-access full text is licensed only for personal, non-profit retrieval, reading, and printing for academic research purposes.
  3. Please comply with the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast it without authorization.

Abstract In recent years, the development of virtual reality (VR) has become a focal point of public
attention. With the emergence of more VR products and applications, the performance of VR
devices has steadily improved while their cost has dropped significantly, making them
increasingly common consumer devices. VR provides an immersive experience that
not only offers visual enjoyment to users but also provides interactive modes different
from traditional methods. Through VR technology, users can engage in various activities
in virtual environments, such as gaming, meetings, education, and healthcare.
The significance of VR technology continues to increase, and its applications
in the virtual instrument field, such as virtual pianos and virtual drum kits,
are becoming increasingly widespread. The emergence of these virtual instruments
not only allows users to experience the pleasure of playing instruments
in virtual environments but also lowers the barrier to learning different instruments.
With VR devices, users can enjoy playing instruments anytime and anywhere,
free from limitations of location, time, space, equipment, and technical expertise.
Consequently, virtual concerts, including virtual spatial audio simulations and 3D
reconstructions of historical performances, have gained more attention.

However, previous research on virtual guitars was mostly conducted in non-VR environments,
focusing primarily on the recognition of air guitar chords.
There has been a lack of systematic research on virtual air guitar systems within VR.
Moreover, current commercial virtual guitar games recognize hand gestures with limited
accuracy: they can detect only finger bending and simple strumming actions,
and cannot accurately identify chords or varied strumming techniques.
Therefore, in this study, we propose a virtual air guitar system that allows users
to play the guitar simply through VR devices. By leveraging the recognition capabilities
of deep learning models and the visual feedback advantages of VR, our system can recognize
up to 30 different chords and implement various strumming techniques using a joystick device.
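The table of contents indicates that the chord classifier is an MLP over preprocessed hand joint angles (Sections 3.2.4 and 3.2.5). Below is a minimal PyTorch sketch of such a classifier; the feature dimension and hidden-layer sizes are illustrative assumptions, not the thesis's exact configuration:

```python
import torch
import torch.nn as nn

class ChordMLP(nn.Module):
    """Minimal sketch of an MLP chord classifier.

    Assumes the fretting hand is encoded as a flat vector of joint
    angles (the 30-dim layout here is hypothetical, not the thesis's
    exact feature set) and predicts one of 30 chord classes.
    """

    def __init__(self, n_features: int = 30, n_chords: int = 30):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, n_chords),  # logits over 30 chord classes
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Usage: a batch of 8 hand-pose feature vectors -> chord logits.
model = ChordMLP()
logits = model(torch.randn(8, 30))
pred = logits.argmax(dim=-1)  # predicted chord index per sample
```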

Furthermore, we apply a black-box approach by combining WaveNet and FiLM to simulate
the effects of an electric guitar pedal at different knob settings. Additionally,
we introduce a Knob Difference Loss to improve the accuracy of the simulated effects.
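FiLM (feature-wise linear modulation) is a published conditioning mechanism in which a small network maps the conditioning input, here the knob settings, to per-channel scale and shift parameters. The sketch below assumes normalized knob values modulating intermediate (batch, channels, time) feature maps; the layer sizes are illustrative, not the thesis's configuration:

```python
import torch
import torch.nn as nn

class FiLMKnobConditioning(nn.Module):
    """Feature-wise Linear Modulation (FiLM) driven by knob values.

    Maps a vector of normalized knob settings to per-channel scale
    (gamma) and shift (beta) parameters, then applies them to an
    intermediate feature map of shape (batch, channels, time).
    Layer sizes here are illustrative only.
    """

    def __init__(self, n_knobs: int = 1, n_channels: int = 16):
        super().__init__()
        self.proj = nn.Linear(n_knobs, 2 * n_channels)

    def forward(self, features: torch.Tensor, knobs: torch.Tensor) -> torch.Tensor:
        gamma, beta = self.proj(knobs).chunk(2, dim=-1)
        # Broadcast (batch, channels) parameters over the time axis.
        return gamma.unsqueeze(-1) * features + beta.unsqueeze(-1)

# Usage: condition a (batch=2, channels=16, time=1024) feature map
# on one gain-knob value per example.
film = FiLMKnobConditioning(n_knobs=1, n_channels=16)
feats = torch.randn(2, 16, 1024)
knobs = torch.tensor([[0.3], [0.9]])  # normalized knob positions
out = film(feats, knobs)
```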
In terms of network architecture, we propose a Kernel Dilation technique,
which doubles the forward-pass speed of the WaveNet used in previous studies without
sacrificing accuracy. This enables real-time electric guitar effect simulation even
under the heavy computational load of a running VR environment,
using an Intel Core i7-11700K processor (released in 2021)
and an NVIDIA GeForce GTX 1060 graphics card (released in 2016).
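The Kernel Dilation speed-up is the thesis's own contribution and its details are not reproduced here; for context, the sketch below shows the standard dilated causal convolution that WaveNet-style emulators are built from. Left-padding by (kernel_size - 1) * dilation keeps the filter causal, which is what makes streaming real-time inference possible:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedCausalConv1d(nn.Module):
    """Standard WaveNet-style dilated causal 1-D convolution.

    Left-pads the input by (kernel_size - 1) * dilation so that each
    output sample depends only on current and past input samples.
    (Shown for context: the thesis's Kernel Dilation technique is a
    modification of this building block and is not reproduced here.)
    """

    def __init__(self, in_ch: int, out_ch: int,
                 kernel_size: int = 3, dilation: int = 1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pad only on the left (past) side of the time axis.
        return self.conv(F.pad(x, (self.pad, 0)))

# A stack with dilations 1, 2, 4, 8 grows the receptive field
# exponentially while keeping the layer count small.
layers = nn.Sequential(*[DilatedCausalConv1d(1 if d == 1 else 8, 8, dilation=d)
                         for d in (1, 2, 4, 8)])
audio = torch.randn(1, 1, 2048)   # (batch, channels, samples)
out = layers(audio)               # output has the same length as the input
```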
Keywords ★ Deep Learning
★ Virtual Reality
★ Virtual Instrument
★ Guitar Amplifier Emulation
★ Computer Vision
★ Audio Processing
Table of Contents
Abstract (Chinese) iv
Abstract (English) vi
Acknowledgements viii
Table of Contents ix
1. Introduction 1
1.1 Research Motivation 1
1.2 Research Objectives 3
1.3 Thesis Organization 4
2. Background and Literature Review 5
2.1 Background 5
2.1.1 Guitar Playing 5
2.1.2 Vacuum Tube Amplifiers and Electric Guitar Effect Pedals 6
2.1.3 MIDI Events and MIDI Players 8
2.1.4 3D Hand Tracking in Virtual Reality 9
2.2 Literature Review 11
2.2.1 Research on Virtual Instruments and Air Guitar 11
2.2.2 Research on Electric Guitar Effect Emulation 13
3. Methodology 15
3.1 Virtual Electric Guitar System 15
3.1.1 System Architecture 15
3.1.2 3D Hand-Model Colliders and Guitar Fretboard/Strumming Colliders 17
3.1.3 3D Chord Hand-Pose Models 18
3.1.4 Translation and Projection of the 3D Hand Model 19
3.1.5 Guitar MIDI Sound Synthesis 23
3.2 Hand-Pose Data Preprocessing and Chord Recognition 24
3.2.1 3D Head-Up Rotation 24
3.2.2 Origin Translation and Rotation Fixing 26
3.2.3 Joint-Length Normalization and Noise 27
3.2.4 Hand Joint-Angle Conversion 28
3.2.5 Chord Recognition with an MLP Model 29
3.3 Audio Data Preprocessing and Augmentation 30
3.3.1 Gaussian Noise 30
3.3.2 BatchNorm1d 30
3.3.3 Extra Sample Window 31
3.4 Loss Functions for Electric Guitar Effect Emulation 32
3.4.1 Pre-emphasis Error-to-Signal Ratio 32
3.4.2 Multi-resolution Short-Time Fourier Transform 33
3.4.3 Knob Difference Loss 34
3.5 Electric Guitar Effect Emulation Model 36
3.5.1 WaveNet and Feed-Forward WaveNet 36
3.5.2 Model Architecture 38
3.5.3 FiLM 40
3.5.4 Dilated Causal Convolution and Kernel Dilation 41
3.5.5 History Buffer 43
4. Experimental Design and Results 44
4.1 Hand-Pose and Effect Pedal Datasets 44
4.1.1 Hand-Pose Dataset and Test-Set Variants 44
4.1.2 Effect Pedal Dataset 47
4.2 Chord Recognition in 3D Space 49
4.2.1 Chord Recognition 49
4.2.2 Chord Recognition across Different Subjects 50
4.2.3 Effect of 3D Head-Up Rotation on Chord-Recognition Stability 51
4.3 Electric Guitar Effect Emulation 52
4.3.1 Effect Emulation 52
4.3.2 Effect of the Knob Difference Loss on Prediction 54
4.3.3 Effect of the History Buffer on Forward-Pass Performance 55
5. Conclusion 56
5.1 Conclusions 56
5.2 Future Work 57
References 58
Advisor: Mu-Chun Su (蘇木春)    Date of Approval: 2023-07-26