Virtual Try-On System for Instrument Performance: Using Guitar as an Example

Name

劉起華 (Liu-Chi Hua)

Department

Department of Computer Science and Information Engineering

Thesis Title

Virtual Try-On System for Instrument Performance: Using Guitar as an Example
(Instruments perform Virtual try-on system: using Guitar as an Example)

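Related Theses

★ A Grouping Mechanism Based on Social Relationships in edX Online Discussion Boards
★ A 3D-Visualized Facebook Interaction System Built with Kinect
★ An Assessment System for Smart Classrooms Built with Kinect
★ An Intelligent Metropolitan Route Planning Mechanism for Mobile Device Applications
★ Dynamic Texture Transfer Based on Analysis of Key Momentum Correlation
★ A Seam Carving System That Preserves Straight-Line Structures in Images
★ A Community Recommendation Mechanism Built on an Open Online Social Learning Environment
★ System Design of an Interactive Situated Learning Environment for English as a Foreign Language
★ An Emotional Color Transfer Mechanism Based on Skin Color Preservation
★ A Gesture Recognition Framework for Virtual Keyboards
★ Error Analysis of Fractional-Power Grey Generating Prediction Models and Development of a Computer Toolbox
★ Real-Time Human Skeleton Motion Construction Using Inertial Sensors
★ Real-Time 3D Modeling Based on Multiple Cameras
★ A Genetic-Algorithm Grouping Mechanism Based on Complementarity and Social Network Analysis
★ A Virtual Instrument Performance System with Real-Time Hand Tracking
★ A Real-Time Virtual Instrument Performance System Based on Neural Networks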

Files

Browse the thesis in the system (open after 2025-6-30)

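Abstract (Chinese)

Deep learning has been applied across many fields, and in image processing it has replaced a number of traditional techniques. Virtual try-on is an important branch of that field. In 2D applications, it is commonly used for online fitting in the commercial apparel industry, sparing consumers the cost of visiting a store. In 3D applications, a 3D human body model is generated and garments are tried on that model; the output is more stable than with 2D methods but requires cumbersome preprocessing such as 3D scanning. In this thesis, we propose a virtual try-on system for instrument performance that lets users change the outer garment worn in an instrument performance video, so the result can be shared with other users as short-form video entertainment without physically changing clothes. The system uses deep-learning human segmentation, with SCHP and DensePose as the main segmentation models, to pass the body and clothing parts to the try-on model. Because human segmentation cannot represent occluded body parts, OpenPose is used as the body and hand skeleton system to fill in the missing body information. HR-VITON serves as the main try-on deep model and uses the segmentation and skeleton information to produce plausible try-on results. To improve generalization, HR-VITON hollows out part of the input image so that boundaries in the result are handled more smoothly, but this degrades its ability to restore the instrument. This thesis adjusts the image hollowing algorithm so that the instrument is still restored well, letting users create more variations from performance footage they already have.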

Abstract (English)

Deep learning has been applied in many domains and has superseded several traditional techniques in image processing. Virtual try-on is an important sub-domain of this field. In 2D applications, virtual try-on is often used for online fitting of commercial garments, sparing consumers the cost of visiting a physical store. In 3D applications, a 3D human body model is generated and used for fitting, which gives more stable outputs than 2D methods but requires complex preprocessing such as 3D scanning. In this paper, we propose a virtual try-on system for instrument performance that allows users to change the clothing they wear in instrument performance videos, so that the result can be shared with other users as short-video entertainment without physically changing clothes. The system uses deep learning-based human body segmentation, with SCHP and DensePose as the main segmentation models, and passes the body and clothing parts to the virtual try-on model. Since human body segmentation cannot represent occluded body parts, OpenPose is used as a body and hand skeleton system to supply the missing body information. HR-VITON serves as the main virtual try-on model, using the segmentation and skeleton information to generate plausible fitting results. To improve the model's generalization ability, HR-VITON hollows out part of the input image (a "hole digging" step) so that boundaries are handled more smoothly. However, this degrades the restoration of the instrument. In this study, we adjust the hole digging algorithm so that the instrument is still restored well, enabling users to create more variations from existing performance videos.
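The abstract does not spell out how the hole digging step was adjusted, so the following is only a minimal sketch of one plausible reading: exclude instrument pixels from the region that gets hollowed out, so the try-on model still sees the guitar. The function name `build_agnostic`, the mask inputs, and the gray fill value are hypothetical and are not taken from the thesis or from the HR-VITON code.

```python
import numpy as np


def build_agnostic(image, clothing_mask, guitar_mask, fill_value=128):
    """Hollow out the clothing region of `image` while keeping the guitar.

    Assumption (not from the thesis): the adjustment amounts to removing
    guitar pixels from the hollowed-out region.

    image:         H x W x 3 uint8 array (one video frame)
    clothing_mask: H x W bool array, True where the frame would be hollowed out
    guitar_mask:   H x W bool array, True on instrument pixels
    Returns the hollowed image and the mask of pixels that were erased.
    """
    # Erase only pixels that are in the clothing region and not on the guitar.
    erase = clothing_mask & ~guitar_mask
    agnostic = image.copy()
    # Boolean indexing with an H x W mask sets all three channels at once.
    agnostic[erase] = fill_value
    return agnostic, erase


# Toy 4 x 4 frame: uniform color, clothing on rows 1-3, guitar on two pixels.
frame = np.full((4, 4, 3), 200, dtype=np.uint8)
clothing = np.zeros((4, 4), dtype=bool)
clothing[1:4, :] = True
guitar = np.zeros((4, 4), dtype=bool)
guitar[2, 1:3] = True

agnostic, erased = build_agnostic(frame, clothing, guitar)
print("erased pixels:", int(erased.sum()))
```

In a real pipeline the two masks would presumably come from the segmentation stage (for example, a clothing label from the human parser and an instrument mask from a separate detector); here they are hand-made arrays so the sketch stays self-contained.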

Keywords (Chinese)

★ Virtual try-on (虛擬換衣)
★ Human segmentation (人體分割)
★ Deep learning (深度學習)

Keywords (English)

★ Virtual try-on
★ Human Parsing
★ Deep learning

Table of Contents

Abstract (Chinese)
Abstract
Content
1. Introduction
1.1. Background
1.2. Motivation
2. Related Works
2.1. Virtual try-on
2.2. Human parser
2.3. Human parser based Try-on model
2.3.1. VITON
2.3.2. ACGPN
3. Primary Research
3.1. SCHP
3.2. OpenPose
3.3. DensePose
3.4. HR-VITON
4. Methodology
4.1. Model structure
4.1.1. Background recover
4.1.2. Agnostic without guitar
5. Experiment
5.1. Experiment Setup
5.1.1. Hardware and software environment
5.1.2. Training detail
5.2. Experiment results
5.3. Ablation study
6. Conclusion
7. Reference

References

[1] Xintong Han, Zuxuan Wu, Zhe Wu, Ruichi Yu, Larry S. Davis. “VITON: An Image-based Virtual Try-on Network” Conference on Computer Vision and Pattern Recognition (CVPR), 2018
[2] S. Belongie, J. Malik, J. Puzicha. “Shape matching and object recognition using shape contexts” IEEE TPAMI, 2002
[3] O. Ronneberger, P. Fischer, T. Brox. “U-net: Convolutional networks for biomedical image segmentation” MICCAI, 2015
[4] Han Yang, Ruimao Zhang, Xiaobao Guo, Wei Liu, Wangmeng Zuo, Ping Luo. “Towards Photo-Realistic Virtual Try-On by Adaptively Generating ↔ Preserving Image Content” Conference on Computer Vision and Pattern Recognition (CVPR), 2020
[5] Bochao Wang, Huabin Zheng, Xiaodan Liang, Yimin Chen, Liang Lin, Meng Yang. “Toward Characteristic-Preserving Image-based Virtual Try-On Network” European Conference on Computer Vision (ECCV), 2018
[6] Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu. “Spatial Transformer Networks” Conference on Neural Information Processing Systems (NIPS), 2015
[7] Peike Li, Yunqiu Xu, Yunchao Wei, Yi Yang. “Self-Correction for Human Parsing” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022
[8] Tao Ruan, Ting Liu, Zilong Huang, Yunchao Wei, Shikui Wei, Yao Zhao. “Devil in the Details: Towards Accurate Single and Multiple Human Parsing” AAAI Conference on Artificial Intelligence, 2019
[9] Ke Gong, Xiaodan Liang, Yicheng Li, Yimin Chen, Ming Yang, Liang Lin. “Instance-level Human Parsing via Part Grouping Network” European Conference on Computer Vision (ECCV), 2018
[10] Zhe Cao, Gines Hidalgo, Tomas Simon, Shih-En Wei, Yaser Sheikh. “OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021
[11] Rıza Alp Güler, Natalia Neverova, Iasonas Kokkinos. “DensePose: Dense Human Pose Estimation In The Wild” Conference on Computer Vision and Pattern Recognition (CVPR), 2018
[12] Sangyun Lee, Gyojung Gu, Sunghyun Park, Seunghwan Choi, Jaegul Choo. “High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions” European Conference on Computer Vision (ECCV), 2022
[13] Dong-Hyun Lee. “Pseudo-Label: The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks” International Conference on Machine Learning (ICML), 2013
[14] Samuli Laine, Timo Aila. “Temporal Ensembling for Semi-Supervised Learning” International Conference on Learning Representations (ICLR), 2017
[15] Tsung-Yi Lin, Michael Maire, Serge Belongie, Lubomir Bourdev, Ross Girshick, James Hays, Pietro Perona, Deva Ramanan, C. Lawrence Zitnick, Piotr Dollár. “Microsoft COCO: Common Objects in Context” www.arxiv.org, 2014
[17] Matthew Loper, Naureen Mahmood, Javier Romero, Gerard Pons-Moll, Michael J. Black. “SMPL, A Skinned Multi-Person Linear Model” https://smpl.is.tue.mpg.de/index.html, 2020
[18] Rıza Alp Güler, George Trigeorgis, Epameinondas Antonakos, Patrick Snape, Stefanos Zafeiriou, Iasonas Kokkinos. “DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild” www.arxiv.org, 2016
[19] Taesung Park, Ming-Yu Liu, Ting-Chun Wang, Jun-Yan Zhu. “Semantic Image Synthesis with Spatially-Adaptive Normalization” www.arxiv.org, 2019
[20] Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan Catanzaro. “High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs” www.arxiv.org, 2017
[21] GoGoDuck912. “Self-Correction-Human-Parsing” https://github.com/GoGoDuck912/Self-Correction-Human-Parsing, 2021
[22] Google Research. “Open Images Dataset V7 and Extensions” https://storage.googleapis.com/openimages/web/index.html, 2022
[23] OpenMMLab. “mmdetection” https://github.com/open-mmlab/mmdetection, 2023
[24] Bharat Lal Bhatnagar, Garvita Tiwari, Christian Theobalt, Gerard Pons-Moll. “Multi-Garment Net: Learning to Dress 3D People from Images” International Conference on Computer Vision (ICCV), 2019
[25] Fuwei Zhao, Zhenyu Xie, Michael Kampffmeyer, Haoye Dong, Songfang Han, Tianxiang Zheng, Tao Zhang, Xiaodan Liang. “M3D-VTON: A Monocular-to-3D Virtual Try-On Network” International Conference on Computer Vision (ICCV), 2021

Advisor

施國琛 (Timothy K. Shih)

Approval Date

2023-7-11