Name: You-Sheng Guo (郭祐昇)
Department: Department of Communication Engineering
Thesis title: Implicit Representation with Attention Mechanism for Image Reconstruction of 3D Human Model (具有注意力機制之隱式表示於影像重建三維人體模型)
Related theses
★ Few-Shot Image Segmentation with Masks Generated by a Meta-Learning Classifier Weight Transfer Network (具有元學習分類權重轉移網路生成遮罩於少樣本圖像分割技術)
★ 3D Face Reconstruction Based on a Weakly Supervised Deformable Model (基於弱監督式學習可變形模型之三維人臉重建)
Full text: never open access

Abstract
Artificial intelligence has developed rapidly in recent years, and many industries now use machines to replace or assist human labor and reduce production costs. In the game industry, making characters look closer to real life requires game developers to build 3D human models together with animation designers, which takes considerable time and money and raises development costs. Using deep learning to build 3D human models without the help of scanning instruments can therefore reduce game development costs substantially. This study reconstructs a 3D human model from a single image, trains the model with deep learning, and achieves high-quality reconstruction with a small dataset.

Recent work trains on large amounts of data, which takes a long time, raises the cost of purchasing training data, and puts the approach out of reach for individual users. To train with a small dataset, this study adapts the network architecture to low-data training, so that individuals outside companies can also use the 3D human model. An attention mechanism is added so that the model extracts important features during training, which improves the quality of the reconstructed 3D human model and reduces the time spent on parameter updates. In addition, the reconstructed model contains not only geometry but also color, which broadens its range of applications. The method performs well in both objective evaluation (Point to Surface, Chamfer Distance) and the evaluation of the reconstructed 3D human models.

Keywords
★ Reconstruction of 3D Human Body Model
★ Attention Mechanism
★ Deep Learning

Table of Contents
1. Introduction
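1-1 Research Background
1-2 Research Motivation and Objectives
1-3 Thesis Organization
2. Technical Background of 3D Human Model Reconstruction
2-1 Representations of 3D Human Models
2-1-1 Voxels
2-1-2 Point Clouds
2-1-3 Polygon Meshes
2-1-4 Occupancy Function
2-1-5 Signed Distance Function
2-2 Hardware-Based 3D Human Model Reconstruction
2-2-1 Scanning Booths
2-2-2 Handheld Body Scanners
2-2-3 Mobile Scanning Applications
2-3 Software-Based 3D Human Model Reconstruction
2-3-1 SMPL + Deformation + Texture
2-3-2 Body Estimation + canonical + Occupancy
2-3-3 Occupancy + RGB
3. Artificial Intelligence
3-1 Categories of Machine Learning
3-1-1 Supervised vs. Unsupervised Learning
3-1-2 Semi-Supervised Learning
3-1-3 Reinforcement Learning
3-1-4 Transfer Learning
3-2 Deep Learning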
3-2-1 Neurons
3-2-2 Activation Functions
3-2-3 Convolution
3-2-4 Residual Networks
3-2-5 End-to-End
3-2-6 Model Sampling
3-2-7 Attention Mechanism
4. Literature Review
5. Experimental Architecture and Design
5-1 End-to-End Architecture
5-2 Network Architecture for Predicting Geometry Reconstruction
5-3 Network Architecture for Predicting RGB Reconstruction
5-4 Loss Function
6. Experimental Results and Analysis
6-1 Environment Setup and Parameter Configuration
6-2 Dataset
6-3 Evaluation
6-3-1 Point to Surface
6-3-2 Chamfer Distance
6-3-3 Mean and Standard Deviation
6-4 Comparison and Analysis of Experimental Results
6-4-1 Network Architecture Adjustment
6-4-2 Attention Mechanism
7. Conclusion and Future Prospects
References
Advisors: Pao-Chi Chang (張寶基), Yung-Fang Chen (陳永芳)
Approval date: 2022-8-4