多尺度特徵融合之姿勢遷移用於自動人像生成;Multi-Scale Feature Fusion on Pose Transfer for Automatic Person Image Generation

Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/86593

Title:	多尺度特徵融合之姿勢遷移用於自動人像生成;Multi-Scale Feature Fusion on Pose Transfer for Automatic Person Image Generation
Authors:	陳暐恩;Chen, Wei-En
Contributors:	Department of Computer Science and Information Engineering
Keywords:	姿勢遷移;生成對抗網路;OpenPose;多尺度模型;pose transfer;generative adversarial network;OpenPose;multi-scale modeling
Date:	2021-08-02
Issue Date:	2021-12-07 13:00:33 (UTC+8)
Publisher:	National Central University
Abstract:	人體姿勢轉移已應用於許多領域，例如用於人員重新識別的數據增強、動作識別、視頻合成和視頻編輯。然而，這仍然是一個具有挑戰性的問題，即模型必須具有生成新圖像的能力，同時保持與源圖像相同的體型和服裝，尤其是在源圖像人物姿勢和目標圖像姿勢完全不同的情況下。在潛在空間進行取樣來生成或編輯出全新的影像是目前人工智慧最受歡迎的應用之一，本論文開發出一套姿勢遷移系統，讓機器可以藉由人物圖像與目標姿態生成符合目標姿態的圖像。本篇論文提出了一個多尺度特徵融合姿勢轉移網路架構，融合不同尺度的特徵圖以豐富特徵資訊，並且透過漸進式的方式來彌補人物圖像在遷移時造成的資訊損失。本論文採用 Market-1501 資料集進行訓練以及測試，與之前的工作相比，我們的網絡在客觀定量分數方面表現出優異的性能。有效降低姿勢轉移中背景所帶來的影響，以及生成更細緻的圖像。在未來的研究中，希望可以改善遮蔽物對整體圖像的表現，對於衣服上的細節部分也需再進一步的優化。;Human pose transfer has been applied in many fields, such as data augmentation for person re-identification, action recognition, video synthesis, and video editing. However, it remains a challenging problem: the model must generate a new image while preserving the body shape and clothing of the source image, especially when the source and target poses differ greatly. Sampling in the latent space to create new images or edit existing ones is currently one of the most popular applications of AI. This thesis develops a pose transfer system that takes a person image and a target pose and generates an image of that person in the target pose. It proposes a multi-scale feature fusion pose transfer network architecture, which fuses feature maps of different scales to enrich feature information and uses a progressive approach to compensate for the information lost when the person image is transferred. The Market-1501 dataset is used for training and testing. Compared with previous work, our network shows excellent performance on objective quantitative metrics, effectively reduces the influence of the background on pose transfer, and generates more detailed images. In future research, we hope to improve how occlusions affect the overall image, and the details of clothing also need further optimization.
Appears in Collections:	[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	44
