Image inpainting is a challenging task in computer vision, and most previous studies relied on exemplar-based methods. With the rapid development of artificial intelligence, however, recent work has shown that deep-learning-based methods can achieve better inpainting results. In this thesis, we propose a two-stage Generative Adversarial Network (GAN) that performs coarse-to-fine image inpainting from a user-supplied image and mask. In the first stage, we apply Region Normalization (RN) to generate a coarse, blurry result with the correct structure. In the second stage, we use Contextual Attention to exploit the texture information of the surrounding regions and produce the final result. Although RN improves the network's performance and output quality, it can introduce visible color shifts. To address this problem, we incorporate Perceptual Color Distance into the loss function. Quantitative experiments show that the proposed method outperforms existing comparable methods in Inception Score, Fréchet Inception Distance, and Perceptual Color Distance.
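
The abstract's first-stage component, Region Normalization, normalizes features separately inside and outside the hole so that statistics from the corrupted region do not contaminate the valid region. The following is a minimal PyTorch sketch of that idea only; the published RN variants (e.g., those with mask-conditioned affine parameters) differ in detail, and the shared `gamma`/`beta` here are a simplifying assumption, not the thesis's exact design.

```python
import torch
import torch.nn as nn

class RegionNorm(nn.Module):
    """Sketch of Region Normalization: normalize the hole (mask == 0)
    and the valid region (mask == 1) with their own statistics."""
    def __init__(self, channels, eps=1e-5):
        super().__init__()
        self.eps = eps
        # One shared affine transform (an assumption; published variants
        # differ in how gamma/beta are produced and shared per region).
        self.gamma = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def _norm(self, x, region):
        # Mean/variance over spatial dims, restricted to `region` pixels.
        count = region.sum(dim=(2, 3), keepdim=True).clamp(min=1.0)
        mean = (x * region).sum(dim=(2, 3), keepdim=True) / count
        var = ((x - mean) ** 2 * region).sum(dim=(2, 3), keepdim=True) / count
        # Zero out the other region so the two results can be summed.
        return (x - mean) / torch.sqrt(var + self.eps) * region

    def forward(self, x, mask):
        # x: (N, C, H, W); mask: (N, 1, H, W), 1 = valid pixel, 0 = hole.
        normed = self._norm(x, mask) + self._norm(x, 1.0 - mask)
        return normed * self.gamma + self.beta
```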
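For the second stage, Contextual Attention reconstructs missing-region features as a weighted combination of surrounding patches, matched by normalized cross-correlation. Below is a simplified single-image sketch of that mechanism; the original formulation additionally restricts matching to valid background patches and propagates attention across neighbors, which this sketch omits. All names (`fg`, `bg`, `ksize`, `softmax_scale`) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def contextual_attention(fg, bg, mask, ksize=3, softmax_scale=10.0):
    """Sketch: fill fg's hole with a softmax-weighted paste-back of bg
    patches, matched by cosine similarity. Shapes assumed (1, C, H, W);
    batching and valid-patch filtering are omitted for brevity."""
    # Extract background patches and use them as convolution filters.
    patches = F.unfold(bg, ksize, padding=ksize // 2)         # (1, C*k*k, L)
    L = patches.shape[-1]
    w = patches.transpose(1, 2).reshape(L, -1, ksize, ksize)  # (L, C, k, k)
    w_norm = w / w.flatten(1).norm(dim=1).clamp(min=1e-4).view(-1, 1, 1, 1)
    # Cosine similarity between every fg location and every bg patch.
    score = F.conv2d(fg, w_norm, padding=ksize // 2)          # (1, L, H, W)
    attn = F.softmax(score * softmax_scale, dim=1)
    # Reconstruct by pasting back attended patches (transposed conv),
    # averaging the k*k overlapping contributions per pixel.
    out = F.conv_transpose2d(attn, w, padding=ksize // 2) / (ksize ** 2)
    # Keep known pixels (mask == 1); fill only the hole.
    return out * (1.0 - mask) + fg * mask
```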
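Finally, the color-shift fix adds a Perceptual Color Distance term to the loss, i.e., a distance measured in a perceptually uniform color space rather than raw RGB. As a stand-in illustration only, the sketch below uses the simple Delta-E 1976 (Euclidean distance in CIELAB); the thesis's actual metric may be a more refined formula such as CIEDE2000, and the conversion constants here assume sRGB input under a D65 white point.

```python
import torch

def rgb_to_lab(rgb):
    """sRGB (N, 3, H, W) in [0, 1] -> CIELAB, assuming D65 white point."""
    # Undo the sRGB gamma curve.
    lin = torch.where(rgb > 0.04045, ((rgb + 0.055) / 1.055) ** 2.4, rgb / 12.92)
    r, g, b = lin[:, 0], lin[:, 1], lin[:, 2]
    # Linear RGB -> XYZ (standard sRGB matrix), normalized by D65 white.
    x = (0.4124 * r + 0.3576 * g + 0.1805 * b) / 0.95047
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = (0.0193 * r + 0.1192 * g + 0.9505 * b) / 1.08883
    def f(t):
        return torch.where(t > 0.008856, t ** (1 / 3), 7.787 * t + 16 / 116)
    fx, fy, fz = f(x), f(y), f(z)
    L = 116 * fy - 16
    a = 500 * (fx - fy)
    b2 = 200 * (fy - fz)
    return torch.stack([L, a, b2], dim=1)

def perceptual_color_loss(pred, target):
    """Mean Delta-E 1976: per-pixel Euclidean distance in CIELAB."""
    diff = rgb_to_lab(pred) - rgb_to_lab(target)
    return diff.pow(2).sum(dim=1).sqrt().mean()
```

In training, such a term would typically be added to the usual reconstruction and adversarial losses with a weighting coefficient, penalizing color deviations that plain L1/L2 losses in RGB space under-weight.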