Thesis title 使用生成對抗學習的全卷積網路移除影像中的外嵌文字
(Removing Embedded Text in Images via Fully Convolutional Networks with Generative Adversarial Learning)
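Abstract (Chinese) Images with added text are among the most widely used media on the web. For example, netizens produce large numbers of memes for many different purposes. In some cases, however, the added text spoils the appearance of the image and makes other applications, such as scene recognition and object classification, more difficult. The main goal of this study is therefore to propose a system that automatically removes embedded text from an image and completes the image.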
Abstract (English) An image with embedded text is one of the most common 2D media on the web; for example, netizens produce many such pictures, or memes, for different purposes. In some situations, the added text spoils an otherwise attractive picture and prevents the image from being used for other purposes, such as scene recognition or object classification. In this study, we therefore aim to propose a system that automatically removes the text from a given image and inpaints, or restores, the image.
With recent advances in computing technology, deep learning architectures can be applied to the inpainting problem and achieve better results than several traditional methods. The proposed system consists of two modules built with recent deep learning frameworks. The first, the mask generation module, automatically detects the embedded text in a given image and produces the corresponding bitmap mask. The second, the image completion module, inpaints the corrupted image based on the given mask.
In the experiments, we compare our results with two well-established methods that do not use deep learning. We show that the proposed method produces more natural results with fewer flaws than these classic image inpainting methods.
Keywords ★ image inpainting
★ deep learning
★ generative adversarial network
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 System overview
1.3 Thesis organization
Chapter 2 Related Works
2.1 Image inpainting
2.1.1 Diffusion-based methods
2.1.2 Exemplar-based methods
2.1.3 Others
2.2 Deep learning
2.2.1 Convolutional neural networks
2.2.2 Fully convolutional networks
2.2.3 Generative adversarial nets
Chapter 3 Methods
3.1 System overview
3.1.1 Mask generation module
3.1.2 Image completion module
3.1.3 Overall architecture
3.2 Training
3.2.1 Loss functions
3.2.2 Learning algorithm
Chapter 4 Experiments
4.1 Dataset
4.1.1 Build training dataset
4.1.2 Preprocessing
4.2 Environment setting
4.3 Results
4.3.1 Results on mask generation module
4.3.2 Results on image completion module
Chapter 5 Evaluation and Comparison
Chapter 6 Conclusion and Future Works
References
