Synthesis of Defect Images for Electronic Components using A Generative Adversarial Network

Name

Sin-Rong Hsieh (謝芯蓉)

Department

Department of Computer Science and Information Engineering

Thesis Title

Synthesis of Defect Images for Electronic Components using A Generative Adversarial Network
(Chinese title: 合成瑕疵電子元件影像的生成對抗網路)

Files

Browse the thesis in the system (open after 2025-7-26)

Abstract (translated from the Chinese)

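In automated production in the electronics industry, combining automated optical inspection (AOI) with deep learning to replace traditional manual visual defect inspection not only lowers labor costs but also reduces the miss rate and speeds up inspection. For a deep learning system, a good algorithm improves detection accuracy, but the training data is also an important factor in network performance. If the training data are insufficient, the network weights cannot be determined and the network performs poorly. Collecting training data is labor-intensive, and few samples are available for rare defects, so training suffers from data imbalance.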
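To make deep learning better suited to automated optical inspection, this study uses a conditional generative adversarial network (CGAN) to convert non-defective images of printed circuit boards into defective images. Replicating existing defects produces more defect samples and increases the training data available to other deep learning systems, which works much like image data augmentation and improves inspection results.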
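The training set contains only 111 image pairs; in each pair, one image is a defective sample and the other shows the same content without the defect. During training the data are augmented eightfold. We manually mark the defect locations and draw them as masks, which are fed to the generator as additional information. At test time, changing the input mask and vector produces corresponding variations in the defect, and changing the input non-defective image transfers the defect onto a specified background.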
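The abstract does not say how the eightfold expansion is obtained. The sketch below assumes one common choice, the four 90-degree rotations combined with an optional horizontal flip, and applies the same transform to the non-defective image, the defective image, and the mask so that each pair stays aligned. The function name `eightfold` and the toy arrays are hypothetical.

```python
import numpy as np

def eightfold(clean, defect, mask):
    """Return 8 aligned (clean, defect, mask) triples.

    Assumption: the eight variants are 4 rotations x {no flip, horizontal flip};
    the thesis abstract only states that the data are expanded eightfold.
    """
    out = []
    for k in range(4):                      # 0, 90, 180, 270 degrees
        for flip in (False, True):
            triple = []
            for arr in (clean, defect, mask):
                a = np.rot90(arr, k, axes=(0, 1))
                if flip:
                    a = np.flip(a, axis=1)  # mirror left-right
                triple.append(a.copy())
            out.append(tuple(triple))
    return out

# Toy pair: the defect is a dark patch, and the mask marks where it is.
rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
defect = clean.copy()
defect[2:4, 3:6] = 0
mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:4, 3:6] = 1

variants = eightfold(clean, defect, mask)
```

Because all three arrays go through the identical transform, the mask in every variant still marks exactly the pixels where the two images differ.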
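The architecture is based on pix2pix. For convenience in practical use, we reduce the number of down-sampling steps in the generator to speed up execution. Generative adversarial networks usually need tens of thousands of training images and otherwise overfit easily. In addition, the two networks may differ greatly in strength during training, and the paired images are not well aligned, so the synthesized results are often blurry. To address these problems, we raise the ratio of generator training steps to discriminator training steps, which balances the two networks, stabilizes training, and reduces blurring. We also use the location information in the mask to handle the defect and the background separately when computing the loss, which preserves more of the original background detail and makes the synthesized images sharper; this lowers the FID from 68.49 to 49.27. When MAE and MSE are computed separately in the same way, MAE drops from 5.08 to 1.44 and MSE from 57.00 to 4.94. Finally, adding a position attention module (PAM) makes the network focus more on generating the defect region and lowers MAE, MSE, and FID by a further 0.02, 0.26, and 0.25, respectively.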
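The idea of treating the defect and the background separately can be illustrated with a small NumPy sketch. It computes MAE and MSE inside and outside the mask and combines the two MAE values into a weighted L1 term. The function names and the weights `w_defect` and `w_background` are hypothetical; the abstract does not give the exact loss formulation or weighting used in the thesis.

```python
import numpy as np

def region_mae_mse(pred, target, mask):
    """Return {'defect': (MAE, MSE), 'background': (MAE, MSE)} split by the mask."""
    pred = pred.astype(np.float64)
    target = target.astype(np.float64)
    m = mask.astype(bool)
    if pred.ndim == 3:                       # broadcast an (H, W) mask over channels
        m = np.broadcast_to(m[..., None], pred.shape)
    diff = pred - target

    def stats(sel):
        if not sel.any():                    # empty region contributes nothing
            return 0.0, 0.0
        d = diff[sel]
        return float(np.abs(d).mean()), float((d ** 2).mean())

    return {"defect": stats(m), "background": stats(~m)}

def split_l1_loss(pred, target, mask, w_defect=1.0, w_background=1.0):
    """Weighted sum of the per-region MAE values (weights are illustrative)."""
    r = region_mae_mse(pred, target, mask)
    return w_defect * r["defect"][0] + w_background * r["background"][0]

# Toy example: error of 3 inside the 2x2 defect region, error of 1 elsewhere.
target = np.zeros((4, 4))
pred = np.ones((4, 4))
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1
pred[1:3, 1:3] = 3.0

stats = region_mae_mse(pred, target, mask)
loss = split_l1_loss(pred, target, mask, w_defect=1.0, w_background=10.0)
```

Giving the background a larger weight, as in the toy call, penalizes any change to pixels outside the mask more heavily, which is one way to encourage the generator to leave the original background intact.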

Abstract (English)

In automated production in the electronics industry, combining automated optical inspection (AOI) with deep learning can replace traditional manual visual defect inspection, which reduces labor costs, lowers the error rate, and speeds up inspection. For a deep learning system, a good algorithm improves inspection accuracy, but training data is also an important factor in network performance. If the training data are insufficient, the network weights cannot be determined, and the network performs poorly. Collecting training data is labor-intensive, and only a few samples can be obtained for rare defects, which causes data imbalance when training the network.
To apply deep learning more effectively to automated optical inspection, this study uses a conditional generative adversarial network (CGAN) to convert non-defective images of printed circuit boards into defective images. Duplicating the original defects generates more defect samples and increases the amount of training data available to other deep learning systems; the effect is similar to image data augmentation and improves inspection results.
The training set consists of only 111 image pairs; each pair contains a defective sample and a non-defective sample with the same content. The data are expanded by a factor of eight during training. We manually mark the defect locations and draw them as masks, which are given to the generator as input to provide more information. In the testing phase, changing the input masks and vectors changes the defects in the images accordingly. We can also change the input non-defective image to move the defect onto a specified background.
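As a rough illustration of how an image, a mask, and a vector might be presented to a generator together, the sketch below stacks them along the channel axis. This is an assumption: the abstract does not state how the vector is injected, and tiling it over every pixel is only one approach used in multimodal image-to-image translation. The name `build_generator_input` and the vector length of 4 are hypothetical.

```python
import numpy as np

def build_generator_input(clean_rgb, mask, z):
    """Stack image, mask, and a spatially tiled vector into one (H, W, C) array.

    Assumption: the vector is tiled over all pixels and concatenated as extra
    channels; the thesis may inject it differently.
    """
    h, w, _ = clean_rgb.shape
    img = clean_rgb.astype(np.float32) / 127.5 - 1.0        # scale to [-1, 1]
    m = mask.astype(np.float32)[..., None]                  # (H, W, 1)
    z_map = np.broadcast_to(z.astype(np.float32), (h, w, z.shape[0]))
    return np.concatenate([img, m, z_map], axis=-1)

rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
mask = np.zeros((16, 16), dtype=np.uint8)
mask[4:8, 6:10] = 1                                         # where the defect should appear
z = rng.standard_normal(4)                                  # controls the defect variation

x = build_generator_input(clean, mask, z)                   # shape (16, 16, 3 + 1 + 4)
```

With this layout, swapping `clean` changes the background, editing `mask` moves or reshapes the defect, and resampling `z` varies its appearance, which matches the three test-time controls described above.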
We use pix2pix as the base architecture. For convenience in practical applications, we reduce the number of down-sampling steps in the generator to speed up execution. Generative adversarial networks usually require tens of thousands of training images; otherwise they overfit easily. In addition, the capabilities of the two networks may differ greatly during training, and the paired images are not well aligned, so the synthetic results are often blurred. To address these problems, we increase the ratio of generator training steps to discriminator training steps to balance the ability gap between the two networks, which makes training more stable and alleviates the blurring of the synthetic images. Furthermore, using the location information provided by the mask, we process the defect and the background separately when calculating the loss, which preserves more of the original background detail and makes the synthetic images clearer. This method reduces FID from 68.49 to 49.27. If MAE and MSE are calculated separately in the same way, MAE is reduced from 5.08 to 1.44 and MSE from 57.00 to 4.94. Finally, adding a position attention module (PAM) lets the network focus more on generating the defect locations, which reduces MAE, MSE, and FID by a further 0.02, 0.26, and 0.25, respectively.
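For readers unfamiliar with position attention, the following NumPy sketch shows the general form of such a module: every spatial position attends to every other position, and the attended features are added back to the input through a scale factor. It is a minimal sketch, not the thesis implementation; the 1x1 convolutions are written as plain matrix products, and the weight names `wq`, `wk`, `wv` and the scalar `gamma` are hypothetical. Where the module sits inside the generator is not stated in the abstract.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def position_attention(feat, wq, wk, wv, gamma):
    """Position attention over a (C, H, W) feature map.

    wq, wk: (C_reduced, C) and wv: (C, C) stand in for 1x1 convolutions.
    gamma scales the attended features before the residual addition.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)                # (C, N) with N = H * W positions
    q = wq @ x                                # (C_reduced, N)
    k = wk @ x                                # (C_reduced, N)
    v = wv @ x                                # (C, N)
    attn = softmax(q.T @ k, axis=-1)          # (N, N); row i weights all positions j
    out = v @ attn.T                          # out[:, i] = sum_j attn[i, j] * v[:, j]
    return (gamma * out + x).reshape(c, h, w), attn

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 5))
wq = rng.standard_normal((2, 8))
wk = rng.standard_normal((2, 8))
wv = rng.standard_normal((8, 8))

out0, attn = position_attention(feat, wq, wk, wv, gamma=0.0)   # identity mapping
out1, _ = position_attention(feat, wq, wk, wv, gamma=0.5)      # attention mixed in
```

With `gamma` at zero the module passes its input through unchanged, so a network can start from the plain architecture and learn how much attention to mix in.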

Keywords

★ deep learning
★ generative adversarial network
★ defect image synthesis

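Table of Contents

Abstract (Chinese)
Abstract (English)
Acknowledgments
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 System Architecture
1.3 System Features
1.4 Thesis Organization
Chapter 2 Related Work
2.1 Generative Adversarial Networks
2.2 Style Transfer
2.3 Image-to-Image Translation
2.4 Image Data Augmentation
2.5 Attention Mechanisms
Chapter 3 Improved Conditional Generative Adversarial Network
3.1 pix2pix Network Architecture
3.2 Modifications to the pix2pix Architecture
3.3 Generator
3.4 Discriminator
3.5 Loss Function
Chapter 4 Experiments
4.1 Experimental Equipment and Development Environment
4.2 Image Dataset
4.3 Image Preprocessing
4.4 Training Details
4.5 Testing Details
4.6 Evaluation Criteria
4.7 Experimental Results
Chapter 5 Conclusions and Future Work
References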

Advisor

Din-Chang Tseng (曾定章)

Approval Date

2022-8-9
