NCU Institutional Repository: Item 987654321/89914

    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/89914


    Title: 合成瑕疵電子元件影像的生成對抗網路 (Synthesis of Defect Images for Electronic Components using A Generative Adversarial Network)
    Author: 謝芯蓉 (Hsieh, Sin-Rong)
    Contributor: Department of Computer Science and Information Engineering
    Keywords: deep learning; generative adversarial network; defect image synthesis
    Date: 2022-08-09
    Upload time: 2022-10-04 12:04:45 (UTC+8)
    Publisher: National Central University
    Abstract: In the automated production lines of the electronics industry, combining automated optical inspection (AOI) with deep learning to replace traditional manual visual defect inspection not only reduces labor costs but also lowers the miss rate and speeds up inspection. For a deep learning system, training data is as important to performance as a good algorithm: if the training data are insufficient, the network weights cannot be reliably determined and the model performs poorly. Collecting training data is labor-intensive, and rare defects yield very few samples, so the training data are imbalanced.
    To make deep learning better suited to automated optical inspection, this study uses a conditional generative adversarial network (CGAN) to convert non-defective images of printed circuit boards into defective ones. By replicating the existing defects to generate more defect samples, it increases the amount of training data available to other deep learning systems, acting much like image data augmentation and improving inspection results.
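    The method builds on pix2pix (described below), whose conditional-GAN objective combines an adversarial term with an L1 reconstruction term. The following is a minimal sketch of one such training step, assuming a PyTorch-style setup; the generator signature G(clean, mask, z), the latent size, and the loss weight lambda_l1 are illustrative assumptions rather than the thesis's exact configuration.

    import torch
    import torch.nn.functional as F

    def train_step(G, D, opt_g, opt_d, clean, mask, defect, lambda_l1=100.0):
        # Latent vector that lets the generator vary the synthesized defect;
        # its size (8) is an assumption.
        z = torch.randn(clean.size(0), 8, device=clean.device)
        fake = G(clean, mask, z)  # synthesize a defective image

        # Discriminator update: real (clean, defect) pairs vs. synthesized pairs.
        opt_d.zero_grad()
        d_real = D(clean, defect)
        d_fake = D(clean, fake.detach())
        loss_d = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
                  + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
        loss_d.backward()
        opt_d.step()

        # Generator update: fool D while staying close to the real defect image.
        opt_g.zero_grad()
        d_fake = D(clean, fake)
        loss_g = (F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
                  + lambda_l1 * F.l1_loss(fake, defect))
        loss_g.backward()
        opt_g.step()
        return loss_g.item(), loss_d.item()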
    The training set consists of only 111 image pairs, each pair containing one defective sample and one non-defective sample of the same content. The data are expanded eightfold during training (see the sketch below). We manually annotate the defect locations and render them as masks, which are fed to the generator as additional information. At test time, changing the input mask and latent vector produces corresponding variations of the defect in the image, and changing the input non-defective image transfers the defect onto a specified background.
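    The abstract says the data are expanded eightfold during training but does not name the transforms. A common choice for square images is the four 90-degree rotations together with their horizontal mirrors (eight variants in total), applied identically to the clean image, the defective image, and the mask so each pair stays aligned; a small PyTorch sketch under that assumption:

    import torch

    def eightfold(img: torch.Tensor):
        """Yield the eight dihedral variants of a (C, H, W) tensor:
        four 90-degree rotations, each with and without a horizontal flip."""
        for k in range(4):
            rot = torch.rot90(img, k, dims=(1, 2))
            yield rot
            yield torch.flip(rot, dims=(2,))

    # Apply the same variant index to all three tensors of a training pair
    # so the clean image, defective image, and mask remain registered.
    # triples = list(zip(eightfold(clean), eightfold(defect), eightfold(mask)))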
    We use pix2pix as the base architecture. For convenience in practical deployment, we reduce the number of down-sampling steps in the generator to speed up inference. Generative adversarial networks usually require tens of thousands of training images and otherwise overfit easily; moreover, the two networks can differ greatly in strength during training, and because the paired images are not perfectly aligned, the synthesized results are often blurry. To address these problems, we raise the ratio of generator updates to discriminator updates, balancing the capability gap between the two networks; this stabilizes training and alleviates the blur in the synthesized images. Furthermore, using the location information provided by the mask, we compute the loss separately for the defect and the background, which preserves more of the original background detail and yields sharper synthesized images, reducing the FID from 68.49 to 49.27. Computing MAE and MSE with the same split reduces MAE from 5.08 to 1.44 and MSE from 57.00 to 4.94. Finally, adding a position attention module (PAM) makes the network focus more on generating the defect locations, further reducing MAE, MSE, and FID by 0.02, 0.26, and 0.25, respectively.
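    Two of the training changes above lend themselves to short sketches: the defect/background loss split and the generator-to-discriminator update ratio. Assuming a binary mask with 1 inside the defect, the reconstruction loss can be computed separately over the two regions; normalizing each term by its own pixel count keeps the small defect region from being drowned out by the much larger background. The weights, normalization, and helper names here are illustrative assumptions, not the thesis's exact formulation.

    import torch

    def masked_recon_loss(fake, real, mask, w_defect=1.0, w_background=1.0):
        """L1 loss computed separately inside (mask == 1) and outside the
        defect region, then recombined with per-region weights."""
        eps = 1e-8
        diff = torch.abs(fake - real)
        defect_term = (diff * mask).sum() / (mask.sum() + eps)
        bg = 1.0 - mask
        background_term = (diff * bg).sum() / (bg.sum() + eps)
        return w_defect * defect_term + w_background * background_term

    def train(loader, update_generator, update_discriminator, g_steps_per_d=2):
        """Several generator updates per discriminator update; the ratio and
        the update callables are placeholders for the respective steps."""
        for batch in loader:
            update_discriminator(batch)
            for _ in range(g_steps_per_d):
                update_generator(batch)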
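    The abstract does not give the internals of the position attention module; PAM commonly refers to the self-attention block of Fu et al.'s Dual Attention Network (CVPR 2019), in which every spatial position attends to every other, which would let the generator relate the masked defect region to the rest of the image. A sketch in that style, with the customary channel reduction by eight; the details are assumptions:

    import torch
    import torch.nn as nn

    class PositionAttention(nn.Module):
        """Self-attention over spatial positions with a learned residual
        weight, following the usual PAM formulation."""
        def __init__(self, channels: int):
            super().__init__()
            self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
            self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
            self.value = nn.Conv2d(channels, channels, kernel_size=1)
            self.gamma = nn.Parameter(torch.zeros(1))  # starts as identity mapping

        def forward(self, x):
            b, c, h, w = x.shape
            q = self.query(x).flatten(2).transpose(1, 2)  # (B, HW, C/8)
            k = self.key(x).flatten(2)                    # (B, C/8, HW)
            attn = torch.softmax(q @ k, dim=-1)           # (B, HW, HW)
            v = self.value(x).flatten(2)                  # (B, C, HW)
            out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
            return self.gamma * out + x                   # residual connection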
    Appears in Collections: [Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations

    Files in this item:

    File: index.html (HTML, 0 KB, 37 views)


    All items in NCUIR are protected by copyright, with all rights reserved.

