Master's and Doctoral Theses: Detailed Record 110522087




Name: Zhong-Yu Fan (范仲瑜)    Graduating department: Department of Computer Science and Information Engineering
Thesis title: Deep learning system for reproducing variation-controllable defects on electronic-component images
(Chinese title: 在電子元件影像上複製可控制變異瑕疵的深度學習系統)
Related theses
★ Video error concealment for large areas and scene changes
★ Force-feedback correction and rendering in a virtual haptic system
★ Multispectral satellite image fusion and infrared image synthesis
★ A laparoscopic cholecystectomy surgery simulation system
★ Dynamically loaded multiresolution terrain modeling in a flight simulation system
★ Wavelet-based multiresolution terrain modeling and texture mapping
★ Multiresolution optical flow analysis and depth computation
★ Volume-preserving deformation modeling for laparoscopic surgery simulation
★ Interactive multiresolution model editing techniques
★ Wavelet-based multiresolution edge tracking for edge detection
★ Multiresolution modeling based on quadric error and attribute criteria
★ Progressive image compression based on integer wavelet transform and grey theory
★ Tactical simulation built on dynamically loaded multiresolution terrain modeling
★ Face detection and feature extraction using multilevel-segmented spatial relations
★ Wavelet-based image watermarking and compression
★ Appearance-preserving and view-dependent multiresolution modeling
Files: [EndNote RIS format]    [BibTeX format]    Browse the thesis in the system (full text available after 2028-7-31)
Abstract (Chinese) Deep learning has been widely applied across many fields; in recent years, numerous industries have adopted it to raise operational efficiency and precision, especially in image-based applications. Among deep learning training strategies, supervised learning is the ideal choice, but real-world datasets often have imbalanced ratios of positive to negative samples, or simply too little data, so supervised learning performs poorly. Early on, data augmentation such as image flipping and rotation was used to make up for insufficient data, but this easily produces samples that are implausible or would never occur in practice, which in turn misleads the network's learning. To bring augmented data closer to real datasets, generative models are one of the key technologies.

In defect-inspection tasks, defective samples are often in short supply, so network models are usually trained with semi-supervised or unsupervised methods. We therefore study how to transfer defect samples, and propose a defect-image transfer system that can adjust the defect regions of an image while the defects are being transferred.

Our network model is adapted from the pix2pix conditional generative adversarial network previously optimized in our laboratory. To gain better control over the defect regions of transferred images, we redefine the condition vector at the input: the value of its first dimension governs how bright or dark the defect region becomes. Our proposed defect transfer network (defect reproducing GAN, DRPGAN) makes two main improvements: i. a brightness-variation algorithm that sets the first dimension of the condition vector during training, so that at test time this value controls the brightness of the defect region in the image; ii. a default-value algorithm for the first dimension of the condition vector at test time.
Our experiments mainly train and test on keyboard-key images: 218 image groups in total, each containing a non-defective image, a defective image, and a defect mask; every non-defective image in the dataset was obtained by manually editing the corresponding defective image to repair the defect. To maintain the overall quality of the transferred images, we do not split the dataset into training and test sets; all data are used for training, and the testing phase focuses only on changes within the defect region. On the keyboard-key images, adjusting the value of the condition vector's first dimension controls the brightness of the defect region, and the defect mask supplies the position and shape, so the desired defect can be transferred onto a non-defective image.

To validate the self-designed value algorithm for the first dimension of the condition vector, we additionally collected a different type of image dataset for training and testing. The new dataset has 301 image groups, each with a non-defective image, a defective image, and a defect mask; here the non-defective image is a background extremely similar to the defective image. The test results show that on both the new and the old dataset, adjusting the first-dimension value of the condition vector clearly controls the brightness variation of the defect region.

Finally, we feed these transferred defect images to an existing classifier, EfficientNet-b0, as training samples. At test time the classifier correctly classifies 70-80% of the real defect samples, confirming that the defect images produced by our defect transfer network can serve as one source of defect-image data augmentation.
Abstract (English) Deep learning is widely used across application fields. In recent years, many industries have introduced this technology to improve operational efficiency and accuracy, especially where it is combined with images. Among deep learning training strategies, supervised learning is the ideal approach. In practice, however, the ratio of positive to negative samples in a dataset is often unbalanced, or the amount of data is insufficient, so supervised learning performs poorly. Early on, people used image flipping and rotation to augment data and compensate for its scarcity, but such operations easily create samples that are unrealistic or would never occur, misleading the learning of the network model. To make augmented data closer to real datasets, generative models are one of the key technologies.
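
To make this criticism concrete, here is a minimal Pillow-based sketch of the classical flip/rotate augmentation; the function and file names are illustrative, not from the thesis. For keyboard-key images, a horizontal flip mirrors the printed legend into a glyph that never appears on a real keyboard, which is exactly the kind of implausible sample that can mislead a supervised model.

```python
# Minimal sketch of classical flip/rotate augmentation (not the thesis code).
from PIL import Image

def naive_augment(img: Image.Image) -> list[Image.Image]:
    """Return flipped/rotated variants of one image."""
    return [
        img.transpose(Image.Transpose.FLIP_LEFT_RIGHT),  # mirrors printed glyphs
        img.transpose(Image.Transpose.FLIP_TOP_BOTTOM),
        img.rotate(90, expand=True),
        img.rotate(180),
    ]

# Usage (the file name is hypothetical):
# variants = naive_augment(Image.open("key_R.png"))
```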

In defect-detection tasks, there is often too little data for defective samples, so network models are usually trained with semi-supervised or unsupervised learning methods. We therefore focus on how to reproduce defective samples and propose a system for reproducing defective images; while reproducing an image, the system can adjust its defective regions.

Our network model is adapted from the pix2pix conditional generative adversarial network previously optimized in our laboratory. To better control the defective regions of the transferred image, we redefine the input condition vector: the value of its first dimension determines the brightness variation of the defective region. The main improvements of our proposed defect transfer network (defect reproducing GAN, DRPGAN) are: i. a brightness-variation algorithm that sets the first dimension of the condition vector during training, so that at test time this value controls the brightness of the defective area in the image; ii. a default-value algorithm for the first dimension of the condition vector in the testing phase.
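
The following is a minimal PyTorch sketch, under our own assumptions (an 8-dimensional condition vector and a toy one-layer encoder/decoder), of how a pix2pix-style generator can consume a condition vector whose first dimension encodes defect brightness. It illustrates the conditioning mechanism only and is not the DRPGAN architecture.

```python
# Sketch of condition-vector injection into a generator (our assumption).
import torch
import torch.nn as nn

class ConditionedGenerator(nn.Module):
    def __init__(self, cond_dim: int = 8):
        super().__init__()
        self.encode = nn.Conv2d(3 + cond_dim, 64, kernel_size=4, stride=2, padding=1)
        self.decode = nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1)

    def forward(self, image: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # Broadcast the condition vector over the spatial grid and concatenate
        # it with the input image, as in many conditional GAN variants.
        b, _, h, w = image.shape
        cond_map = cond.view(b, -1, 1, 1).expand(b, cond.size(1), h, w)
        x = torch.relu(self.encode(torch.cat([image, cond_map], dim=1)))
        return torch.tanh(self.decode(x))

# cond[:, 0] is the dimension the thesis reserves for brightness control:
# sweeping it at test time lightens or darkens the defect region.
cond = torch.zeros(1, 8)
cond[0, 0] = 0.7  # brighter defect (the value range is an assumption)
out = ConditionedGenerator()(torch.rand(1, 3, 64, 64), cond)
```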

Our experiments mainly train and test on keyboard-key images: 218 image groups in total, each containing a non-defective image, a defective image, and a defect mask. The non-defective images in the dataset were obtained by manually editing the defective images to repair the defects. To maintain the overall quality of the reproduced images, the dataset is not split into training and test sets; all data are used for training, and the testing phase focuses only on changes in the defective areas. On the keyboard-key images, we can adjust the value of the condition vector's first dimension to control the brightness of the defective areas, and use the defect masks to supply location and shape, transferring the desired defects onto non-defective images.
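
The role of the defect mask in supplying location and shape can be pictured as plain alpha compositing. The sketch below is our own illustration of that idea, not the mapping the network actually learns:

```python
# Minimal alpha-compositing sketch (our illustration, not the thesis code).
import numpy as np

def composite_defect(clean: np.ndarray, defect: np.ndarray,
                     mask: np.ndarray, gain: float = 1.0) -> np.ndarray:
    """clean, defect: HxWx3 floats in [0, 1]; mask: HxW binary {0, 1}.

    `gain` plays the role of the brightness control: >1 lightens the
    pasted defect, <1 darkens it.
    """
    m = mask[..., None].astype(np.float32)       # HxWx1 for broadcasting
    adjusted = np.clip(defect * gain, 0.0, 1.0)  # brightness-adjusted defect
    return (1.0 - m) * clean + m * adjusted      # mask gives position and shape
```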

To validate the self-designed value algorithm for the condition vector's first dimension, we also collected a different type of image dataset for training and testing. The new dataset has 301 image groups, each with a non-defective image, a defective image, and a defect mask, in which each non-defective image is a background highly similar to its defective counterpart. The test results show that on both the new and the old dataset, adjusting the first-dimension value clearly controls the brightness variation of the defective areas.

Finally, we provide these reproduced defect images to an existing classifier, EfficientNet-b0, as training samples; during the testing phase the classifier successfully classifies 70-80% of the real defect samples. This confirms that the defect images reproduced by our defect transfer network can serve as one source of defect-image data augmentation.
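
As a sketch of this evaluation setup, the snippet below fine-tunes torchvision's EfficientNet-b0 on a folder that merges real images with DRPGAN-reproduced defects; the folder name, hyperparameters, and the torchvision implementation are all our assumptions, not details from the thesis.

```python
# Sketch: train EfficientNet-b0 on real + reproduced defect images.
import torch
from torch import nn
from torchvision import datasets, models, transforms

tf = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
# Hypothetical folder with class subdirectories (defect / non-defect),
# containing real images merged with DRPGAN-reproduced defect images.
train_set = datasets.ImageFolder("train_with_reproduced_defects", transform=tf)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.efficientnet_b0(weights=None)
model.classifier[1] = nn.Linear(model.classifier[1].in_features, 2)  # binary head
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for images, labels in loader:  # one epoch shown for brevity
    opt.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    opt.step()
```
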
Keywords (Chinese) ★ defect transfer
★ variation-controllable defects
Keywords (English)
Thesis outline: Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1  Introduction
  1.1  Research motivation and purpose
  1.2  System architecture
  1.3  System features
  1.4  Thesis organization
Chapter 2  Related Work
  2.1  Generative adversarial networks in deep learning
  2.2  Attention mechanisms
Chapter 3  The Improved Conditional Generative Adversarial Network
  3.1  The pix2pix architecture previously optimized in our laboratory
  3.2  The defect transfer network DRPGAN
  3.3  The adaptively adjusted condition vector
Chapter 4  Experiments
  4.1  Equipment and development environment
  4.2  Image datasets
  4.3  Dataset preprocessing
  4.4  Experimental details
  4.5  Evaluation criteria
  4.6  Experimental results
Chapter 5  Conclusions and Future Work
References
References
[1] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol.521, no.7553, pp.436-444, 2015.
[2] M. Zhang, J. Wu, H. Lin, P. Yuan, and Y. Song, “The application of one-class classifier based on CNN in image defect detection,” Procedia Computer Science, vol.114, pp.341-348, 2017.
[3] R. Chalapathy and S. Chawla, “Deep learning for anomaly detection: a survey,” arXiv:1901.03407.
[4] M. Rudolph, B. Wandt, and B. Rosenhahn, “Same same but DifferNet: semi-supervised defect detection with normalizing flows,” arXiv:2008.12577v1.
[5] Y. Teng, H. Li, F. Cai, M. Shao, and S. Xia, “Unsupervised visual defect detection with score-based generative model,” arXiv:2211.16092.
[6] C. Shorten and T. M. Khoshgoftaar, “A survey on image data augmentation for deep learning,” Journal of Big Data, vol.6, no.1, pp.1-48, 2019.
[7] M. Mirza and S. Osindero, “Conditional generative adversarial nets,” arXiv:1411.1784v1.
[8] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-image translation with conditional adversarial networks,” arXiv:1611.07004v3.
[9] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: convolutional networks for biomedical image segmentation,” arXiv:1505.04597v1.
[10] I. J. Myung, “Tutorial on maximum likelihood estimation,” Journal of Mathematical Psychology, vol.47, no.1, pp.90-100, 2003.
[11] C.-Y. Liou, J.-C. Huang, and W.-C. Yang, “Modeling word perception using the Elman network,” Neurocomputing, vol.71, no.16-18, pp.3150-3157, 2008.
[12] D. P. Kingma and M. Welling, “Auto-encoding variational Bayes,” arXiv:1312.6114.
[13] L. Dinh, D. Krueger, and Y. Bengio, “NICE: non-linear independent components estimation,” arXiv:1410.8516.
[14] L. Dinh, J. Sohl-Dickstein, and S. Bengio, “Density estimation using Real NVP,” arXiv:1605.08803.
[15] D. P. Kingma and P. Dhariwal, “Glow: generative flow with invertible 1x1 convolutions,” arXiv:1807.03039.
[16] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” arXiv:1406.2661.
[17] J. Shlens, “A tutorial on principal component analysis,” arXiv:1404.1100.
[18] S. Odaibo, “Tutorial: deriving the standard variational autoencoder (VAE) loss function,” arXiv:1907.08956.
[19] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired image-to-image translation using cycle-consistent adversarial networks,” arXiv:1703.10593v7.
[20] A. Vahdat, K. Kreis, and J. Kautz, “Score-based generative modeling in latent space,” arXiv:2106.05931.
[21] J. Yu, Y. Zheng, X. Wang, W. Li, Y. Wu, R. Zhao, and L. Wu, “FastFlow: unsupervised anomaly detection and localization via 2D normalizing flows,” arXiv:2111.07677.
[22] Y. Wang, R. Wan, W. Yang, H. Li, L. P. Chau, and A. C. Kot, “Low-light image enhancement with normalizing flow,” arXiv:2109.05923.
[23] A. Radford, L. Metz, and S. Chintala, “Unsupervised representation learning with deep convolutional generative adversarial networks,” arXiv:1511.06434v2.
[24] V. Nair and G. E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” in Proc. of Int. Conf. on Machine Learning (ICML), Haifa, Israel, Jun.21-24, 2010, pp.807-814.
[25] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” arXiv:1411.4038v2.
[26] S.-R. Hsieh, Synthesis of Defect Images for Electronic Components using A Generative Adversarial Network, Master’s Thesis, Dept. of Computer Sci. and Information Eng., National Central University, Chung-li, Taiwan, Jun. 2022.
[27] M. Zeiler, D. Krishnan, G. Taylor, and R. Fergus, “Deconvolutional networks,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, Jun.13-18, 2010, pp.2528-2535.
[28] X. Mao, Q. Li, H. Xie, R. Y. K. Lau, Z. Wang, and S. P. Smolley, “Least squares generative adversarial networks,” arXiv:1611.04076v3.
[29] H. Zhang, I. Goodfellow, D. Metaxas, and A. Odena, “Self-attention generative adversarial networks,” arXiv:1805.08318v2.
[30] J. Hu, L. Shen, S. Albanie, G. Sun, and E. Wu, “Squeeze-and-excitation networks,” arXiv:1709.01507v4.
[31] S. Woo, J. Park, J.-Y. Lee, and I. Kweon, “CBAM: convolutional block attention module,” arXiv:1807.06521v2.
[32] J. Fu, J. Liu, H. Tian, Y. Li, Y. Bao, Z. Fang, and H. Lu, “Dual attention network for scene segmentation,” arXiv:1809.02983v4.
[33] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter, “GANs trained by a two time-scale update rule converge to a local Nash equilibrium,” arXiv:1706.08500v6.
[34] A. L. Maas, A. Y. Hannun, and A. Y. Ng, “Rectifier nonlinearities improve neural network acoustic models,” in Proc. of Int. Conf. on Machine Learning (ICML), Atlanta, GA, Jun.16-21, 2013, pp.1-6.
[35] D. Ulyanov, A. Vedaldi, and V. Lempitsky, “Instance normalization: the missing ingredient for fast stylization,” arXiv:1607.08022v3.
[36] D. P. Kingma and J. Ba, “Adam: a method for stochastic optimization,” arXiv:1412.6980v9.
[37] M. Tan and Q. V. Le, “EfficientNet: rethinking model scaling for convolutional neural networks,” arXiv:1905.11946.
Advisor: Din-Chang Tseng (曾定章)    Review date: 2023-7-25