Defect segmentation for X-ray images of electronic components using modified harmonic DenseNet

Name

Shiang-Yu Tsai (蔡翔宇)

Department

Department of Computer Science and Information Engineering

Thesis title

Defect segmentation for X-ray images of electronic components using modified harmonic DenseNet
(修改和諧密集連接網路做電子元件X-ray影像的瑕疵分割)

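Related theses

★ Video error concealment for large-area loss and scene changes
★ Force feedback correction and rendering in a virtual haptic system
★ Multispectral satellite image fusion and infrared image synthesis
★ A simulation system for laparoscopic cholecystectomy
★ Dynamically loaded multiresolution terrain modeling in a flight simulation system
★ Wavelet-based multiresolution terrain modeling and texture mapping
★ Multiresolution optical flow analysis and depth computation
★ Volume-preserving deformable modeling for laparoscopic surgery simulation
★ Interactive multiresolution model editing
★ Wavelet-based multiresolution edge tracking for edge detection
★ Multiresolution modeling based on quadric error and attribute criteria
★ Progressive image compression based on the integer wavelet transform and grey theory
★ Tactical simulation built on dynamically loaded multiresolution terrain modeling
★ Face detection and feature extraction using spatial relations from multilevel segmentation
★ Wavelet-based image watermarking and compression
★ Appearance-preserving and view-dependent multiresolution modeling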

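The access permission for this electronic thesis is immediate open access.
Full texts that have reached their open-access date are licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, to avoid violating the law.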

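Abstract (translated from the Chinese)

Electronic components are the basic building blocks of all electronic products, and their quality strongly affects the quality of those products; controlling the quality of components soldered onto printed circuit boards is therefore one of the important issues for the industry. Abnormal conditions are unavoidable in any manufacturing process, so detecting defects in the electronic components on a printed circuit board is an important part of controlling the outgoing quality of the printed circuit board assembly (PCBA).
In recent years, deep learning has advanced rapidly and has performed outstandingly across industries. Automated optical inspection (AOI) and automated visual inspection (AVI) are no exception: deep learning has been widely adopted in both fields to improve the defect detection rate and the screening rate of products at the same time.
Deep learning techniques for inspecting electronic component defects on printed circuit boards include recognition, detection, segmentation, and anomaly detection. In this thesis, we use semantic segmentation to locate and classify the defective regions of electronic components on printed circuit boards.
We modified the harmonic densely connected network (HarDNet-MSEG) for defect segmentation and classification in X-ray images of electronic components. The modifications are: i. the decoder is designed in the form of UNet++ and uses feature maps at five resolution levels, which helps find small defects stably and delineate boundaries more accurately; ii. the receptive field block is modified by reducing its complex convolutions, which helps capture small-range features; iii. an attention module is added to the deepest layer of the encoder, so that the network suppresses unnecessary stimuli and focuses more on important features.
In the experiments, we collected 979 X-ray images of electronic components with defects and divided them into a training set of 881 images and a test set of 98 images; during training, the training data were augmented eightfold. The original HarDNet-MSEG achieved an MIoU of 91.53%, recall of 95.21%, and precision of 95.69% on the training set, and an MIoU of 78.23%, recall of 83.67%, and precision of 92.05% on the test set. After our modifications, the training-set MIoU is 95.27%, recall 97.86%, and precision 97.83%, and the test-set MIoU is 86.56%, recall 92.59%, and precision 93.95%.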
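The split and augmentation arithmetic described above can be illustrated with a small sketch. The record does not say how the 979 images were partitioned or which transformations produced the eightfold augmentation, so everything here is an assumption for illustration: a seeded random hold-out, and the eight dihedral variants of an image (four rotations, each with and without a horizontal flip). The function names are hypothetical.

```python
import random


def split_dataset(items, test_count, seed=0):
    """Shuffle reproducibly, then hold out `test_count` items for testing.

    Assumption: a simple random hold-out; the thesis record does not state
    how the split was made.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[test_count:], shuffled[:test_count]


def rotate90(img):
    """Rotate a 2D list of pixels 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img):
    """Mirror a 2D list of pixels left to right."""
    return [row[::-1] for row in img]


def dihedral_augment(img):
    """Return 8 variants: 4 rotations, each with and without a flip.

    Assumption: one plausible eightfold scheme, not necessarily the one
    used in the thesis.
    """
    variants = []
    current = [list(row) for row in img]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants


train_ids, test_ids = split_dataset(range(979), test_count=98)
variants = dihedral_augment([[1, 2], [3, 4]])
print(len(train_ids), len(test_ids), len(train_ids) * len(variants))
# prints: 881 98 7048
```

For segmentation, the same geometric transformation would also have to be applied to each image's label mask so that pixels and labels stay aligned.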

Abstract (English)

Electronic components are the fundamental elements of all electronic products, and their quality strongly affects the quality of those products. Maintaining the quality of electronic components soldered onto printed circuit boards is therefore an important issue for the industry. Any manufacturing process inevitably encounters abnormal situations, so detecting defects in electronic components on printed circuit boards is crucial to ensuring the outgoing quality of printed circuit board assemblies (PCBAs).
In recent years, deep learning techniques have advanced remarkably and have demonstrated outstanding performance in various industries. The fields of automated optical inspection (AOI) and automated visual inspection (AVI) have also widely adopted these techniques to improve both the defect detection rate and the screening rate of products.
Deep learning techniques used for detecting electronic component defects on printed circuit boards include recognition, detection, segmentation, and anomaly detection. In this study, we apply semantic segmentation to identify and classify the defective regions of electronic components on printed circuit boards.
We modified the Harmonic DenseNet MSEG (HarDNet-MSEG) for defect segmentation and classification in X-ray images of electronic components. The modifications are: i. The decoder is designed in the form of UNet++. It uses feature maps at four levels of resolution, which facilitates stable detection of small defects and more accurate boundaries. ii. The Receptive Field Block (RFB) module is modified by reducing its complex convolutions, which helps capture small-scale features. iii. An attention module is added to the deepest layer of the encoder, allowing the network to suppress unnecessary stimuli and focus more on important features.
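To make "in the form of UNet++" concrete, the sketch below enumerates the nested skip connections of a generic UNet++ with a configurable number of resolution levels. It follows the standard UNet++ wiring, in which node (i, j) at level i receives all earlier nodes on the same level plus the upsampled node from the level below. It is not the thesis's actual decoder, whose details are not given in this record; the function name `unetpp_graph` and the `levels` parameter are hypothetical.

```python
def unetpp_graph(levels):
    """Map each UNet++ node (i, j) to the nodes that feed it.

    i is the resolution level (0 = finest) and j is the position along the
    skip pathway (j = 0 is the encoder column). Generic UNet++ wiring only;
    an illustration, not the modified HarDNet-MSEG decoder.
    """
    graph = {}
    for j in range(levels):
        for i in range(levels - j):
            if j == 0:
                # Encoder node: fed by the downsampled node one level above.
                inputs = [(i - 1, 0)] if i > 0 else []
            else:
                # Decoder node: dense skips from the same level...
                inputs = [(i, k) for k in range(j)]
                # ...plus the upsampled node from the next deeper level.
                inputs.append((i + 1, j - 1))
            graph[(i, j)] = inputs
    return graph


graph = unetpp_graph(levels=4)
decoder_nodes = [node for node in graph if node[1] > 0]
print(len(graph), len(decoder_nodes))  # prints: 10 6
print(graph[(0, 3)])  # prints: [(0, 0), (0, 1), (0, 2), (1, 2)]
```

Because the finest-level output node (0, 3) draws on every intermediate node at its resolution, fine detail reaches the output through several short paths, which is the usual argument for why this layout helps with small structures and boundaries.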
In the experiments, we collected 979 X-ray images of electronic components with defects. These images were divided into a training set of 881 images and a test set of 98 images. During training, the training data were augmented eightfold. On the training set, the original HarDNet-MSEG model achieved a mean intersection over union (MIoU) of 91.53%, recall of 95.21%, and precision of 95.69%; on the test set, it achieved an MIoU of 78.23%, recall of 83.67%, and precision of 92.05%. The modified model improved on all of these figures: on the training set, it achieved an MIoU of 95.27%, recall of 97.86%, and precision of 97.83%; on the test set, it achieved an MIoU of 86.56%, recall of 92.59%, and precision of 93.95%.
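For readers unfamiliar with the three reported metrics, here is a minimal sketch of how MIoU, recall, and precision can be computed from per-pixel class labels. It assumes pixel-level counting with an unweighted mean over classes; the thesis may define recall and precision differently (for example per defect region), so the numbers above should not be expected to follow from exactly this procedure. All function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def _mean(values):
    return sum(values) / len(values)


def segmentation_metrics(cm):
    """Return class-averaged IoU, recall, and precision.

    A class is skipped for a metric whose denominator is zero, so classes
    that never occur do not distort the mean.
    """
    n = len(cm)
    ious, recalls, precisions = [], [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # missed pixels of class c
        fp = sum(cm[r][c] for r in range(n)) - tp   # pixels wrongly called c
        if tp + fp + fn:
            ious.append(tp / (tp + fp + fn))
        if tp + fn:
            recalls.append(tp / (tp + fn))
        if tp + fp:
            precisions.append(tp / (tp + fp))
    return {
        "miou": _mean(ious),
        "recall": _mean(recalls),
        "precision": _mean(precisions),
    }


# Toy example: six pixels, three classes (flattened label maps).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
metrics = segmentation_metrics(confusion_matrix(y_true, y_pred, 3))
print({name: round(value, 4) for name, value in metrics.items()})
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean IoU is 0.5.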

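Keywords (Chinese)

★ Densely connected network (密集連接網路)
★ Harmonic densely connected network (和諧密集連接網路)
★ Semantic segmentation (語意分割)

Keywords (English)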

★ DenseNet
★ HarDNet
★ semantic segmentation

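Table of contents

Abstract (Chinese)
Abstract
Acknowledgements
Contents
List of figures
List of tables
Chapter 1 Introduction
1.1 Motivation and objectives
1.2 System architecture
1.3 Features of the thesis
1.4 Organization of the thesis
Chapter 2 Related work
2.1 Image segmentation
2.2 Receptive field of feature maps
2.3 Attention mechanisms
Chapter 3 Improved semantic segmentation network
3.1 HarDNet-MSEG architecture
3.2 Modifications to the HarDNet-MSEG architecture
3.3 Encoder
3.4 Decoder
3.5 Loss function
3.6 Class-related visualization
Chapter 4 Experiments and results
4.1 Experimental equipment and development environment
4.2 Training of the semantic segmentation network
4.3 Evaluation criteria
4.4 Experimental results
Chapter 5 Conclusions and future work
References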

Advisor

Din-Chang Tseng (曾定章)

Approval date

2023-7-25
