Master's/Doctoral Theses: Detailed Record for 107552005




Name: Cheng-Te Sun (孫承德)   Department: In-service Master Program, Department of Computer Science and Information Engineering
Thesis Title: A Defect Recognition System Integrating Multi-resolution, Deformable Convolution, and Self-attention
(結合多重解析度、可變形卷積、與自我注意力的瑕疵辨識系統)
Related theses
★ Video error concealment for large damaged areas and scene changes
★ Force feedback correction and rendering in a virtual haptic system
★ Multi-spectral satellite image fusion and infrared image synthesis
★ A laparoscopic cholecystectomy simulation system
★ Dynamically loaded multi-resolution terrain modeling for flight simulation
★ Wavelet-based multi-resolution terrain modeling and texture mapping
★ Multi-resolution optical flow analysis and depth computation
★ Volume-preserving deformation modeling for laparoscopic surgery simulation
★ Interactive multi-resolution model editing techniques
★ Wavelet-based multiresolution edge tracking for edge detection
★ Multi-resolution modeling based on quadric error and attribute criteria
★ Progressive image compression based on integer wavelet transform and grey theory
★ Tactical simulation based on dynamically loaded multi-resolution terrain modeling
★ Face detection and feature extraction using spatial relations from multi-level segmentation
★ Wavelet-based image watermarking and compression
★ Appearance-preserving and view-dependent multi-resolution modeling
  1. The electronic full text of this thesis is approved for immediate open access.
  2. Within the open-access terms, the electronic full text is licensed only for personal, non-profit searching, reading, and printing for the purpose of academic research.
  3. Please comply with the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.

Abstract (Chinese) A printed circuit board (PCB) is the supporting substrate for electronic components and is widely used in electronic industrial products such as televisions, mobile phones, computers, and automobiles. Whether a PCB is a good part is generally checked by visual inspection with automated optical inspection (AOI) equipment. However, because the PCB industry demands extremely high yields, problems such as manufacturing process errors, inaccurate template alignment, and overly strict optical algorithm settings easily produce excessive false alarms, requiring substantial manual re-inspection to confirm the real defects, so production capacity cannot be raised effectively.
In this thesis we build a defect recognition system based on CoAtNet, which combines convolutional neural networks (CNNs) with self-attention. We add a feature pyramid network (FPN) module to construct multiple resolutions, fusing low-level scale features with high-level scale features to ease the difficulty of recognizing small-scale defects. We replace part of CoAtNet's convolution operations with deformable convolution network (DCN) modules, which add offsets to the convolution sampling points to strengthen the model's generality under geometric transformations of the samples. We also improve the image pre-processing: i. after image processing, the PCB image and its corresponding golden-board image are concatenated along the channel axis, increasing feature diversity and improving the network's recognition performance; ii. image enhancement strengthens the features of the data so that inconspicuous defects can be detected; iii. data augmentation increases sample diversity to avoid overfitting.
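The FPN-style multi-resolution fusion described above can be sketched in a few lines of NumPy; the array shapes and the single 1x1 lateral weight matrix are illustrative assumptions, not the thesis's implementation:

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def fpn_merge(low, top, lateral_w):
    """One top-down FPN merge step: project the low-level map with a 1x1
    'lateral' convolution (a matrix acting on the channel axis), then add
    the 2x-upsampled higher-level map, fusing fine and coarse scales."""
    lateral = np.tensordot(lateral_w, low, axes=([1], [0]))  # (C_out, H, W)
    return lateral + upsample2x(top)

# Toy example: a (2, 4, 4) low-level map merged with a (1, 2, 2) top map.
merged = fpn_merge(np.ones((2, 4, 4)), np.ones((1, 2, 2)), np.ones((1, 2)))
```

In FPN proper this merge is repeated top-down across all pyramid levels, typically followed by a 3x3 smoothing convolution at each level.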
In the experiments we compare the effect of each pre-processing method on the system, analyze the benefit of the defect recognition system before and after adding the improvement modules, and compare four optimizers (SGD, RMSprop, Adam, and AdamW) and four learning-rate schedules (StepLR, CosineAnnealingLR, ExponentialLR, and ReduceLROnPlateau). The final improved network architecture reaches a recall of 98.5642% and a precision of 98.7558% on the test set, improving recall by 1.5% and precision by 0.65% over the original architecture without adding much inference time. Under the premise of low miss rates, it effectively alleviates the production delays caused by excessive AOI false alarms.
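Of the learning-rate schedules compared, CosineAnnealingLR admits a one-line closed form; a minimal sketch follows, where the step counts and rates are illustrative, not the thesis's training settings:

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """CosineAnnealingLR: decay the learning rate from lr_max at step 0
    to lr_min at total_steps along a half-cosine curve."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * step / total_steps))

# Example: a 100-step schedule starting at 0.1.
schedule = [cosine_annealing_lr(t, 100, 0.1) for t in range(101)]
```

The cosine curve decays slowly at first, fastest mid-training, and flattens near the end, which is one reason it is a common default among the schedules listed.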
Abstract (English) Printed circuit boards (PCBs) are the supporting substrate for electronic components and are widely used in electronic industrial products such as televisions, cell phones, computers, and cars. Whether a PCB is an acceptable product is generally checked by visual inspection with automated optical inspection (AOI) equipment. However, the PCB industry has extremely high yield requirements, and inspection is easily affected by manufacturing process errors, template alignment errors, and overly strict optical algorithm settings. These produce many false alarms and require considerable manual effort to verify the real defects, which makes it difficult to improve production capacity.
In this thesis, we construct a defect recognition system based on CoAtNet, which integrates convolutional neural networks and self-attention. We add a feature pyramid network to establish multiple resolutions, fusing low-level scale features with high-level scale features to improve the recognition of small-scale defects. We replace part of the convolution operations in CoAtNet with deformable convolution network modules, which add learned offsets to the convolution sampling points to enhance the model's generality under geometric transformations of the samples. We also improve the image pre-processing, including: i. concatenating the PCB image with its corresponding golden-board image along the channel axis, which increases feature diversity and improves recognition; ii. applying image enhancement to strengthen the features of the data so that inconspicuous defects can be detected; iii. using data augmentation to increase sample diversity and avoid overfitting.
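The deformable convolution described above samples each kernel tap at a learned fractional offset, using bilinear interpolation between pixels. A minimal single-channel NumPy sketch of one 3x3 output value follows; the function names and toy setting are illustrative, not the thesis's code:

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Bilinearly sample a 2-D feature map at fractional coordinates (y, x);
    out-of-bounds neighbours contribute zero."""
    H, W = feat.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    val = 0.0
    for yy, wy in ((y0, 1 - (y - y0)), (y0 + 1, y - y0)):
        for xx, wx in ((x0, 1 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= yy < H and 0 <= xx < W:
                val += wy * wx * feat[yy, xx]
    return val

def deform_conv_at(feat, weight, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx):
    each kernel tap is shifted by its learned (dy, dx) offset before sampling."""
    out, k = 0.0, 0
    for i in (-1, 0, 1):
        for j in (-1, 0, 1):
            dy, dx = offsets[k]
            out += weight[i + 1, j + 1] * bilinear_sample(feat, cy + i + dy, cx + j + dx)
            k += 1
    return out
```

With all offsets at zero this reduces exactly to a plain 3x3 convolution; non-zero offsets let the sampling grid deform to follow irregular defect shapes.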
In the experiments, we compared the impact of each pre-processing method on the system, analyzed the benefit of the defect recognition system before and after integrating each module, and evaluated four optimizers (SGD, RMSprop, Adam, and AdamW) and four learning-rate schedules (StepLR, CosineAnnealingLR, ExponentialLR, and ReduceLROnPlateau). The final improved network architecture reached a recall of 98.5642% and a precision of 98.7558%. Compared with the original architecture, recall increased by 1.5% and precision by 0.65% without much additional inference time. Under the premise of low miss rates, the system effectively alleviates the excessive false alarms from AOI inspection and avoids the resulting production delays.
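The recall and precision quoted here follow the standard definitions TP/(TP+FN) and TP/(TP+FP); a small self-contained helper (hypothetical names, binary defect/ok labels assumed) shows the computation:

```python
def precision_recall(y_true, y_pred, positive="defect"):
    """Precision and recall for a binary defect/ok labelling.

    precision = TP / (TP + FP): fraction of flagged parts that are real defects.
    recall    = TP / (TP + FN): fraction of real defects that were flagged.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In the AOI setting, high recall corresponds to a low miss rate for real defects, while high precision corresponds to few false alarms sent to manual re-inspection.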
Keywords (Chinese) ★ deep learning
★ defect detection
★ convolutional neural network
★ self-attention mechanism
★ multi-resolution
★ deformable convolution
Keywords (English) ★ deep learning
★ defect detection
★ convolution neural network
★ self-attention
★ multi-resolution
★ deformable convolution
Contents: Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 System architecture
1.3 System features
1.4 Thesis organization
Chapter 2 Related Work
2.1 CNN-based recognition systems
2.2 Lightweight convolutional neural networks
2.3 Attention mechanisms for convolutional neural networks
Chapter 3 PCB Defect Recognition Network Architecture
3.1 The CoAtNet architecture
3.2 The CoAtNet architecture applied to PCB inspection
Chapter 4 Experiments
4.1 Equipment and environment
4.2 Training the convolutional recognition network
4.3 Comparison and evaluation of the convolutional recognition networks
Chapter 5 Conclusions and Future Work
References
References [1] Z. Dai, H. Liu, Q. V. Le, and M. Tan, “CoAtNet: marrying convolution and attention for all data sizes,” arXiv:2106.04803.
[2] S. M. Pizer, E. P. Amburn, J. D. Austin, R. Cromartie, A. Geselowitz, T. Greer, B. H. Romeny, J. B. Zimmerman, and K. Zuiderveld, “Adaptive histogram equalization and its variations,” Computer Vision, Graphics, and Image Processing, vol.39, no.3, pp.355-368, 1987.
[3] E. D. Cubuk, B. Zoph, D. Mané, V. Vasudevan, and Q. V. Le, “AutoAugment: learning augmentation strategies from data,” arXiv:1805.09501v3.
[4] E. D. Cubuk, B. Zoph, J. Shlens, and Q. V. Le, “RandAugment: practical automated data augmentation with a reduced search space,” arXiv:1909.13719v2.
[5] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol.86, no.11, pp.2278-2324, Nov. 1998.
[6] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Proc. Neural Information Processing Systems (NIPS), Lake Tahoe, Nevada, Dec.3-8, 2012, pp.1097-1105.
[7] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv:1409.1556.
[8] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proc. IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR), Boston, MA, Jun.7-12, 2015, pp.1-9.
[9] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, Jun.27-30, 2016, pp.770-778.
[10] M. Lin, Q. Chen, and S. Yan, “Network in network,” in Proc. Int. Conf. Learn. Represent (ICLR), Banff, Canada, Apr.14-16, 2014, pp.274-278.
[11] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size,” arXiv:1602.07360.
[12] A. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “MobileNets: efficient convolutional neural networks for mobile vision applications,” arXiv:1704.04861.
[13] F. Chollet, “Xception: deep learning with depthwise separable convolutions,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Honolulu, Hawaii, Jul.22-25, 2017, pp.1800-1807.
[14] X. Zhang, X. Zhou, M. Lin, and J. Sun, “ShuffleNet: an extremely efficient convolutional neural network for mobile devices,” arXiv:1707.01083.
[15] G. Huang, Z. Liu, L. V. D. Maaten, and K. Q. Weinberger, “Densely connected convolutional networks,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Honolulu, Hawaii, Jul.22-25, 2017, pp.4700-4708.
[16] M. Guo, T. Xu, J. Liu, Z. Liu, P. Jiang, T. Mu, S. Zhang, R. R. Martin, M. Cheng, and S. Hu, “Attention mechanisms in computer vision: a survey,” arXiv:2111.07624.
[17] J. Hu, L. Shen, and G. Sun, “Squeeze-and-excitation networks,” arXiv:1709.01507v4.
[18] V. Mnih, N. Heess, A. Graves, and K. Kavukcuoglu, "Recurrent models of visual attention," arXiv:1406.6247.
[19] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in Proc. Neural Information Processing Systems (NIPS), Long Beach, CA, Dec.4-9, 2017, pp.5998-6008.
[20] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,” in Proc. Int. Conf. Learn. Represent (ICLR), Vienna, Austria, May.3-7, 2021, pp.1-21.
[21] S. Woo, J. Park, J. Lee, and I. S. Kweon, “CBAM: convolutional block attention module,” arXiv:1807.06521v2.
[22] J. Li, J. Wang, Q. Tian, W. Gao, and S. Zhang, “Global-local temporal representations for video person re-identification,” in Proc. of IEEE/CVF Int. Conf. on Computer Vision (ICCV), Seoul, Korea, Oct.27-Nov.2, 2019, pp.3958-3967.
[23] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp.1735-1780, 1997.
[24] Y. Fu, X. Wang, Y. Wei, and T. Huang, “Sta: Spatial temporal attention for large-scale video-based person reidentification,” in Proc. of AAAI Conf. on Artificial Intelligence, Honolulu, Hawaii, Jan.27-Feb.1, 2019, vol.33, pp.8287-8294.
[25] X. Li, W. Wang, X. Hu, and J. Yang, “Selective kernel networks,” in Proc. of IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, Jun.16- Jun.20, 2019, pp.510-519.
[26] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.C. Chen, “MobileNetV2: inverted residuals and linear bottlenecks,” arXiv:1801.04381.
[27] D. Hendrycks and K. Gimpel, “Gaussian error linear units (GELUs),” arXiv:1606.08415.
[28] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, “Feature pyramid networks for object detection,” arXiv:1612.03144.
[29] J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and J. Yang, “Deformable convolutional networks,” in Proc. of IEEE/CVF Int. Conf. on Computer Vision (ICCV), Venice, Italy, Oct.22-29, 2017, pp.764-773.
[30] X. Zhu, H. Hu, S. Lin, and J. Dai, “Deformable ConvNets v2: more deformable, better results,” in Proc. of IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, Jun.16-20, 2019, pp.9308-9316.
[31] E. H. Adelson, C. H. Anderson, J. R. Bergen, P. J. Burt, and J. M. Ogden, “Pyramid methods in image processing,” RCA engineer, vol.29, no.6, pp.33-41, 1984.
[32] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Fu, and A. C. Berg, “SSD: single shot MultiBox detector,” arXiv:1512.02325.
[33] D. G. Lowe, “Object recognition from local scale-invariant features,” in Proc. of IEEE Int. Conf. on Computer Vision (ICCV), Kerkyra, Greece, Sep.20-25, 1999, pp.1150-1157.
[34] Y. L. Boureau, J. Ponce, and Y. LeCun, “A theoretical analysis of feature pooling in visual recognition,” in Proc. International Conference on Machine Learning (ICML), Haifa, Israel, Jun.21-Jun.24, 2010, pp.111-118.
[35] Q. Li, S. Jin, and J. Yan, “Mimicking very efficient network for object detection,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Honolulu, Hawaii, Jul.22-25, 2017, pp.6356-6364.
[36] Z. Zhang and M. R. Sabuncu, “Generalized cross entropy loss for training deep neural networks with noisy labels,” in Proc. of Neural Information Processing Systems (NIPS), Palais des Congrès de Montréal, Montréal, Dec.2-8, 2018, pp.8778-8788.
[37] I. Loshchilov and F. Hutter, “SGDR: Stochastic gradient descent with warm restarts,” arXiv:1608.03983.
[38] D. P. Kingma and J. L. Ba, “Adam: a method for stochastic optimization,” arXiv:1412.6980.
[39] I. Loshchilov and F. Hutter, “Decoupled weight decay regularization,” arXiv:1711.05101.
Advisor: Din-Chang Tseng (曾定章)   Date of Approval: 2022-08-30
