Thesis Record 110522071: Detailed Information




Author: Pei-Sheng Guo (郭佩昇)   Department: Computer Science and Information Engineering
Title: 以模板比對方式搜尋電子元件影像物件的深度學習系統
(A deep learning system for searching targets in electronic component images with template matching style)
Related Theses
★ Video error concealment for large damaged areas and scene changes
★ Force feedback correction and rendering in a virtual haptic system
★ Multispectral satellite image fusion and infrared image synthesis
★ A laparoscopic cholecystectomy surgery simulation system
★ Dynamically loaded multiresolution terrain modeling in a flight simulation system
★ Wavelet-based multiresolution terrain modeling and texture mapping
★ Multiresolution optical flow analysis and depth computation
★ Volume-preserving deformation modeling for laparoscopic surgery simulation
★ Interactive multiresolution model editing techniques
★ Wavelet-based multiresolution edge tracking for edge detection
★ Multiresolution modeling based on quadric error and attribute criteria
★ Progressive image compression based on integer wavelet transform and grey theory
★ Tactical simulation built on dynamically loaded multiresolution terrain modeling
★ Face detection and feature extraction using spatial relations of multilevel segmentation
★ Wavelet-based image watermarking and compression
★ Appearance-preserving and view-dependent multiresolution modeling
Full Text: viewable in the repository system after 2028-07-31 (embargoed)
Abstract (Chinese) Template matching is an important technique in traditional computer vision, but it is affected by many sources of variation in the searched object; for example, differences in size, shape, and color between the template and the searched object severely degrade search performance. In recent years, deep learning has been applied ever more widely in computer vision and has achieved very good results in many areas, such as recognition, detection, and segmentation. This study therefore combines deep learning's ability to extract high-level features with the template matching approach to build a “template matching deep learning” technique that is unaffected by the factors above.
Template matching differs from object detection in that an object detection model searches an image for features similar to those it learned during training; the user cannot supply an arbitrary object for the model to find, and searching for a new object requires retraining the model. Template matching, in contrast, lets the user supply an arbitrary template image and then judges whether objects in the search image are similar to the object in the template, allowing the user to search for any specific object of interest.
The goal of this study is to use a deep learning network architecture to find regions with specific shape features. We modified the single-object tracking network SiamCAR into a multi-object template matching network. The modifications include: i. letting the network adjust dynamically to the size of the input images, making the input format more flexible; ii. simplifying and optimizing the feature extraction subnetwork, which improves both accuracy and speed; iii. using smaller feature maps, without upsampling, for the matching prediction, further increasing speed; iv. adding data augmentation that deliberately creates differences between the template and search images, so the network can learn a greater variety of variations.
In the experiments, we trained and tested on objects in printed circuit board images: 11,525 image pairs in total, with 9,267 pairs in the training set and 2,258 pairs in the testing set. On the testing set, the final improved matching network achieved a recall of 96.74%, a precision of 94.06%, and an F-score of 95.38%; its speed improved to 53 ms per image, roughly 9 times faster than the original 512 ms per image.
Abstract (English) Template matching is an important technique in traditional computer vision. However, the technique is sensitive to many sources of variation in the searched object; for example, variations in size, shape, and color between the template and the search object can severely degrade search performance. In recent years, deep learning has become increasingly prevalent in computer vision and has achieved significant results in fields such as recognition, detection, and segmentation. This study therefore combines deep learning's ability to extract high-level features with the template matching approach to develop a “template matching deep learning” technique that is robust to the aforementioned factors.
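For contrast, the snippet below is a minimal sketch of the classical technique using OpenCV's normalized cross-correlation; the file names and the 0.8 score threshold are illustrative assumptions, not part of this work. It shows exactly the weakness described above: the score tolerates uniform brightness changes but drops sharply under size, shape, or color variation.

    import cv2
    import numpy as np

    # Hypothetical inputs: a PCB photograph and a cropped component template.
    search = cv2.imread("board.png", cv2.IMREAD_GRAYSCALE)
    template = cv2.imread("component.png", cv2.IMREAD_GRAYSCALE)

    # Slide the template over the search image; TM_CCOEFF_NORMED scores
    # each position with normalized cross-correlation in [-1, 1].
    scores = cv2.matchTemplate(search, template, cv2.TM_CCOEFF_NORMED)

    # Keep every position above a threshold to allow multiple matches.
    h, w = template.shape
    ys, xs = np.where(scores >= 0.8)
    for x, y in zip(xs, ys):
        cv2.rectangle(search, (x, y), (x + w, y + h), 255, 1)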
Template matching and object detection differ in that an object detection model searches an image for features similar to those it learned during training; users cannot hand the model an arbitrary object to search for, and finding a new object requires retraining. Template matching, on the other hand, allows users to provide any template image, and the system determines whether objects in the search image are similar to the object in the template image. This enables users to search for the specific objects they are looking for.
The objective of this study is to use a deep learning network architecture to find regions with specific shape features. We modified the single-object tracking network SiamCAR to perform multi-object template matching. Our modifications include: i. making the network adjust dynamically to the input image size, giving the network a more flexible input format; ii. simplifying and optimizing the feature extraction subnetwork, which improves both accuracy and speed; iii. using smaller feature maps for prediction instead of upsampling, further improving speed; iv. incorporating data augmentation that deliberately creates differences between template and search images, so the network can learn a greater variety of variations.
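SiamCAR-style trackers compare the two branches with depthwise cross-correlation: the template's feature map is slid over the search image's feature map as a per-channel kernel. The following is a minimal PyTorch sketch of that operation under assumed feature shapes (not the thesis code); because it is fully convolutional, it accepts any search-image size, which is the property modification i exploits.

    import torch
    import torch.nn.functional as F

    def xcorr_depthwise(search_feat, template_feat):
        # search_feat: (B, C, Hs, Ws); template_feat: (B, C, Ht, Wt).
        # Fold the batch into the channel axis so each sample/channel
        # pair is correlated with its own kernel via a grouped conv.
        b, c, h, w = search_feat.shape
        x = search_feat.reshape(1, b * c, h, w)
        kernel = template_feat.reshape(b * c, 1, *template_feat.shape[2:])
        out = F.conv2d(x, kernel, groups=b * c)
        return out.reshape(b, c, out.shape[2], out.shape[3])

    # Assumed shapes: backbone features of a template crop and a larger
    # search image; the response map is (B, C, Hs-Ht+1, Ws-Wt+1).
    template_feat = torch.randn(2, 256, 7, 7)
    search_feat = torch.randn(2, 256, 31, 31)
    response = xcorr_depthwise(search_feat, template_feat)  # (2, 256, 25, 25)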
In the experiments, we trained and tested our model using objects on printed circuit boards (PCBs). The dataset consisted of 11,525 image pairs in total, with 9,267 pairs in the training set and 2,258 pairs in the testing set. The improved matching network achieved a recall of 96.74%, a precision of 94.06%, and an F-score of 95.38% on the testing set. Furthermore, the speed improved to 53 ms/img, approximately 9 times faster than the original 512 ms/img.
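As a sanity check on the reported numbers, the F-score is the harmonic mean of precision and recall; the snippet below reproduces the 95.38% figure from the other two values (the underlying true/false positive counts are not given in this record).

    # Reported test-set precision and recall.
    precision = 0.9406
    recall = 0.9674

    # F-score = harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    print(f"{f_score:.2%}")  # 95.38%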
Keywords
★ template matching
★ deep learning
★ siamese network
★ computer vision
Table of Contents
Abstract (Chinese) ..................... iii
Abstract (English) ..................... iv
Acknowledgments ........................ vi
Contents ............................... vii
List of Figures ........................ viii
List of Tables ......................... x
Chapter 1 Introduction ................. 1
1.1 Research Motivation and Objectives . 1
1.2 System Architecture ................ 2
1.3 System Features .................... 3
1.4 Thesis Organization ................ 4
Chapter 2 Related Work ................. 5
2.1 Image Matching ..................... 5
2.2 Siamese Networks ................... 7
2.3 Single-Object Tracking Networks .... 11
Chapter 3 Siamese Matching Network ..... 17
3.1 SiamCAR Network Architecture ....... 17
3.2 Modifications Based on the SiamCAR Network ... 30
3.3 Preparation of Template and Search Images .... 35
Chapter 4 Experiments .................. 37
4.1 Experimental Equipment and Development Environment ... 37
4.2 Training of the Matching Network ... 37
4.3 Evaluation Criteria ................ 39
4.4 Experimental Results ............... 41
Chapter 5 Conclusions and Future Work .. 49
References ............................. 50
References
[1] D. Guo, J. Wang, Y. Cui, Z. Wang, and S. Chen, “SiamCAR: siamese fully convolutional classification and regression for visual tracking,” arXiv:1911.07241v2.
[2] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” arXiv:1512.03385.
[3] M. Sun, J. Xiao, E. G. Lim, B. Zhang, and Y. Zhao, “Fast template matching and update for video object tracking and segmentation,” arXiv:2004.07538.
[4] A. Karami, M. Naseri, and M. Ehsanpour, “A novel approach to object detection using color and edge template matching,” Journal of Real-Time Image Processing, vol.14, no.4, pp.831-839, 2017.
[5] A. V. Aswathy, “Template matching based vehicle license plate recognition,” in Proc. Int. Conf. on Recent Advances in Energy-efficient Computing and Communication (ICRAECC), Nagercoil, India, Mar.7-8, 2019, pp.1-5.
[6] Y. Su and R. A. Robb, “Seed image reconstruction using a template matching technique,” in Proc. Conf. on Medical Imaging: Image Processing, San Diego, CA, Feb.12-17, 2005, pp.1038-1045.
[7] A. V. Ceguerra and I. Koprinska, “Integrating local and global features in automatic fingerprint verification,” in Proc. Int. Conf. on Pattern Recognition (ICPR), Quebec City, Canada, Aug.11-15, 2002, pp.347-350.
[8] M. B. Hisham, S. N. Yaakob, R. A. A. Raof, A. A. Nazren, and N. M. Wafi, “Template matching using sum of squared difference and normalized cross correlation,” in Proc. IEEE Student Conf. on Research and Development (SCOReD), Kuala Lumpur, Malaysia, Dec.13-14, 2015, pp.100-104.
[9] D. G. Lowe, “Object recognition from local scale-invariant features,” in Proc. IEEE Int. Conf. on Computer Vision (ICCV), Kerkyra, Greece, Sep.20-27, 1999, pp.1150-1157.
[10] W. Treible, P. Saponaro, and C. Kambhamettu, “Wildcat: in-the-wild color-and-thermal patch comparison with deep residual pseudo-siamese networks,” in Proc. IEEE Int. Conf. on Image Processing (ICIP), Taipei, Taiwan, Sep.22-25, 2019, pp.1307-1311.
[11] J. Bromley, I. Guyon, Y. LeCun, E. Säckinger, and R. Shah, “Signature verification using a “Siamese” time delay neural network,” in Proc. of Neural Information Processing Systems (NIPS), Denver, Colorado, Nov.29 - Dec.2, 1993, pp.737-744.
[12] V. Nair and G. E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” in Proc. of 27th Int. Conf. on Machine Learning (ICML), Haifa, Israel, Jun.21-24, 2010, pp.807-814.
[13] R. Hadsell, S. Chopra, and Y. LeCun, “Dimensionality reduction by learning an invariant mapping,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), New York, NY, Jun.17-22, 2006, pp.1735-1742.
[14] E. Hoffer and N. Ailon, “Deep metric learning using triplet network,” arXiv:1412.6622v4.
[15] F. Schroff, D. Kalenichenko, and J. Philbin, “Facenet: a unified embedding for face recognition and clustering,” arXiv:1503.03832v3.
[16] R. Han, W. Feng, J. Zhao, Z. Niu, Y. Zhang, L. Wan, and S. Wang, “Complementary-view multiple human tracking,” in Proc. AAAI Conf. on Artificial Intelligence, New York, NY, Feb.7-12, 2020, pp.10917-10924.
[17] A. Ess, K. Schindler, B. Leibe, and L. Van Gool, “Object detection and tracking for autonomous navigation in dynamic environments,” The International Journal of Robotics Research, vol.29, no.14, pp.1707-1725, 2010.
[18] J. Ciberlin, R. Grbic, N. Teslić, and M. Pilipović, “Object detection and object tracking in front of the vehicle using front view camera,” in Proc. IEEE Conf. on Zooming Innovation in Consumer Technologies (ZINC), Novi Sad, Serbia, May 29-30, 2019, pp.27-32.
[19] C. Luo, X. Yang, and A. Yuille, “Exploring simple 3d multi-object tracking for autonomous driving,” arXiv:2108.10312.
[20] Q. Abdullah, N. S. M. Shah, M. Mohamad, M. H. K. Ali, N. Farah, A. Salh, M. Aboali, M. A. H. Mohamad, and A. Saif, “Real-time autonomous robot for object tracking using vision system,” arXiv:2105.00852.
[21] D. S. Bolme, J. R. Beveridge, B. A. Draper, and Y. M. Lui, “Visual object tracking using adaptive correlation filters,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, Jun.13-18, 2010, pp.2544-2550.
[22] R. Tao, E. Gavves, and A. W. Smeulders, “Siamese instance search for tracking,” arXiv:1605.05863.
[23] L. Bertinetto, J. Valmadre, J. F. Henriques, A. Vedaldi, and P. H. S. Torr, “Fully-convolutional siamese networks for object tracking,” arXiv:1606.09549v3.
[24] B. Li, J. Yan, W. Wu, Z. Zhu, and X. Hu, “High performance visual tracking with siamese region proposal network,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, Jun.18-23, 2018, pp.8971-8980.
[25] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: towards real-time object detection with region proposal networks,” arXiv:1506.01497v3.
[26] Z. Zhu, Q. Wang, B. Li, W. Wu, J. Yan, and W. Hu, “Distractor-aware siamese networks for visual object tracking,” arXiv:1808.06048.
[27] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Proc. of Neural Information Processing Systems (NIPS), Lake Tahoe, NV, Dec.3-8, 2012, pp.1-9.
[28] Z. Zhang and H. Peng, “Deeper and wider siamese networks for real-time visual tracking,” arXiv:1901.01660v3.
[29] B. Li, W. Wu, Q. Wang, F. Zhang, J. Xing, and J. Yan, “SiamRPN++: evolution of siamese visual tracking with very deep networks,” arXiv:1812.11703.
[30] K. He, G. Gkioxari, P. Dollar, and R. Girshick, “Mask R-CNN,” arXiv:1703.06870v3.
[31] Q. Wang, L. Zhang, L. Bertinetto, W. Hu, and P. H. S. Torr, “Fast online object tracking and segmentation: a unifying approach,” arXiv:1812.05050v2.
[32] Z. Tian, C. Shen, H. Chen, and T. He, “FCOS: fully convolutional one-stage object detection,” arXiv:1904.01355v5.
[33] J. Yu, Y. Jiang, Z. Wang, Z. Cao, and T. Huang, “UnitBox: an advanced object detection network,” arXiv:1608.01471.
[34] J. Hu, L. Shen, S. Albanie, G. Sun, and E. Wu, “Squeeze-and-excitation networks,” arXiv:1709.01507v4.
[35] S. Woo, J. Park, J.-Y. Lee, and I. Kweon, “CBAM: convolutional block attention module,” arXiv:1807.06521v2.
[36] J. Fu, J. Liu, H. Tian, Y. Li, Y. Bao, Z. Fang, and H. Lu, “Dual attention network for scene segmentation,” arXiv:1809.02983v4.
Advisor: Din-Chang Tseng (曾定章)   Approval Date: 2023-07-25