Graduate Thesis 107521088: Detailed Record




Name: Kuan-Ta Lee (李冠達)   Department: Department of Electrical Engineering
Thesis Title: Simulation of Deep Learning Features Used in Barcode Detection (模擬深度學習特徵進行條碼偵測)
Related Theses
★ Investigation of independent component analysis for sound signal separation in real environments
★ Segmentation and three-dimensional gray-level interpolation of oral MRI images
★ Design of a digital asthma peak-flow monitoring system
★ Effects of combining cochlear implants with hearing aids on Mandarin speech recognition
★ Simulation of Mandarin speech recognition with advanced combined coding strategies for cochlear implants: an analysis combined with hearing aids
★ A functional MRI study of the neural correlates of Mandarin speech production
★ Construction of a 3D biomechanical tongue model using the finite element method
★ Construction of an MRI-based three-dimensional tongue atlas
★ A simulation study of the relationship between calcium oxalate concentration changes in renal tubules and calcium oxalate stone formation
★ Automatic segmentation of tongue structures in oral MRI images
★ A study of the electrical matching of microwave output windows
★ Development of a software-based hearing aid simulation platform: noise reduction
★ Development of a software-based hearing aid simulation platform: feedback cancellation
★ Simulating the effects of cochlear implant channel number, stimulation rate, and binaural hearing on Mandarin speech recognition in noise
★ Studying the neural correlates of Mandarin tone production with artificial neural networks
★ Construction of computer-simulated physiological systems for teaching
  1. The electronic full text of this thesis is approved for immediate open access.
  2. The open-access electronic full text is licensed to users only for personal, non-commercial retrieval, reading, and printing for the purpose of academic research.
  3. Please comply with the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast it without authorization.

Abstract (Chinese)  Barcodes appear everywhere in everyday life. Different fields design barcodes that fit their own products and needs, so a great many barcode types exist, and it is not easy to detect all of them with a single method. Because deep learning has made large strides in object detection in recent years, the goal of this study is to locate barcodes with deep learning.
Since the required system only needs to locate each barcode as a region of interest (ROI) without classifying it, and must run fast enough, the smaller and faster YOLOv3-tiny network architecture was adopted. Training images were collected with the professional 1504P scanner, 10,008 images in total, and split 8:2 into training and validation data; CipherLab (欣技資訊) provided a further 133 test images. On both the validation and test data the recall reached 95%, while the precision was 93% and 75%, respectively.
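For reference, the following is a minimal Python sketch (not the thesis code) of how detection-level recall and precision of this kind can be computed: predicted boxes are greedily matched to ground-truth boxes, and the IoU threshold of 0.5 is an assumption, since the record does not state the exact matching criterion.

# Minimal sketch of detection recall/precision, assuming greedy IoU matching at 0.5.
def iou(a, b):
    # Intersection over union of two boxes given as (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if inter else 0.0

def recall_precision(gt_boxes, pred_boxes, iou_thr=0.5):
    # Each ground-truth box may be matched to at most one prediction.
    matched, tp = set(), 0
    for g in gt_boxes:
        best, best_iou = None, iou_thr
        for i, p in enumerate(pred_boxes):
            if i not in matched and iou(g, p) >= best_iou:
                best, best_iou = i, iou(g, p)
        if best is not None:
            matched.add(best)
            tp += 1
    recall = tp / len(gt_boxes) if gt_boxes else 1.0
    precision = tp / len(pred_boxes) if pred_boxes else 1.0
    return recall, precision

# Example: one ground-truth barcode, one correct and one spurious prediction.
print(recall_precision([(10, 10, 110, 60)], [(12, 8, 108, 62), (200, 200, 240, 240)]))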
To run on a personal digital assistant (PDA) with limited processing power, we first tried pruning the network, but the results were poor. We then analyzed the network and used image processing to imitate its behavior. Visualizing the network revealed only a rough picture of its processing, so we finally tried to imitate the important feature maps and proposed three methods. The first method finds candidate barcode regions, locates barcode centers with a 5×5 mask, and then frames each barcode with an active contour; the framed barcode is the ROI. The second method follows the same region-finding and ROI-framing steps but uses a smaller 3×3 mask to locate the barcode centers. The third method runs the second method and then further filters the ROIs to reduce the number of falsely detected regions. Tested on the CipherLab test set, the three methods achieved recalls of 83%, 92%, and 91%, precisions of 81%, 46%, and 79%, and execution times on the PDA of 118 ms, 84 ms, and 156 ms, respectively. Compared with other methods our accuracy is roughly average, but the execution speed is a major advantage, so our algorithms are competitive in terms of speed.
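To give the mask-based center search of methods one and two a concrete shape, here is a rough NumPy/OpenCV sketch under stated assumptions: it builds a hand-crafted gradient-difference response instead of the imitated YOLOv3-tiny feature maps that the thesis actually works on, smooths it with a small square mask (5×5 or 3×3), and takes local maxima as candidate barcode centers. The response, the threshold value, and the omission of the final active-contour framing are all simplifications for illustration.

# Rough sketch only: a hand-crafted response map stands in for the imitated feature maps.
import cv2
import numpy as np

def barcode_centers(gray, mask_size=5, min_response=40):
    # gray: single-channel uint8 image, e.g. cv2.imread(path, cv2.IMREAD_GRAYSCALE).
    # Barcode areas have strong gradients along one axis, so the difference of
    # horizontal and vertical gradient magnitudes highlights them.
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    response = cv2.blur(np.abs(gx) - np.abs(gy), (mask_size, mask_size))
    # Local maxima of the smoothed response serve as candidate barcode centers.
    dilated = cv2.dilate(response, np.ones((mask_size, mask_size), np.uint8))
    peaks = (response >= dilated) & (response > min_response)
    ys, xs = np.nonzero(peaks)
    return list(zip(xs.tolist(), ys.tolist()))

Switching mask_size between 5 and 3 mirrors the difference between the first and second methods; in the thesis each center is then refined into an ROI with an active contour and, in the third method, filtered further.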
Abstract (English)  Barcodes are ubiquitous in modern life. Different types of barcodes are designed for different applications, so it is not easy to detect all types of barcodes with a single approach. In recent years deep learning has achieved significant progress in object detection, so this research aims to locate barcodes using deep learning.
Because the system only needs to locate each barcode as a region of interest (ROI) without recognizing its type, and must execute efficiently, the simple and fast YOLOv3-tiny network was chosen. The training images were captured with the professional 1504P scanner; 10,008 images were collected and divided into training and validation data at an 8:2 ratio. A further 133 test images were provided by CipherLab. On both the validation and test data the recall reached 95%, while the precision was 93% and 75%, respectively.
To run the network on a resource-limited personal digital assistant (PDA), we first tried pruning it, but the performance was poor. Hence we analyzed the network structure and used image processing techniques to imitate its behavior. Visualizing the network revealed only a coarse picture of its processing, so we finally tried to imitate some of the important feature maps with three methods. The first method searches for barcode candidates, locates barcode centers with a 5×5 mask, and then uses an active contour to frame each barcode as an ROI. The second method follows the same candidate-finding and ROI-framing steps but uses a smaller 3×3 mask to locate the barcode centers. The third method extends the second with an additional filtering stage that removes erroneously detected ROIs. Evaluated on the CipherLab test data, the three methods achieved recalls of 83%, 92%, and 91%, precisions of 81%, 46%, and 79%, and execution times on the PDA of 118 ms, 84 ms, and 156 ms, respectively. Their recall and precision are comparable to those of other studies, but the running time improves significantly, so our algorithms are competitive in execution speed.
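Since the record does not include the deployment code, the sketch below only illustrates one common way to run a trained YOLOv3-tiny detector with OpenCV's DNN module and collect barcode ROIs; the file names are hypothetical placeholders, the thresholds are assumptions, and this is not the thesis implementation on the PDA.

# Hedged sketch: running a Darknet YOLOv3-tiny model through OpenCV DNN.
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3-tiny-barcode.cfg",       # hypothetical paths
                                 "yolov3-tiny-barcode.weights")

def detect_rois(image, conf_thr=0.25, nms_thr=0.45):
    h, w = image.shape[:2]
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())
    boxes, scores = [], []
    for out in outputs:
        for det in out:                      # det = [cx, cy, bw, bh, objectness, class scores...]
            score = float(det[4] * det[5:].max())
            if score < conf_thr:
                continue
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            scores.append(score)
    keep = cv2.dnn.NMSBoxes(boxes, scores, conf_thr, nms_thr)
    return [boxes[i] for i in np.array(keep).flatten()]   # ROIs as (x, y, w, h)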
Keywords (Chinese) ★ 物件偵測 (object detection)
★ YOLO
★ 條碼定位 (barcode localization)
★ 感興趣區域 (region of interest)
Keywords (English) ★ Object detection
★ YOLO
★ Barcode localization
★ Region of Interest
Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1  Introduction
1.1 Motivation
1.2 Literature Review
1.2.1 Barcode Localization
1.2.2 Object Detection
1.2.3 Pruning
1.3 Objectives and Contributions
1.4 Thesis Organization
Chapter 2  Background and Related Theory
2.1 Object Detection
2.2 Two-Stage Detectors
2.2.1 R-CNN (Region-based Convolutional Neural Networks)
2.2.2 Fast R-CNN
2.2.3 Faster R-CNN
2.3 One-Stage Detectors
2.3.1 YOLOv1
2.3.2 YOLOv2
2.3.3 YOLOv3
2.4 Evaluation Metrics
2.5 Summary
Chapter 3  Object Detection Architecture and Simulation
3.1 Detection Architecture and Training
3.1.1 YOLOv3-tiny
3.1.2 Data Collection Equipment
3.1.3 Data Collection
3.1.4 Training Hardware
3.1.5 Training Procedure and Results
3.2 Model Compression
3.3 Analyzing YOLOv3-tiny
3.3.1 Grad-CAM (Gradient-weighted Class Activation Mapping)
3.4 Simulating YOLOv3-tiny
3.4.1 Identifying Important Feature Maps
3.4.2 Finding Barcode Regions
3.4.3 Finding Barcode Centers
3.4.4 Framing Barcodes with Active Contours
3.4.5 Filtering Regions
3.4.6 Algorithm Summary
Chapter 4  Experimental Results and Discussion
4.1 Experimental Equipment
4.1.1 Personal Digital Assistant (PDA)
4.2 Example Images
4.3 Simulation Results
4.3.1 User Interface
4.3.2 Method 1 (5×5 mask) Results
4.3.3 Method 2 (3×3 mask) Results
4.3.4 Method 3 (3×3 mask with region filtering) Results
4.4 Discussion
4.4.1 Difference Between Precision and Recall
4.4.2 Accuracy and Execution Speed
4.4.3 Different Test Sets
Chapter 5  Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References
References
Baek, Y., Lee, B., Han, D., Yun, S., & Lee, H. (2019). Character Region Awareness for Text Detection. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9357-9366. Long Beach, CA, USA.
Biao, L. (2007). A DataMatrix-based Mutant Code Design and Recognition Method Research. International Conference on Image and Graphics, pp. 570-574. Sichuan, China.
Byeon, Y.-H., & Kwak, K.-C. (2017). A Performance Comparison of Pedestrian Detection Using Faster RCNN and ACF. International Congress on Advanced Applied Informatics (IIAI-AAI), pp. 858-863. Hamamatsu, Japan.
Chai, D., & Hock, F. (2005). Locating and Decoding EAN-13 Barcodes from Images Captured by Digital Cameras. International Conference on Information Communications & Signal Processing, pp. 1595-1599. Bangkok, Thailand.
Chandan, G., Jain, A., Jain, H., & Mohana. (2018). Real Time Object Detection and Tracking Using Deep Learning and OpenCV. International Conference on Inventive Research in Computing Applications (ICIRCA), pp. 1305-1308. Coimbatore, India.
Chen, L., Zhang, Z., & Peng, L. (2018). Fast single shot multibox detector and its application on vehicle counting system. IET Intelligent Transport Systems, 12(10), pp. 1406-1413.
Chen, S., & Zhao, Q. (2019). Shallowing Deep Networks: Layer-Wise Pruning Based on Feature Representations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(12), pp. 3048-3056.
Chu, C.-H., Yang, D.-N., & Chen, M.-S. (2007). Extracting Barcodes from a Camera-Shaken Image on Camera Phones. IEEE International Conference on Multimedia and Expo, pp. 2062-2065. Beijing, China.
Creusot, C., & Munawar, A. (2015). Real-Time Barcode Detection in the Wild. IEEE Winter Conference on Applications of Computer Vision, pp. 239-245. Waikoloa, HI, USA.
Creusot, C., & Munawar, A. (2016). Low-computation egocentric barcode detector for the blind. IEEE International Conference on Image Processing (ICIP), pp. 2856-2860. Phoenix, AZ, USA.
Girshick, R. (2015). Fast R-CNN. IEEE International Conference on Computer Vision (ICCV), pp. 1440-1448. Santiago, Chile.
Han, K., Sun, M., Zhou, X., Zhang, G., Dang, H., & Liu, Z. (2017). A new method in wheel hub surface defect detection: Object detection algorithm based on deep learning. International Conference on Advanced Mechatronic Systems (ICAMechS), pp. 335-338. Xiamen, China.
Han, S., Pool, J., Tran, J., & Dally, W. (2015). Learning both Weights and Connections for Efficient Neural Networks. Advances in Neural Information Processing Systems, pp. 1135-1143. Cambridge, MA, USA.
Huang, R., Gu, J., Sun, X., Hou, Y., & Uddin, S. (2019). A Rapid Recognition Method for Electronic Components Based on the Improved YOLO-V3 Network. Electronics, 8(8), p. 825.
Jain, A., Bhattacharjee, S., & Chen, Y. (1992). On texture in document images. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 677-680. Champaign, IL, USA.
Kim, K., Cho, J., Pyo, J., Kang, S., & Kim, J. (2017). Dynamic Object Recognition Using Precise Location Detection and ANN for Robot Manipulator. International Conference on Control, Artificial Intelligence, Robotics & Optimization (ICCAIRO), pp. 237-241. Prague, Czech Republic.
Li, H., Kadav, A., Durdanovic, I., Samet, H., & Graf, H. P. (2017). Pruning filters for efficient ConvNets. International Conference on Learning Representations (ICLR), pp. 1-13. Toulon, France.
Liang, Y.-h., & Wang, Z.-y. (2006). A Skew Detection Method for 2D Bar Code Images Based on the Least Square Method. International Conference on Machine Learning and Cybernetics, pp. 3974-3977. Dalian, China.
Liu, Y., Yang, B., & Yang, J. (2008). Bar Code Recognition in Complex Scenes by Camera Phones. International Conference on Natural Computation, pp. 462-466. Jinan, China.
Luo, J.-H., Zhang, H., Zhou, H.-Y., Xie, C.-W., Wu, J., & Lin, W. (2019). ThiNet: Pruning CNN Filters for a Thinner Net. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(10), pp. 2525-2538.
Márquez-Neila, P., Baumela, L., & Alvarez, L. (2014). A Morphological Approach to Curvature-Based Evolution of Curves and Surfaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(1), pp. 2-17.
Redmon, J., & Farhadi, A. (2017). YOLO9000: Better, Faster, Stronger. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6517-6525. Honolulu, HI, USA.
Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You Only Look Once: Unified, Real-Time Object Detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779-788. Las Vegas, NV, USA.
Ren, S., He, K., Girshick, R., & Sun, J. (2017). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6), pp. 1137-1149.
Selvaraju, R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. IEEE International Conference on Computer Vision (ICCV), pp. 618-626. Venice, Italy.
Tropf, A., & Chai, D. (2006). Locating 1-D Bar Codes in DCT-Domain. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. II-741 - II-744. Toulouse, France.
Xie, L., Ahmad, T., Jin, L., Liu, Y., & Zhang, S. (2018). A New CNN-Based Method for Multi-Directional Car License Plate Detection. IEEE Transactions on Intelligent Transportation Systems, 19(2), pp. 507-517.
Zamberletti, A., Gallo, I., & Albertini, S. (2013). Robust Angle Invariant 1D Barcode Detection. IAPR Asian Conference on Pattern Recognition, pp. 160-164. Naha, Japan.
Zamberletti, A., Gallo, I., Carullo, M., & Binaghi, E. (2010). Neural Image Restoration for Decoding 1-D Barcodes using Common Camera Phones. International Conference on Computer Vision Theory and Applications, pp. 5-11. Angers, France.
Zhang, C., Wang, J., Han, S., Yi, M., & Zhang, Z. (2006). Automatic Real-Time Barcode Localization in Complex Scenes. International Conference on Image Processing, pp. 497-500. Atlanta, GA, USA.
Advisor: Chao-Min Wu (吳炤民)   Date of approval: 2021-01-22
