Detection and Segmentation of Dynamic Tongue Images Using YOLO Technique in a Standardized Environment


Name

王珅愷 (Shen-Kai Wang)

Department

Graduate Institute of Biomedical Engineering

Thesis Title

Detection and Segmentation of Dynamic Tongue Images Using YOLO Technique in a Standardized Environment
(使用YOLO架構在標準環境中進行動態舌頭影像偵測及切割)



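This electronic thesis is authorized for immediate open access.
The open-access electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, so as to avoid violating the law.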

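Abstract (Chinese)

This study aims to use YOLOv4 to achieve real-time detection and segmentation of dynamic images, with tongue feature recognition as the demonstration target. Tracking dynamic tongue images was chosen as the challenge for the detection method developed in this thesis because the pixel distributions of the lips and cheeks surrounding the tongue are very similar to that of the tongue itself. On the biomedical application side, this study was carried out in collaboration with traditional Chinese medicine physicians at Landseed International Hospital (聯新國際醫院) in Taoyuan and with the smart healthcare laboratory (智慧醫療實驗室). Segmented, "de-identified" tongue images are provided to the physicians, who use tongue diagnosis to identify patients' symptoms and to label data for the algorithm.

In terms of technique, YOLOv4 differs from R-CNN, currently a popular approach in computer vision. R-CNN first predicts multiple locations where objects may exist and, once the target locations are obtained, classifies each target in turn. Its recognition accuracy is therefore very high, but at the cost of higher time complexity. YOLOv4 instead processes the input image into a format that carries both classification and location information before running prediction, so the class and location of each object are output simultaneously and computation time drops substantially. In accuracy, YOLOv4 also improves markedly on YOLOv3. According to the YOLOv4 literature, on a Tesla V100 GPU and at 54 FPS (frames per second), YOLOv4 attains an AP (average precision) of 41.2%. On the same hardware R-CNN attains an AP of 42.8%, but at only 9 FPS. This study therefore adopts YOLOv4 as the base architecture for real-time image detection and segmentation.

Building on this, beyond using YOLOv4 to locate the object, this study further uses the coordinates output by YOLOv4 to perform edge detection and segmentation. To detect precisely the images that meet the requirements of traditional Chinese medicine examination, this study provides a standardized image acquisition box, adds negative samples to the YOLO training, and constructs a double backbone structure for YOLOv4. Current experimental results show that, compiled with Visual Studio on Windows and run on a GTX 1050 Ti with 16 GB of RAM, the system achieves 7 to 10 FPS. The detections also better match the tongue angles that the physicians require: images with skewed angles are removed from the data by the negative-sample backbone. As for accuracy, measured by the confidence score provided by YOLOv4, all detections score above 90%.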
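
The abstract describes using the box coordinates output by YOLOv4 as the starting point for edge detection and segmentation. A minimal sketch of that first step follows, assuming boxes in the normalized center format (cx, cy, w, h) used by Darknet-style YOLO label files and an image held as a nested list of pixel rows. The function names `yolo_to_pixel_box` and `crop_region` are hypothetical and are not taken from the thesis.

```python
from typing import List, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def yolo_to_pixel_box(cx: float, cy: float, w: float, h: float,
                      img_w: int, img_h: int) -> Box:
    """Convert a normalized center-format box to pixel corners.

    Assumption: cx, cy, w, h are fractions of the image size, as in
    Darknet-style label files. The result is clamped to the image.
    """
    left = round((cx - w / 2) * img_w)
    top = round((cy - h / 2) * img_h)
    right = round((cx + w / 2) * img_w)
    bottom = round((cy + h / 2) * img_h)
    # Clamp so that boxes touching the border never index outside the image.
    left = min(max(left, 0), img_w)
    right = min(max(right, 0), img_w)
    top = min(max(top, 0), img_h)
    bottom = min(max(bottom, 0), img_h)
    return left, top, right, bottom


def crop_region(image: List[List[int]], box: Box) -> List[List[int]]:
    """Return the sub-image inside the box; image is indexed as image[y][x]."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


# Toy 8x8 "frame" whose pixel value encodes its position (y * 10 + x).
frame = [[y * 10 + x for x in range(8)] for y in range(8)]
box = yolo_to_pixel_box(0.5, 0.5, 0.5, 0.5, img_w=8, img_h=8)
patch = crop_region(frame, box)
print(box)       # (2, 2, 6, 6)
print(patch[0])  # [22, 23, 24, 25]
```

In practice the cropped patch would be handed to whatever edge detection step follows. This record does not say which one the thesis uses.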

Abstract (English)

I employed the YOLOv4 technique to achieve real-time detection and segmentation of dynamic images, using tongue feature recognition to present the results. Tracking dynamic tongue images is challenging because the pixel distributions of the lips and cheeks are similar to that of the tongue. My research therefore aims to develop a detection and segmentation technique for dynamic tongue images. In terms of biomedical applications, the technique generates "de-identified" tongue images after segmentation, which physicians of traditional Chinese medicine can use in tongue diagnosis to identify symptoms or find features related to diseases. YOLOv4 and R-CNN-based methods are two mainstream techniques in computer vision. R-CNN-based methods first predict the possible locations of multiple objects and then determine the class of each target once its position is obtained. Their recognition accuracy is therefore high, at the cost of high computational complexity and time consumption. YOLOv4, by contrast, predicts the class and the location of objects in an input image simultaneously, which significantly reduces computation time. YOLOv4 also improves significantly in accuracy over YOLOv3. The YOLOv4 literature reports that, on a Tesla V100 GPU at 54 FPS (frames per second), YOLOv4 reaches 41.2% AP (average precision). Under the same hardware conditions, R-CNN reaches 42.8% AP, but at only 9 FPS. I therefore adopted the YOLOv4 architecture as the basic framework for real-time image detection and segmentation. YOLOv4 not only determines object locations but also provides the corresponding coordinates, which I use to carry out edge detection and segmentation.
To detect tongues at the precise angle required, I also added negative samples to the training of the model and proposed a new framework with a double backbone structure. The preliminary results show that 7 to 10 FPS can be obtained on Windows, compiled with Visual Studio, on a GTX 1050 Ti with 16 GB of RAM. In terms of detection, the model yields images at the required tongue angle and excludes skewed ones. Measured by the confidence score provided by YOLOv4, the predictions also score above 90%.
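
The abstract reports training with negative samples so that skewed tongue images are rejected, and confidence scores above 90%. The double backbone structure itself is not specified in this record, so the following is only a sketch of the general idea under stated assumptions: the detector emits a positive class and a negative class, and a frame is kept only when the positive class is the top detection with high confidence. The class names `tongue` and `skewed_tongue`, the `Detection` type, the function `pick_valid_tongue`, and the 0.9 cutoff are illustrative assumptions, not settings documented by the author.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    label: str         # hypothetical class name, e.g. "tongue" or "skewed_tongue"
    confidence: float  # detector confidence score in [0, 1]


def pick_valid_tongue(detections: List[Detection],
                      positive_label: str = "tongue",
                      min_confidence: float = 0.9) -> Optional[Detection]:
    """Return the top detection if it is a confident positive, else None.

    Assumption: a frame whose strongest detection is the negative class
    (a skewed tongue) or whose score is below the cutoff is discarded.
    """
    if not detections:
        return None
    best = max(detections, key=lambda d: d.confidence)
    if best.label != positive_label or best.confidence < min_confidence:
        return None
    return best


# Three toy frames: a good one, a skewed one, and a low-confidence one.
frame_ok = [Detection("tongue", 0.95), Detection("skewed_tongue", 0.30)]
frame_skewed = [Detection("tongue", 0.40), Detection("skewed_tongue", 0.92)]
frame_low = [Detection("tongue", 0.70)]

print(pick_valid_tongue(frame_ok))      # Detection(label='tongue', confidence=0.95)
print(pick_valid_tongue(frame_skewed))  # None
print(pick_valid_tongue(frame_low))     # None
```

Filtering after detection is only one way to use a negative class. The thesis may instead handle the rejection inside the network, which this sketch does not attempt to reproduce.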

Keywords (Chinese)

★ Image detection
★ Segmentation

Keywords (English)

★ YOLO
★ Detection
★ Segmentation

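Table of Contents

Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
1. Introduction
1-1 Evolution of Tongue Diagnosis
1-2 Purpose and Improvements of This Thesis
2. YOLO (You Only Look Once)
2-1 Introduction to YOLO Theory
2-1-1 Input
2-1-2 Backbone
2-1-3 Neck
2-2 YOLO Loss Functions
2-2-1 IOU-loss
2-2-2 GIOU-loss
2-2-3 DIOU-loss
2-2-4 CIOU-loss
2-3 Summary of the Theory
3. Research Content and Methods
3-1 Training Workflow and Program Design
3-1-1 Application of Negative Samples
3-2 Photo Box Design
4. Results and Analysis
5. Conclusion
References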

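List of Figures

Figure 1: Photograph of the tongue diagnosis gun
Figure 2: Example of incorrect tongue protrusion
Figure 3: YOLOv4 architecture
Figure 4: Mosaic example
Figure 5: CSPNet flowchart
Figure 6: CSPNet architecture in YOLOv4
Figure 7: Illustration of the CNN problem
Figure 8: SPP architecture in YOLOv4
Figure 9: FPN architecture
Figure 10: PAN architecture
Figure 11: Differences between the standard PAN and the YOLOv4 PAN
Figure 12: IOU illustration
Figure 13: IOU-loss overlap problem
Figure 14: GIOU illustration
Figure 15: Difference between GIOU-loss and IOU-loss
Figure 16: GIOU-loss overlap problem
Figure 17: GIOU-loss problem with identical minimum enclosing boxes
Figure 18: DIOU-loss illustration
Figure 19: Correct tongue angle
Figure 20: Tongue labeling example
Figure 21: Training results
Figure 22: Video test results
Figure 23: Real-time webcam detection results
Figure 24: Running without a display window
Figure 25: Saving the bounding boxes
Figure 26: Test results for different tongue angles
Figure 27: Application of positive and negative samples
Figure 28: Positive and negative samples
Figure 29: Label files for positive and negative samples
Figure 30: Positive and negative sample architecture
Figure 31: Results of the second training
Figure 32: Results of the second training for skewed angles
Figure 33: Reference example of a photo box
Figure 34: Photo box design drawing
Figure 35: Fixture design and photograph
Figure 36: Photograph of the photo box
Figure 37: Lamp tube positions
Figure 38: Photographing workflow
Figure 39: Case enrollment workflow
Figure 40: Detection and segmentation results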

References

[1] Jiang, B., Liang, X., Chen, Y. et al. Integrating next-generation sequencing and traditional tongue diagnosis to determine tongue coating microbiome. Sci Rep 2, 936 (2012).
[2] Y. Hsu, Y. Chen, L. Lo and J. Y. Chiang, "Automatic tongue feature extraction," 2010 International Computer Symposium (ICS2010), 2010, pp. 936-941, doi: 10.1109/COMPSYM.2010.5685377.
[3] Lo LC, Cheng TL, Chiang JY, Damdinsuren N. Breast cancer index: a perspective on tongue diagnosis in Traditional Chinese Medicine. J Tradit Complement Med. 2013 Jul;3(3):194-203. doi: 10.4103/2225-4110.114901.
[4] Qi Z, Tu LP, Chen JB, Hu XJ, Xu JT, Zhang ZF. The Classification of Tongue Colors with Standardized Acquisition and ICC Profile Correction in Traditional Chinese Medicine. Biomed Res Int. 2016;2016:3510807. doi: 10.1155/2016/3510807.
[5] Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao, "YOLOv4: Optimal Speed and Accuracy of Object Detection", arXiv:2004.10934 [cs.CV]
[6] Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, Youngjoon Yoo, "CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features", arXiv:1905.04899 [cs.CV]
[7] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu, Ping-Yang Chen, Jun-Wei Hsieh, "CSPNet: A New Backbone that can Enhance Learning Capability of CNN", arXiv:1911.11929 [cs.CV]
[8] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition", arXiv:1406.4729 [cs.CV]
[9] Jiahui Yu, Yuning Jiang, Zhangyang Wang, Zhimin Cao, Thomas Huang, "UnitBox: An Advanced Object Detection Network", arXiv:1608.01471 [cs.CV]
[10] Hamid Rezatofighi, Nathan Tsoi, JunYoung Gwak, Amir Sadeghian, Ian Reid, Silvio Savarese, "Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression", arXiv:1902.09630 [cs.CV]
[11] Zhaohui Zheng, Ping Wang, Wei Liu, Jinze Li, Rongguang Ye, Dongwei Ren, "Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression", arXiv:1911.08287 [cs.CV]
[12] https://github.com/tzutalin/labelImg
[13] Kang Kim, Hee Seok Lee, "Probabilistic Anchor Assignment with IoU Prediction for Object Detection", arXiv:2007.08103 [cs.CV]

Advisor

陳健章 (Chien-Chang Chen)

Approval Date

2021-8-10
