Abstract (English) |
I employed the YOLOv4 technique to achieve real-time detection and segmentation of dynamic images, and I present my results on tongue feature recognition. Tracking dynamic tongue images is challenging because the pixel distributions of the lips and cheeks are similar to that of the tongue. My research therefore aims to develop a new detection and segmentation technique for dynamic tongue images. In biomedical applications, this technique can generate "de-identified" tongue images after segmentation, so that Chinese medicine physicians can apply tongue diagnosis to identify symptoms or find features related to diseases. YOLOv4 and R-CNN-based methods are two mainstream techniques in computer vision. R-CNN-based methods first predict the possible locations of multiple objects and then determine the type of each target object after obtaining its position. They therefore achieve high recognition accuracy at the cost of high computational complexity and long processing time. YOLOv4, on the other hand, predicts the class and the location of the objects in an input image simultaneously, which significantly reduces computational complexity. YOLOv4 also improves significantly in accuracy over YOLOv3. The relevant literature reports that, on Tesla V100 GPU hardware at 54 FPS (frames per second), YOLOv4 achieves 41.2% AP (average precision). Under the same hardware conditions, the R-CNN technique reaches 42.8% AP, but at only 9 FPS. I therefore adopted the YOLOv4 architecture as the basic framework for real-time image detection and segmentation in my research. YOLOv4 not only determines object locations but also provides the corresponding coordinates, which helps to achieve edge detection and segmentation in practice. 
To detect the tongue at the precise angle required, I also added negative sampling to the model, and I then proposed a new framework that uses a double-backbone structure. Preliminary results show that 7-10 FPS can be obtained on Windows, compiled with Visual Studio, with a GTX 1050 Ti GPU and 16 GB of RAM. In terms of detection accuracy, the framework precisely produces images of the tongue at the required angle, without skewed views. The confidence scores that YOLOv4 reports for the predicted results also exceed 90%. |
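The use of YOLOv4 box coordinates and confidence scores described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes detections arrive as `(label, confidence, (cx, cy, w, h))` tuples with center-based pixel boxes, and the function names, the `"tongue"` label, and the 0.9 threshold are hypothetical choices made for illustration. The sketch keeps the most confident tongue detection above the threshold and crops the image to that box, so everything outside the tongue region is discarded.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed detection layout: (label, confidence, (center_x, center_y, width, height)) in pixels.
Detection = Tuple[str, float, Tuple[float, float, float, float]]


def box_to_corners(box: Tuple[float, float, float, float],
                   width: int, height: int) -> Tuple[int, int, int, int]:
    """Convert a center-based box to (left, top, right, bottom), clamped to the image."""
    cx, cy, w, h = box
    left = max(0, int(round(cx - w / 2)))
    top = max(0, int(round(cy - h / 2)))
    right = min(width, int(round(cx + w / 2)))
    bottom = min(height, int(round(cy + h / 2)))
    return left, top, right, bottom


def crop_best_tongue(image: Sequence[Sequence[int]],
                     detections: Sequence[Detection],
                     min_conf: float = 0.9,
                     label: str = "tongue") -> Optional[List[List[int]]]:
    """Return the crop of the most confident `label` detection, or None if none qualifies."""
    if not image or not image[0]:
        return None
    # Keep only detections of the wanted class that meet the confidence threshold.
    candidates = [d for d in detections if d[0] == label and d[1] >= min_conf]
    if not candidates:
        return None
    _, _, box = max(candidates, key=lambda d: d[1])
    height, width = len(image), len(image[0])
    left, top, right, bottom = box_to_corners(box, width, height)
    # Slice rows, then columns; pixels outside the box are dropped.
    return [list(row[left:right]) for row in image[top:bottom]]


if __name__ == "__main__":
    # Toy 6x8 "image" whose pixel value encodes its (row, column) position.
    frame = [[r * 10 + c for c in range(8)] for r in range(6)]
    found = [
        ("tongue", 0.95, (4.0, 3.0, 4.0, 2.0)),
        ("tongue", 0.50, (1.0, 1.0, 2.0, 2.0)),
        ("lip", 0.99, (4.0, 1.0, 6.0, 2.0)),
    ]
    print(crop_best_tongue(frame, found))
```

Returning `None` when no detection clears the threshold lets a caller skip frames in which the tongue is not shown clearly, instead of cropping an unreliable region.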
References |
[1] Jiang, B., Liang, X., Chen, Y. et al. Integrating next-generation sequencing and traditional tongue diagnosis to determine tongue coating microbiome. Sci Rep 2, 936 (2012)
[2] Y. Hsu, Y. Chen, L. Lo and J. Y. Chiang, "Automatic tongue feature extraction," 2010 International Computer Symposium (ICS2010), 2010, pp. 936-941, doi: 10.1109/COMPSYM.2010.5685377.
[3] Lo LC, Cheng TL, Chiang JY, Damdinsuren N. Breast cancer index: a perspective on tongue diagnosis in Traditional Chinese Medicine. J Tradit Complement Med. 2013 Jul;3(3):194-203. doi: 10.4103/2225-4110.114901
[4] Qi Z, Tu LP, Chen JB, Hu XJ, Xu JT, Zhang ZF. The Classification of Tongue Colors with Standardized Acquisition and ICC Profile Correction in Traditional Chinese Medicine. Biomed Res Int. 2016;2016:3510807. doi: 10.1155/2016/3510807
[5] Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao, “YOLOv4: Optimal Speed and Accuracy of Object Detection”, arXiv:2004.10934 [cs.CV]
[6] Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, Youngjoon Yoo, “CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features”, arXiv:1905.04899 [cs.CV]
[7] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu, Ping-Yang Chen, Jun-Wei Hsieh, “CSPNet: A New Backbone that can Enhance Learning Capability of CNN”, arXiv:1911.11929 [cs.CV]
[8] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, “Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition”, arXiv:1406.4729 [cs.CV]
[9] Jiahui Yu, Yuning Jiang, Zhangyang Wang, Zhimin Cao, Thomas Huang, “UnitBox: An Advanced Object Detection Network”, arXiv:1608.01471 [cs.CV]
[10] Hamid Rezatofighi, Nathan Tsoi, JunYoung Gwak, Amir Sadeghian, Ian Reid, Silvio Savarese, “Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression”, arXiv:1902.09630 [cs.CV]
[11] Zhaohui Zheng, Ping Wang, Wei Liu, Jinze Li, Rongguang Ye, Dongwei Ren, “Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression”, arXiv:1911.08287 [cs.CV]
[12] https://github.com/tzutalin/labelImg
[13] Kang Kim, Hee Seok Lee, “Probabilistic Anchor Assignment with IoU Prediction for Object Detection”, arXiv:2007.08103 [cs.CV] |