Abstract: | Tongue diagnosis is an important indicator in traditional Chinese medicine (TCM) syndrome differentiation and treatment, and it is a simple, non-invasive examination. In TCM, the sublingual vein (SV) serves as a sign for judging stasis and blood stasis syndrome. Many studies have shown that the degree of SV stasis is positively correlated with disease severity. However, SV diagnosis often varies between physicians because of subjective factors such as experience, color perception, and mindset, which leads to different interpretations. The goal of this research is to develop a machine-learning-based computer-aided system to assist in diagnosing the degree of SV stasis in patients. We treat the task as a binary classification problem with two classes, mild and severe. We tested many supervised machine learning models, including support vector machines (SVM), K-nearest neighbors, decision trees, and RidgeClassifier, and finally selected thirteen models for the experiments. To improve accuracy and reduce training time, we use two feature extraction techniques: Principal Component Analysis (PCA) combined with Sliced Inverse Regression (SIR), and a Convolutional Neural Network (CNN).
In our numerical experiments, the average accuracy across the models is only about 60.5% when the original photos are used as grayscale images. After removing noise and retaining all three red, green, and blue (RGB) channels, the average accuracy reaches 81%, and the best of the thirteen models, SVM_linear, achieves 85.5%. In addition, PCA+SIR dimension reduction maintains accuracy while cutting training time to about 1/35 of the original, which saves computing cost. Finally, with CNN feature extraction, RidgeClassifier reaches the highest accuracy among the thirteen models, 87.5%. Combined with the confusion matrix and the receiver operating characteristic (ROC) curve, these results allow us to provide physicians with a scientific tool to assist in assessing the severity of SV stasis. |
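The evaluation quantities named in the abstract (accuracy, confusion matrix, ROC curve) can be illustrated with a minimal standard-library sketch. This is not the thesis code: the label encoding (0 = mild, 1 = severe), the function names, the 0.5 threshold, and the toy scores are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed label encoding (not from the source): 0 = mild, 1 = severe.


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted class matches the true class."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tn + tp) / len(y_true)


def roc_curve(y_true: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return ROC points (FPR, TPR), lowering the threshold one distinct score at a time."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        s = scores[order[i]]
        # Consume every sample tied at this score before emitting a point.
        while i < len(order) and scores[order[i]] == s:
            if y_true[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data (hypothetical, not from the study): classifier scores for "severe".
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold of 0.5

print(confusion_matrix(y_true, y_pred))
print(accuracy(y_true, y_pred))
print(auc(roc_curve(y_true, scores)))
```

In a real experiment the scores would come from each trained model (for example a decision function or a predicted probability), and the same three quantities would be computed on held-out images for every model being compared.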