Abstract: | This study applies machine learning and deep learning to assist tongue diagnosis classification in Traditional Chinese Medicine (TCM). TCM places great emphasis on tongue diagnosis, but learning it is time-consuming and its criteria are difficult to standardize; machine learning enables fast and objective classification. The experiments are divided into feature data and image data.
The feature-data experiments perform binary classification of Qi Deficiency Syndrome using tongue features such as tongue body color, thickness of the tongue coating, and sublingual vein varicosity. Three feature sets are compared: the “Doc features” based on the tongue-diagnosis syndrome differentiation used by TCM practitioners, automatically selected “AI features,” and the combined “Doc&AI features,” each trained with Decision Tree, Random Forest, and Support Vector Machine classifiers.
The image-data experiments perform binary classification of Qi Deficiency Syndrome from RGB tongue images. The first experiment compares three approaches: Single-Stage Transfer Learning, Multi-Stage Transfer Learning, and Transfer Learning with Mixed Data, where clinical data serves as the “main dataset” and textbook images serve as the “supporting dataset.” Single-Stage Transfer Learning is conventional transfer learning; Multi-Stage Transfer Learning trains on the supporting dataset in stage 1 and on the main dataset in stage 2, performing two-stage transfer learning; Transfer Learning with Mixed Data blends the supporting dataset into the main dataset at a specific ratio and performs single-stage transfer learning. The second experiment adjusts the training set to find the optimal mixing percentage for a Transfer Learning model and a Teachable Machine model under the mixed-data approach.
The results show that for the feature data, Random Forest combined with Doc&AI features achieves the best performance (Qi Deficiency Syndrome: 88.89%). For the image data, the accuracies of the Single-Stage, Multi-Stage, and Mixed Data approaches are 92.96%, 95.93%, and 93.15%, respectively. In the second experiment, both models perform best when 85% of the supporting dataset is mixed in (Transfer Learning: 93.15%, Teachable Machine: 94.44%). The experiments show that the Doc&AI feature set not only improves accuracy but also preserves the model’s interpretability during feature selection.
Furthermore, when a large amount of supporting data is available, Multi-Stage Transfer Learning is more suitable, whereas Transfer Learning with Mixed Data is the better choice when the supporting dataset is small. An appropriate mixing ratio can improve the model’s accuracy, but an inappropriate ratio may reduce it. |
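The feature-data experiment compares three classifiers across three feature sets. The sketch below illustrates that comparison; it is only a minimal illustration, since the abstract does not specify the actual schema: the file name, column names, feature lists, and train/test split are all hypothetical.

```python
# Sketch of the feature-data experiment: binary classification of Qi Deficiency
# Syndrome from tabular tongue features, comparing three feature subsets
# (Doc, AI, Doc&AI) with three classifiers. All file and column names are
# hypothetical placeholders.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

df = pd.read_csv("tongue_features.csv")      # hypothetical clinical feature table
label = df["qi_deficiency"]                  # 1 = Qi Deficiency Syndrome, 0 = other

feature_sets = {
    "Doc":    ["tongue_body_color", "coating_thickness", "sublingual_vein_varicosity"],
    "AI":     ["ai_feat_1", "ai_feat_2", "ai_feat_3"],   # features picked by automatic selection
    "Doc&AI": ["tongue_body_color", "coating_thickness",
               "sublingual_vein_varicosity", "ai_feat_1", "ai_feat_2"],
}
models = {
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM":          SVC(kernel="rbf"),
}

for set_name, cols in feature_sets.items():
    X_tr, X_te, y_tr, y_te = train_test_split(
        df[cols], label, test_size=0.2, stratify=label, random_state=0)
    for model_name, model in models.items():
        model.fit(X_tr, y_tr)
        acc = accuracy_score(y_te, model.predict(X_te))
        print(f"{set_name:8s} {model_name:12s} accuracy = {acc:.4f}")
```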
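The image-data experiments rely on mixing the supporting (textbook) dataset into the main (clinical) training set before fine-tuning a pretrained network. The sketch below shows one way to implement Transfer Learning with Mixed Data; the backbone (MobileNetV2), the 85% mixing ratio, image size, and directory layout are assumptions, as the abstract does not name the actual architecture or preprocessing.

```python
# Sketch of "Transfer Learning with Mixed Data": a fraction of the supporting
# (textbook) images is blended into the main (clinical) training set, then a
# single fine-tuning stage is run on a pretrained backbone.
import tensorflow as tf

IMG_SIZE, MIX_RATIO = (224, 224), 0.85   # fraction of the supporting set mixed in (assumed)

main_ds = tf.keras.utils.image_dataset_from_directory(
    "data/clinical", image_size=IMG_SIZE, batch_size=None)    # main dataset
support_ds = tf.keras.utils.image_dataset_from_directory(
    "data/textbook", image_size=IMG_SIZE, batch_size=None)    # supporting dataset

# Take MIX_RATIO of the supporting images and concatenate them with the main set.
n_support = int(support_ds.cardinality().numpy() * MIX_RATIO)
train_ds = (main_ds.concatenate(support_ds.take(n_support))
            .shuffle(2048).batch(32).prefetch(tf.data.AUTOTUNE))

base = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet")
base.trainable = False                                   # freeze the pretrained backbone
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),   # MobileNetV2 expects [-1, 1] inputs
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),      # binary: Qi Deficiency vs. other
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=10)

# Multi-Stage Transfer Learning would instead call model.fit() twice:
# first on the supporting dataset (stage 1), then on the main dataset (stage 2).
```

A smaller supporting set benefits from mixing because both data sources contribute to a single fine-tuning stage, whereas a large supporting set can support a dedicated first stage of its own, which matches the abstract’s conclusion about when each approach is preferable.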