dc.description.abstract | This study applies machine learning and deep learning techniques to classify tongue diagnoses in Traditional Chinese Medicine (TCM). TCM places strong emphasis on tongue diagnosis, which is time-consuming and difficult to standardize; machine learning allows the classification to be performed quickly and objectively. The experiments are divided into feature data and image data. For the feature data, the goal is binary classification of “Qi Deficiency Syndrome” using tongue features such as tongue color, thickness of the tongue coating, and tongue curvature. Models are trained on the “Doc features” used by Chinese medicine practitioners and compared with “AI features” and combined “Doc&AI features” using Decision Tree, Random Forest, and Support Vector Machine classifiers. For the image data, the goal is binary classification of “Qi Deficiency Syndrome” using RGB images. The first experiment compares three approaches: Single-Stage Transfer Learning, Multi-Stage Transfer Learning, and Transfer Learning with Mixed Data. Clinical data serves as the “main dataset,” while textbook images serve as the “supporting dataset.” Single-stage transfer learning follows the conventional transfer learning procedure; multi-stage transfer learning trains on the supporting dataset in stage 1 and on the main dataset in stage 2, performing two-stage transfer learning; and transfer learning with mixed data blends the supporting dataset into the main dataset at a specific ratio before performing single-stage transfer learning. The second experiment adjusts the training set to find the optimal mixing percentage for both a transfer learning model and a Teachable Machine model trained with mixed data. The experimental results show that for the feature data, Random Forest combined with Doc&AI features achieves the best performance on Qi Deficiency Syndrome (accuracy: 88.89%). In the image-data experiments, the accuracies of the single-stage, multi-stage, and mixed-data approaches are 92.96%, 95.93%, and 93.15%, respectively. In the second experiment, both models perform best when the supporting dataset is incorporated at 85% (Transfer Learning: 93.15%; Teachable Machine: 94.44%). The experiments reveal that combining Doc&AI features improves accuracy while retaining the model’s interpretability during feature selection. Furthermore, multi-stage transfer learning is more suitable when a large amount of supporting data is available, whereas transfer learning with mixed data is the better choice when supporting data is limited. An appropriate mixing ratio can enhance the model’s accuracy, but an inappropriate ratio may reduce it. | en_US |