This study combines a text mining and retrieval technique, Term Frequency-Inverse Document Frequency (TF-IDF), with compatibility and applies the combination to the “Regulations of National Central University and Extensions of Off-campus Regulations”. The system is built on a cloud platform and classifies the regulations. TF-IDF on its own provides only a single quantitative measure and cannot offer a diverse set of choices. By pairing compatibility with cosine similarity, hierarchical clustering, and related techniques, a single regulation can produce different results under different compatibilities. The resulting classification gives users a wider range of options and helps them find the related regulations that fit their needs.
Keywords: text mining, TF-IDF, cosine similarity, hierarchical clustering
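The pipeline named in the abstract (TF-IDF weighting, cosine similarity, hierarchical clustering) can be illustrated with a minimal sketch. It is not the study's implementation: the four toy documents, the whitespace tokenization, the average-linkage rule, and the function names `tf_idf`, `cosine`, and `cluster` are all assumptions made for illustration. Real regulation text in Chinese would need word segmentation first. The compatibility component is left out because the abstract does not define how it is computed.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(vectors, k):
    """Agglomerative clustering (average linkage, assumed) down to k clusters."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        # Mean pairwise cosine similarity between two clusters.
        total = sum(cosine(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        pairs = (
            (x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        # Merge the two most similar clusters.
        a, b = max(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical toy "regulations", tokenized on whitespace.
docs = [
    "student leave application rules".split(),
    "student leave approval rules".split(),
    "faculty research funding guidelines".split(),
    "faculty research grant guidelines".split(),
]
vectors = tf_idf(docs)
groups = cluster(vectors, 2)
print(groups)  # [[0, 1], [2, 3]]
```

Here the two student-leave documents share three of four terms and end up in one cluster, while the two faculty-research documents form the other. Changing the linkage rule or the term weighting would change the grouping, which is one way a single regulation could be classified differently under different settings.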