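Abstract: | With advances in technology, customers and consumers can use many different channels to publish or share their views on the quality of a product and the strengths and weaknesses of a service. When a negative review appears, many other internet users tend to respond to it, and the issue can sometimes trigger a ripple effect that draws public attention. We refer to such negative reviews as customer complaints. At present, most service companies handle customer complaints on social platforms by having customer service center staff manually collect the complaint messages and then process them, which is often too slow to be timely. Complaint messages usually contain a large amount of usable, extractable information, and they typically express dissatisfaction or a wish for the product to improve, so analyzing them is important for an organization. We use review messages collected from the Google Play platform as the dataset for this study; it covers January 1, 2014 to April 30, 2020 and contains 31,401 records. We process these unstructured complaint messages with supervised machine learning: we perform text mining and feature extraction, analyze the feature words with the Orange mining tool, and build a keyword lexicon (bag of words), followed by modeling (topic model) and labeling. We then apply four algorithms commonly used in classification and prediction research, namely Naïve Bayes (NB), k-nearest neighbors (KNN), Random Forest (RF), and support vector machine (SVM), to analyze the complaints and predict their problem category. The modeling is organized into six main topic models. We find that for six-class (multi-class) classification, a corpus of compound parts of speech yields better prediction accuracy than a corpus of a single part of speech, whereas for binary classification, the single-part-of-speech corpus of action transitive verbs yields better accuracy. These results confirm that the approach can effectively predict customer complaint categories and can save the time spent classifying complaints manually. Keywords: text mining, classification prediction, supervised machine learning; With the advancement of technology, customers and consumers can publish or share the pros and cons of a product or service through a variety of channels. When negative customer opinions or reviews appear, many internet users follow with responses, and the issue can sometimes cause a ripple effect and attract public attention. These negative comments can be called customer complaints. At present, most service companies handle customer complaints on social platforms by having customer service center personnel manually collect and process them, which is often too slow to be timely. Customer complaint messages usually contain a large amount of useful information. They typically express dissatisfaction and a hope for improvement, so analyzing them is very important for the organization. The dataset used in this research consists of user reviews posted on the Google Play platform between January 1, 2014 and April 30, 2020, 31,401 records in total.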
In this study, customer complaint analysis and problem category prediction are carried out with supervised machine learning methods, for instance Naive Bayes. Feature words are extracted from the unstructured user complaints and analyzed with the Orange mining tool, and a keyword vocabulary is then built, modeled, and labeled, covering six main dimensions. The results show that multi-class classification achieves higher prediction accuracy on a corpus of compound parts of speech than on a corpus of a single part of speech, whereas binary classification achieves higher accuracy on a single-part-of-speech corpus of transitive verbs. The results also indicate that customer complaints can be classified effectively, saving the time needed for manual classification. Keywords: text mining, classification prediction, supervised learning |
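The abstract's classification step (bag-of-words features fed to a supervised classifier such as Naive Bayes) can be illustrated with a minimal sketch. This is not the study's pipeline, which used the Orange tool on Google Play reviews; the class name `NaiveBayesSketch`, the toy English reviews, and the labels `crash` and `login` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Simplifying assumption: whitespace tokenization of lowercase English text.
    # Real review text would need proper word segmentation and POS filtering.
    return text.lower().split()


class NaiveBayesSketch:
    """Multinomial Naive Bayes over bag-of-words counts, with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # number of documents per class
        self.word_counts = defaultdict(Counter)   # per-class bag of words
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior of the class.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(doc):
                if word not in self.vocab:
                    continue  # words never seen in training are ignored
                # Add-one (Laplace) smoothed log likelihood of the word.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data; the labels are illustrative, not the study's categories.
train_docs = [
    "app crashes when i open it",
    "crashes every time after update",
    "cannot login to my account",
    "login fails with wrong password error",
]
train_labels = ["crash", "crash", "login", "login"]

model = NaiveBayesSketch().fit(train_docs, train_labels)
predicted = model.predict("the app crashes after update")
```

The same bag-of-words counts could be passed to KNN, Random Forest, or SVM classifiers to compare accuracy across algorithms and across corpora built from different parts of speech, which is the kind of comparison the abstract reports.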