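With the rapid growth of the Internet, the rise of e-commerce, and the widespread adoption of Web 2.0 technologies, more and more people express their opinions about products and services online. Many discussion forums, professional review sites (for example epinions.com and Amazon), and personal blogs also give users a place to voice their views. Online reviews have thus become an important source of reference information for both buyers and sellers. However, online reviews usually mix positive and negative opinions, and extracting useful information from them by hand takes considerable time and effort. How to aggregate and analyze large volumes of online text automatically, especially user reviews rich in semantic information, is therefore an important research issue in opinion mining.

A review of prior work on opinion mining shows that feature representation is used to capture the characteristics of online reviews, and that feature selection supplies the input from which a classification model learns. We found that the feature representation most commonly adopted in review classification is the frequency of single words. For a classifier, this type of representation tends to produce excessively high dimensionality or added noise, which degrades classification performance. This study therefore improves the feature representation by using feature-opinion pairs as the features of the vector space model, so that the representation carries more semantic information.

The proposed feature representation builds on supervised learning algorithms and classifies reviews according to their characteristics. Product and service features (feature) and user opinions (opinion) extracted from the reviews are combined into feature-opinion pairs, from which the vector space model is built. A support vector machine (SVM) serves as the classifier and is evaluated on the dataset we collected. The experimental results show that the proposed method effectively reduces the dimensionality of the vector space model and improves classification accuracy.

The emergence of the Internet has created spaces (e.g., epinions.com, amazon.com) where users can freely express opinions and exchange experiences about products, services, and public issues. A great amount of referral information can now be obtained from a variety of sources, including product profiles, recommendations, and expert opinions. However, identifying the semantic orientation of this referral information requires considerable human effort, which has motivated research on opinion mining. In prior studies of opinion mining, feature representation has been a key component. Bag-of-words is one of the most popular feature representations; it describes review content as a set of single words. However, a bag-of-words model applied to online reviews usually lacks semantic information and significantly increases the vector dimension, which reduces the performance of machine learning classifiers. This study proposes a modified feature representation method for building the vector space model. Feature-opinion pairs are extracted from product features and user comments at the sentence level. We use a support vector machine as the classification method to test our dataset.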
The experiments indicate that the proposed method not only increases classification accuracy but also reduces time cost by using fewer dimensions. Finally, we expect that our system can help address the high-dimensionality problem in review classification.
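
The contrast between a bag-of-words vocabulary and a feature-opinion pair vocabulary can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the procedure used in this study: the two small lexicons (`PRODUCT_FEATURES`, `OPINION_WORDS`), the two sample reviews, and the proximity rule that pairs a feature with opinion words at most two tokens away within the same sentence are all hypothetical. The resulting count vectors are the kind of input a support vector machine would be trained on; the classifier itself is left out.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons for illustration only; a real system would
# extract product features and opinion words from the review corpus.
PRODUCT_FEATURES = {"battery", "screen", "price", "service"}
OPINION_WORDS = {"great", "poor", "sharp", "high", "slow"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def extract_pairs(review, window=2):
    """Pair each product feature with opinion words found at most
    `window` tokens away in the same sentence (an assumed heuristic)."""
    pairs = []
    for sentence in re.split(r"[.!?]+", review):
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            if token not in PRODUCT_FEATURES:
                continue
            lo, hi = max(0, i - window), i + window + 1
            for neighbor in tokens[lo:hi]:
                if neighbor in OPINION_WORDS:
                    pairs.append((token, neighbor))
    return pairs


def vectorize(docs_terms):
    """Build a vocabulary over all documents and one count vector per document."""
    vocab = sorted({term for terms in docs_terms for term in terms})
    index = {term: j for j, term in enumerate(vocab)}
    vectors = []
    for terms in docs_terms:
        vec = [0] * len(vocab)
        for term, count in Counter(terms).items():
            vec[index[term]] = count
        vectors.append(vec)
    return vocab, vectors


# Made-up sample reviews, not taken from the study's dataset.
reviews = [
    "The battery is great. The screen looks sharp and the price is high.",
    "Poor battery life and slow service, but the screen is sharp.",
]

bow_vocab, bow_vectors = vectorize([tokenize(r) for r in reviews])
pair_vocab, pair_vectors = vectorize([extract_pairs(r) for r in reviews])

print("bag-of-words dimensions:", len(bow_vocab))
print("feature-opinion pair dimensions:", len(pair_vocab))
print("pair vocabulary:", pair_vocab)
```

On these toy reviews the pair vocabulary is much smaller than the single-word vocabulary, and each dimension joins a product aspect to an opinion about it (for example `("battery", "poor")`) instead of counting the words separately. The size of the gap here is an artifact of the tiny example and says nothing about the results reported in the study.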