Master's/Doctoral Thesis 974203033: Full Metadata Record

DC Field | Value | Language
DC.contributor | Department of Information Management | zh_TW
DC.creator | 余東霖 | zh_TW
DC.creator | Tung-lin Yu | en_US
dc.date.accessioned | 2010-7-14T07:39:07Z
dc.date.available | 2010-7-14T07:39:07Z
dc.date.issued | 2010
dc.identifier.uri | http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=974203033
dc.contributor.department | Department of Information Management | zh_TW
DC.description | 國立中央大學 | zh_TW
DC.description | National Central University | en_US
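dc.description.abstract | Many past studies have addressed the identification of news categories, but they focused only on the technical side, that is, on how to build more efficient or more accurate algorithms on top of existing algorithmic frameworks. They overlooked classifying news from a human point of view, namely by imitating the process that news workers actually follow when classifying news. This study therefore develops an algorithm that imitates the process experts follow when classifying news. After interviewing news workers, we found that the expert process can be roughly divided into two steps. First, they skim the news article for keywords that are representative or that can help them classify it. Second, if the keywords found cannot help them classify the article, or are not sufficiently representative within a news category, they examine the full content of the article more closely.   Following the expert knowledge and classification process we observed, this study divides the news classification algorithm into two steps. In the training phase, it first uses associative classification rules to find representative keywords for each category, and then applies clustering within each category to produce subcategories. In the testing phase, it first looks for matching classification rules; if the confidence of the matching rules is insufficient, it compares the similarity between the news article and the subcategories to find the most suitable news category. Experiments show that the proposed expert-oriented approach achieves better and more stable classification accuracy than traditional technique-oriented approaches. | zh_TW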
dc.description.abstract | The news classification problem is concerned with how to assign the correct category to unclassified news. Although a large number of past studies have addressed this problem, a common weakness of these studies is that their classification algorithms were usually designed from a technical perspective and seldom considered how experts actually classify news in practice. In this research, we first observe how media workers classify news in their daily operations, and we find that their classification process mainly consists of the following operations. (1) If some important keywords or phrases are present in the news, they directly assign the news to certain categories. (2) Otherwise, they must check the whole content of the news in detail to determine which category it belongs to. (3) Since a news category may contain several independent but related subcategories, the news is usually classified by assigning it to the most appropriate subcategory, which in turn determines its category.   By imitating this working process, we propose a news classification algorithm. In the learning phase, we use associative classification rules to find representative keywords in each category. In addition, we generate a number of subcategories by clustering the news under each category. In the classification phase, we assign unclassified news to the most appropriate category by using associative classification rules if the rules' confidence is high enough. Otherwise, we determine the category by measuring the similarity between the unclassified news and the subcategories. The experimental comparison shows that our approach has better and more stable classification performance than traditional algorithms. | en_US
DC.subject | Clustering | zh_TW
DC.subject | Associative Classification Rule | zh_TW
DC.subject | Text Mining | zh_TW
DC.subject | News Classification | zh_TW
DC.subject | Text Mining | en_US
DC.subject | News Classification | en_US
DC.subject | Clustering | en_US
DC.subject | Associative Classification Rule | en_US
DC.title | Identifying News Categories Using a Two-Phase Classification Approach | zh_TW
dc.language.iso | zh-TW | zh-TW
DC.title | Two-phase Classification Approach for Identifying News Category | en_US
DC.type | Master's/doctoral thesis | zh_TW
DC.type | thesis | en_US
DC.publisher | National Central University | en_US
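
The two-phase procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the single-keyword rules, the greedy "leader" clustering used in place of whatever clustering method the thesis applies, the cosine similarity measure, the function names, the toy data, and all thresholds (min_support, min_conf, sim_threshold, conf_threshold) are assumptions chosen only for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def mine_rules(docs, min_support=2, min_conf=0.8):
    """Learning, step 1 (simplified): rules of the form keyword -> (category, confidence)."""
    keyword_total = Counter()
    keyword_by_cat = defaultdict(Counter)
    for tokens, cat in docs:
        for t in set(tokens):
            keyword_total[t] += 1
            keyword_by_cat[t][cat] += 1
    rules = {}
    for t, total in keyword_total.items():
        cat, hits = keyword_by_cat[t].most_common(1)[0]
        if hits >= min_support and hits / total >= min_conf:
            rules[t] = (cat, hits / total)
    return rules


def cluster_subcategories(docs, sim_threshold=0.4):
    """Learning, step 2 (assumed method): greedy leader clustering within each category."""
    clusters = []  # list of (category, centroid as summed term counts)
    for tokens, cat in docs:
        vec = Counter(tokens)
        best, best_sim = None, sim_threshold
        for cluster in clusters:
            if cluster[0] == cat:
                sim = cosine(vec, cluster[1])
                if sim >= best_sim:
                    best, best_sim = cluster, sim
        if best is None:
            clusters.append((cat, Counter(vec)))  # start a new subcategory
        else:
            best[1].update(vec)  # merge the document into the subcategory
    return clusters


def classify(tokens, rules, clusters, conf_threshold=0.9):
    """Classification: use a confident rule if one matches, else fall back to similarity."""
    matched = [rules[t] for t in sorted(set(tokens)) if t in rules]
    if matched:
        cat, conf = max(matched, key=lambda rule: rule[1])
        if conf >= conf_threshold:
            return cat, "rule"
    vec = Counter(tokens)
    cat, _ = max(clusters, key=lambda cluster: cosine(vec, cluster[1]))
    return cat, "similarity"


# Toy training set: (tokens, category) pairs, invented for this example.
training = [
    (["election", "vote", "budget"], "politics"),
    (["election", "minister", "budget"], "politics"),
    (["match", "goal", "league"], "sports"),
    (["match", "coach", "league"], "sports"),
    (["budget", "stadium", "league"], "sports"),
]
rules = mine_rules(training)
clusters = cluster_subcategories(training)

print(classify(["election", "results", "tonight"], rules, clusters))
print(classify(["minister", "vote", "budget"], rules, clusters))
```

In this toy run the first article is decided by a keyword rule, while the second contains no keyword with a sufficiently confident rule and is therefore assigned by comparing it with the subcategory centroids.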
