以作者與關鍵字之多領域特性串構小世界 (Weaving the Small Worlds with the Multi-domain Property of Authors and Keywords)

DC Field	Value	Language
DC.contributor	企業管理學系	zh_TW
DC.creator	蘇育民	zh_TW
DC.creator	Yu-Min Su	en_US
dc.date.accessioned	2009-06-26T07:39:07Z
dc.date.available	2009-06-26T07:39:07Z
dc.date.issued	2009
dc.identifier.uri	http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=93441023
dc.contributor.department	企業管理學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
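dc.description.abstract	A key step in knowledge management is clustering documents or their authors. Traditional clustering algorithms aggregate similar documents or authors into several document or author groups, so that after clustering each document or author is assigned to one specific group. In many practical applications, however, the same document or author belongs to two or more groups. This dissertation aims to use co-citation data from the academic literature to develop a methodology for multi-domain clustering: it studies multi-expertise authors using author co-citation data, studies cross-domain keywords using the keyword sets of documents, and develops a cross-domain keyword recommendation system to strengthen searches of related documents or knowledge bases. The method most commonly used to group the authors of cited scientific literature is Author Co-citation Analysis (ACA), which clusters the authors of cited papers and assigns each author to one author group. Its drawback is that it considers and clusters only the first author of each cited paper, which clearly ignores the contributions of the remaining coauthors. This research proposes the Complete Author Pair (CAP) algorithm. Unlike traditional co-citation analysis, which can cluster only first authors, it introduces the concept of a Complete Author Set, clusters all coauthors of the cited papers, and can assign an author to multiple author groups, thereby identifying researchers with multi-domain expertise. According to a study of the information science literature, under the Computing Classification System (CCS) of the Association for Computing Machinery (ACM), roughly 10%-20% of scholars in the information science community write papers in multiple domains; traditional co-citation analysis cannot verify such multi-expertise, cross-domain research activity. This research builds cited-author databases for several well-known journals. To ensure the correctness of the data, records are collected manually online rather than by automated programs, so that the accuracy of the experimental results is not affected. The CAP algorithm is applied to the information science cited-author databases to test its precision and recall in identifying multi-expertise authors under different parameter values and clustering methods. Just as scholars in the research community work across domains, keywords also have multi-domain properties: under the ACM CCS, roughly 5%-15% of information science keywords are multi-domain. The method traditionally used to cluster keywords for document search is co-word analysis, but it likewise cannot assign the same keyword to different domains; in other words, co-word analysis cannot reflect the fact that keywords have multi-domain attributes. Using the previously developed Complete Set co-citation concept, this research treats the keywords of a single academic paper as a Complete Keyword Set and develops the Complete Keyword Pair (CKP) algorithm to find keywords with multi-domain properties, which are termed bridge-keywords. This research builds a database of cited-paper keywords from the JACM journal, collected entirely by hand online, to run experiments with the CKP algorithm and to test its precision and recall in finding multi-domain keywords under different parameter values. The multi-domain keyword technique is then used to develop a cross-domain keyword recommendation system that helps searchers extend their document searches into other related domains, going beyond current cluster-based recommendation systems in both academia and practice, which can recommend only within a single domain.	zh_TW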
dc.description.abstract	Grouping documents or authors into related domains is a crucial step in implementing Knowledge Management. Traditionally, each author or document is grouped into one domain only. However, in many applications, authors and documents should be grouped into multiple groups. This dissertation aims to develop a methodology for clustering data items into multiple groups based on co-reference data, namely author co-citation data banks and keyword co-reference data banks. The author co-citation analysis (ACA) method is commonly used to group authors of reference papers. Since the traditional ACA method analyzes only the first authors of reference papers, it disregards the contributions of other coauthors and can only group each first author into one cluster. This study proposes an innovative ACA algorithm called the “Complete Author Pair (CAP) algorithm”, which groups complete author sets of reference papers into clusters and thus finds authors who may have expertise in more than one area. First, the CAP algorithm is applied to a data bank of paper references collected from two IS journals during 2001-2003. The results show that the CAP algorithm can identify multi-expertise authors with 70% precision, recall, and F score when compared against the ACM CCS. The results also show that the CAP algorithm with the K-means method and the complete linkage method yields the best performance among the six clustering methods evaluated in this experiment. Second, the CAP algorithm is applied to two citation data banks of paper references collected from two ACM journals during 2002-2005. The results show that the CAP algorithm achieves up to 90% average precision in discovering multi-expertise authors in each citation bank when compared against the ACM CCS. The co-word analysis method is commonly used to cluster related keywords into the same keyword domain. However, traditional co-word analysis cannot cluster the same keyword into more than one keyword domain, and so disregards the multi-domain property of keywords. This study proposes an innovative keyword co-citation algorithm called the “Complete Keyword Pair (CKP) algorithm”, which groups complete keyword sets of reference papers into clusters and thus finds keywords belonging to more than one keyword domain. These keywords are termed bridge-keywords. A recommendation system based on CKP can recommend keywords in other domains through the bridge-keywords to help users extend their document search area. The CKP algorithm is applied to a JACM citation bank of source papers published in JACM during 2000–2006. The results show that the CKP algorithm can discover bridge-keywords with an average precision of 80% in this citation bank when compared against the ACM CCS benchmark.	en_US
DC.subject	集群	zh_TW
DC.subject	K平均集群	zh_TW
DC.subject	推薦系統	zh_TW
DC.subject	橋性關鍵字	zh_TW
DC.subject	完全關鍵字集配對演算法(CKP)	zh_TW
DC.subject	聚合層級集群	zh_TW
DC.subject	資訊科學分類系統(CCS)	zh_TW
DC.subject	同現分析	zh_TW
DC.subject	共字分析	zh_TW
DC.subject	關鍵字領域	zh_TW
DC.subject	完全作者集配對演算法(CAP)	zh_TW
DC.subject	完全集	zh_TW
DC.subject	作者共引用分析(ACA)	zh_TW
DC.subject	Recommendation systems	en_US
DC.subject	Bridge-keywords	en_US
DC.subject	Author co-citation analysis	en_US
DC.subject	Complete set	en_US
DC.subject	Complete author pair algorithm(CAP algorithm)	en_US
DC.subject	Clustering	en_US
DC.subject	K-means	en_US
DC.subject	Agglomerative Hierarchical Clustering (AHC)	en_US
DC.subject	Computing classification system (CCS)	en_US
DC.subject	Co-occurrence analysis	en_US
DC.subject	Co-w	en_US
DC.title	以作者與關鍵字之多領域特性串構小世界	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	Weaving the Small Worlds with the Multi-domain Property of Authors and Keywords	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US
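
The abstract contrasts traditional ACA, which counts only first authors of cited papers, with the complete-author-set idea, and reports precision, recall, and F score against a benchmark. The Python sketch below illustrates that contrast under stated assumptions; it is not the thesis's CAP implementation. The input format, the function names `cocitation_counts` and `precision_recall_f`, and the choice to pair coauthors of the same cited paper with each other are all hypothetical, since the record does not specify them.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, first_author_only=False):
    """Count how often two authors are cited together by the same paper.

    citing_papers: list of reference lists; each reference is a list of
    author names in byline order (hypothetical input format).
    """
    counts = Counter()
    for references in citing_papers:
        cited = set()
        for authors in references:
            # Traditional ACA keeps only the first author of each reference;
            # the complete-author-set idea keeps every coauthor.
            cited.update(authors[:1] if first_author_only else authors)
        # Assumption: every pair of distinct cited authors counts once per
        # citing paper, including coauthors of the same reference.
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


def precision_recall_f(predicted, actual):
    """Precision, recall, and F score of a predicted set against a benchmark."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    f_score = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f_score


# Made-up example: two citing papers, each with two references.
papers = [
    [["A", "B"], ["C"]],
    [["A"], ["C", "D"]],
]
first_only = cocitation_counts(papers, first_author_only=True)
complete = cocitation_counts(papers)
```

With this made-up data, the first-author-only count sees only the pair (A, C), while the complete-set count also links B and D to the other authors. Such pair counts could then be passed to a clustering method such as K-means or complete linkage, which the abstract names but this sketch does not cover.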

Thesis/Dissertation 93441023: Full Metadata Record