Using Data Mining Technology to Refine Koders Code Search Results

DC field	Value	Language
DC.contributor	資訊管理學系	zh_TW
DC.creator	廖振傑	zh_TW
DC.creator	Jhen-jie Liao	en_US
dc.date.accessioned	2009-07-13T07:39:07Z
dc.date.available	2009-07-13T07:39:07Z
dc.date.issued	2009
dc.identifier.uri	http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=964203015
dc.contributor.department	資訊管理學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	隨著開放原始碼軟體的普及與日益倍增，有愈來愈多的開放原始碼可以從網路上取得。因而興起了一種新的網路服務─程式碼搜尋。程式碼搜尋引擎提供了程式開發者一個便利的管道，幫助程式開發者快速使用一些已經存在的類別或架構所提供的應用程式介面 (Application Programming Interfaces, APIs) ，藉此提昇軟體生產效率。然而這些從網路上所取得的程式碼搜尋結果，往往無法有效的解決程式開發者的需求。主要是因為有許多相似或不相關的檔案出現於程式碼搜尋結果之中，造成程式開發者無法快速取得有用的程式碼。因此本研究提出一個改良搜尋引擎的系統架構，透過自己撰寫的網頁擷取程式將 Koders 的搜尋結果存取至資料庫當中；再透過本研究定義的資料前處理動作，進行資料清理。不只是使用關鍵字搜尋還考慮到程式的結構化特性；之後再透過資料探勘的階層演算法進行分群與重新排序，並且在每一個群集上賦予新的標籤，希冀可以使得搜尋結果更符合使用者的需求。最後本研究使用案例的方式來解釋所提出的系統架構是否可以有效改善搜尋結果，並且與相關的學術研究做比較與分析。	zh_TW
dc.description.abstract	As open source software has grown in popularity, more and more source code has become available for download over the Internet, and a new Internet service has emerged: the code search engine. A code search engine gives developers a convenient way to reuse existing Application Programming Interfaces (APIs) and improve software productivity. However, the results returned by code search engines often fail to satisfy developers’ needs, because many similar or unrelated files appear in the results and prevent developers from finding useful code quickly. We therefore propose a system architecture to improve an existing search engine. First, we develop a web program that extracts Koders’ search results and stores them in a local repository. Second, in the data preprocessing stage, we define a rule to filter out unrelated files and parse the remaining files into a database format. Third, we apply data mining algorithms to cluster and re-rank the Koders search results. Fourth, we assign a unique tag to each cluster so that the search results better match developers’ needs. Finally, we use a case study to examine whether the proposed system architecture effectively helps developers find useful source code, and we compare it with related prior research.	en_US
DC.subject	資料探勘	zh_TW
DC.subject	開放原始碼	zh_TW
DC.subject	程式碼搜尋引擎	zh_TW
DC.subject	階層演算法	zh_TW
DC.subject	群集分析	zh_TW
DC.subject	Cluster Analysis	en_US
DC.subject	Code Search Engine	en_US
DC.subject	Open Source Code	en_US
DC.subject	Data Mining	en_US
DC.title	藉由資料探勘的排序方式提昇程式碼搜尋品質─以Koders為例	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	Using Data Mining Technology to Refine Koders Code Search Results	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Thesis 964203015: full metadata record