Because software evolves rapidly, software libraries and application frameworks often lack complete documentation, and developers frequently do not know how to use specific APIs (application programming interfaces). At the same time, millions of software developers and organizations have created more than 300,000 open-source projects, and the resulting software engineering data form a rich knowledge base. Such knowledge can be used during software implementation and maintenance to improve developers’ efficiency and effectiveness. To address these issues and assist software developers, we propose the MACs (Mining API Code snippets for code reuse) approach for mining source code and the MSCS (Multi-Segment Code Search) scheme for searching source code. In MACs, we apply data mining to source code projects to guide developers through related API usage patterns of the form “Developers who code this program statement also code…”. Given a set of source code files, the mined association rules suggest related code snippets that form the components of an object-oriented program, while the mined sequential rules predict likely subsequent API sequences within a method.
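The association-rule idea described above can be illustrated with a minimal sketch. It is not the MACs implementation: it assumes each source file has already been reduced to a list of API call names, it only mines rules between single calls, and the function name `mine_pair_rules`, the thresholds, and the sample data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Mine one-to-one rules "x -> y" from lists of API calls.

    Returns sorted tuples (antecedent, consequent, support, confidence).
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for calls in transactions:
        items = sorted(set(calls))  # each call counts once per file
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Try both directions of the pair as a candidate rule.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


# Hypothetical input: the API calls observed in four source files.
files = [
    ["FileReader.new", "BufferedReader.new", "BufferedReader.readLine", "BufferedReader.close"],
    ["FileReader.new", "BufferedReader.new", "BufferedReader.readLine"],
    ["FileReader.new", "BufferedReader.new", "BufferedReader.close"],
    ["Socket.new", "Socket.close"],
]

for x, y, support, confidence in mine_pair_rules(files, 0.5, 0.9):
    print(f"developers who call {x} also call {y} "
          f"(support={support:.2f}, confidence={confidence:.2f})")
```

A rule is kept only when the pair is frequent enough (support) and the consequent follows the antecedent reliably enough (confidence), which is the standard association-rule criterion; how MACs actually sets these thresholds is not stated here.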
In MSCS, we segment source code files into three types of segments (metadata, code-data, and structural-data), then apply different stemming and stop-word filtering processes to build a multi-segment index database for subsequent searches. We conducted preliminary experiments to evaluate the usefulness of MACs and MSCS. The results show that MACs has significant potential to assist developers, especially API newcomers, and provides an alternative method for code reuse. In addition, the experimental results for MSCS indicate that our approach provides a more flexible source code search mechanism that allows more relevant items to be found.
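The multi-segment indexing idea can be sketched as follows. This is only an illustration under stated assumptions, not the MSCS implementation: the segment definitions (metadata = file name plus comments, code-data = source with comments removed, structural-data = type names after `class`, `interface`, `extends`, or `implements`), the stop-word lists, the toy stemmer, and all function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed per-segment settings: (stop words, whether to stem).
SEGMENT_CONFIG = {
    "meta": ({"the", "a", "an", "of", "to", "and", "is", "from"}, True),
    "code": ({"public", "class", "void", "return", "new", "extends", "implements"}, False),
    "struct": (set(), False),
}
COMMENT = r"//[^\n]*|/\*.*?\*/"


def split_identifier(token):
    """Split a camelCase identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", token)
    return [p.lower() for p in parts]


def toy_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def normalize(text, segment):
    """Tokenize text using the filtering rules of one segment type."""
    stop_words, use_stem = SEGMENT_CONFIG[segment]
    terms = []
    for token in re.findall(r"[A-Za-z][A-Za-z0-9]*", text):
        for word in split_identifier(token):
            if word not in stop_words:
                terms.append(toy_stem(word) if use_stem else word)
    return terms


def segment_source(file_name, source):
    """Split one Java-like file into the three assumed segments."""
    comments = re.findall(COMMENT, source, flags=re.S)
    code = re.sub(COMMENT, " ", source, flags=re.S)
    types = re.findall(r"\b(?:class|interface|extends|implements)\s+(\w+)", code)
    return {
        "meta": file_name + " " + " ".join(comments),
        "code": code,
        "struct": " ".join(types),
    }


def build_index(files):
    """Build one inverted index (term -> file names) per segment."""
    index = {seg: defaultdict(set) for seg in SEGMENT_CONFIG}
    for file_name, source in files.items():
        for seg, text in segment_source(file_name, source).items():
            for term in normalize(text, seg):
                index[seg][term].add(file_name)
    return index


def search(index, query, segments=("meta", "code", "struct")):
    """Return files where all query terms occur within one chosen segment."""
    hits = set()
    for seg in segments:
        terms = normalize(query, seg)
        if terms:
            hits |= set.intersection(*[index[seg].get(t, set()) for t in terms])
    return hits


# Hypothetical two-file corpus.
corpus = {
    "LineReader.java": "// Reads lines from a text file\n"
                       "public class LineReader extends BufferedReader {\n"
                       "  String readLine() { return null; }\n}",
    "Server.java": "/* Simple socket server */\n"
                   "public class Server {\n  void start() {}\n}",
}
index = build_index(corpus)
print(search(index, "reading lines", segments=("meta",)))
print(search(index, "BufferedReader", segments=("struct",)))
```

Keeping a separate index per segment lets a query be restricted to, say, comments only or inheritance relations only, which is one way to read the "more flexible search mechanism" claimed for MSCS.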