Abstract: | Detecting code plagiarism is important in programming courses. Current matching techniques are mainly attribute-based, structure-based, or hybrid. This study uses a matching method based on abstract syntax trees (ASTs) and encoding. We assign custom encoding symbols to the nodes of the AST and a distinct bracket symbol to each kind of code block, such as functions and loops, and we apply conditional encoding to handle various plagiarism behaviors. After this conditional processing, the encoding produced from the original code can be exactly the same as the encoding produced from the plagiarized code, which makes it possible to detect similar types, similar behaviors, and changes in position or order effectively. Finally, the algorithm used in this study computes a similarity value, which users can use to judge how likely it is that one program was plagiarized from the other. As long as the mapping between a plagiarism behavior and the code is known, our method can detect that behavior. We call this tool PASTE (Plagiarism checker by Abstract Syntax Tree and Encoding). |
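The general idea of encoding AST nodes and block brackets can be illustrated with a minimal Python sketch. It is not the PASTE implementation: the symbol table `NODE_SYMBOLS`, the bracket table `BLOCK_BRACKETS`, the sibling-sorting rule, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration, since the abstract does not specify the actual symbols, conditions, or algorithm.

```python
import ast
from difflib import SequenceMatcher

# Hypothetical symbol table (not from the paper): one letter per node type.
# Identifiers and literals get no symbol, so renaming leaves the code unchanged.
NODE_SYMBOLS = {
    ast.FunctionDef: "F",
    ast.For: "L",
    ast.While: "L",
    ast.If: "I",
    ast.Assign: "A",
    ast.AugAssign: "A",
    ast.Return: "R",
    ast.Call: "C",
    ast.BinOp: "B",
    ast.Compare: "M",
}

# Hypothetical bracket pairs: each kind of code block gets its own brackets.
BLOCK_BRACKETS = {
    ast.FunctionDef: ("{", "}"),
    ast.For: ("[", "]"),
    ast.While: ("[", "]"),
    ast.If: ("(", ")"),
}


def encode(node):
    """Encode an AST node and its descendants as a string of symbols."""
    parts = [encode(child) for child in ast.iter_child_nodes(node)]
    # Assumed conditional rule: sort sibling encodings so that reordering
    # sibling blocks or statements yields the same string. This is a crude
    # stand-in that ignores data dependencies between statements.
    parts = sorted(p for p in parts if p)
    body = "".join(parts)
    symbol = NODE_SYMBOLS.get(type(node), "")
    brackets = BLOCK_BRACKETS.get(type(node))
    if brackets:
        return symbol + brackets[0] + body + brackets[1]
    return symbol + body


def similarity(source_a, source_b):
    """Return a similarity ratio in [0, 1] between two encoded programs."""
    code_a = encode(ast.parse(source_a))
    code_b = encode(ast.parse(source_b))
    return SequenceMatcher(None, code_a, code_b).ratio()


ORIGINAL = """
def total(xs):
    s = 0
    for x in xs:
        s += x
    return s

def double(n):
    return n * 2
"""

# Same program with renamed identifiers and the two functions swapped.
DISGUISED = """
def twice(value):
    return value * 2

def add_all(items):
    result = 0
    for item in items:
        result += item
    return result
"""

UNRELATED = """
def greet(name):
    print("hi", name)
"""
```

With these assumed rules, `ORIGINAL` and `DISGUISED` produce the same encoding, so `similarity(ORIGINAL, DISGUISED)` is 1.0, while `similarity(ORIGINAL, UNRELATED)` is lower. A real checker would need finer-grained symbols and order-aware conditions than this sketch uses.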