Hello, dear readers,
at university they always told us not to cheat on the programming homework, because automatic systems would detect plagiarism anyway. The programming part of my studies was always easy for me, so I never had the need and never thought much about those systems. But there is actually quite some research going into plagiarism detection systems for code, and some super interesting ideas come into play. Today's paper is one of the most cited in that area and uses a clustering approach for the detection. Check it out and learn why you got caught cheating (if you did :P).
Efficient detection of plagiarism in programming assignments of students is of a great importance to the educational procedure. This paper presents a clustering oriented approach for facing the problem of source code plagiarism. The implemented software, called PDetect, accepts as input a set of program sources and extracts subsets (the clusters of plagiarism) such that each program within a particular subset has been derived from the same original. PDetect proposes the use of an appropriate measure for evaluating plagiarism detection performance and supports the idea of combining different plagiarism detection schemes. Furthermore, a cluster analysis is performed in order to provide information beneficial to the plagiarism detection process. PDetect is designed such that it may be easily adapted over any keyword-based programming language and it is quite beneficial when compared with earlier (state-of-the-art) plagiarism detection approaches.
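To make the idea a bit more concrete, here is a minimal sketch of what "keyword-based similarity plus clustering" could look like. This is my own toy illustration, not PDetect's actual algorithm: the keyword list, the multiset Jaccard similarity, the 0.8 threshold, and the function names are all assumptions made for the example.

```python
from __future__ import annotations

import re
from collections import Counter
from itertools import combinations

# Hypothetical keyword list (an assumption for this sketch); a real tool
# would use the full keyword set of the target language.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "float", "void", "break"}


def keyword_profile(source: str) -> Counter:
    """Count how often each keyword occurs in a program's source text."""
    tokens = re.findall(r"[A-Za-z_]\w*", source)
    return Counter(t for t in tokens if t in KEYWORDS)


def similarity(a: Counter, b: Counter) -> float:
    """Multiset Jaccard similarity of two keyword profiles (assumed measure)."""
    union = sum((a | b).values())          # element-wise maximum of counts
    if union == 0:
        return 0.0
    return sum((a & b).values()) / union   # element-wise minimum of counts


def plagiarism_clusters(sources: dict[str, str], threshold: float = 0.8) -> list[set[str]]:
    """Group programs whose pairwise similarity reaches the threshold.

    Clusters are the connected components of the "suspiciously similar"
    graph, computed with a small union-find structure.
    """
    profiles = {name: keyword_profile(src) for name, src in sources.items()}
    parent = {name: name for name in sources}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(sources, 2):
        if similarity(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in sources:
        groups.setdefault(find(name), set()).add(name)
    # Only groups with at least two programs count as plagiarism clusters.
    return [g for g in groups.values() if len(g) > 1]
```

Because only keywords are counted, renaming variables or functions does not change a program's profile, which is why a copied submission with cosmetic edits still ends up in the same cluster as its original. Calling `plagiarism_clusters` with a dict mapping file names to source strings returns the groups of files that look like they share an origin.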