Sidorkina I. G., Belousov S. A., Khukalenko K. S., Nekhoroshkova L. G. Algorithm for plagiarism detection in software source code

Published in the journal "Software systems and computational methods", 2013-3, in the rubric "Knowledge Base, Intelligent Systems, Expert Systems, Decision Support Systems", pages 268-271.

Abstract: Programming is characterized by a variety of rules, techniques, methods, and means of implementation, applied according to the qualifications, experience, and individual style of each programmer. The authors analyze different algorithms for source code plagiarism detection and the semantic noise values those algorithms yield for different source codes. The article presents an algorithm that combines several text and semantic algorithms, shows the form into which source code is converted by most modern algorithms, and describes the classes of modern algorithms for plagiarism detection in software source code. As a result, the authors present an improved plagiarism detection algorithm suggested for use in educational practice to detect plagiarism in students' work. The algorithm combines features of both text and semantic algorithms, and its computational part is highly parallelizable, which lowers execution time when sufficient computing power is available.
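The abstract does not spell out the token form or the matching coefficient, so the following is only a minimal sketch of the general token-based idea, not the authors' algorithm. It assumes a toy tokenizer that collapses identifiers to ID and numbers to NUM, and it uses the Jaccard similarity of token k-grams as a stand-in for a matching coefficient. The names KEYWORDS, tokenize, kgrams, and matching_coefficient are hypothetical and chosen for illustration.

```python
import re

# Assumption: a tiny keyword set for a C-like language, for illustration only.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "float", "void"}

# Identifiers/keywords, numeric literals, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S")


def tokenize(source):
    """Convert source text to a normalized token sequence.

    Identifiers become "ID" and numbers become "NUM", so renaming
    variables or changing constants does not alter the sequence.
    """
    tokens = []
    for lexeme in TOKEN_RE.findall(source):
        if lexeme in KEYWORDS:
            tokens.append(lexeme)
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            tokens.append("ID")
        elif lexeme[0].isdigit():
            tokens.append("NUM")
        else:
            tokens.append(lexeme)
    return tokens


def kgrams(tokens, k=3):
    """Return the set of all contiguous k-token windows."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def matching_coefficient(source_a, source_b, k=3):
    """Jaccard similarity of token k-gram sets, in the range [0, 1]."""
    grams_a = kgrams(tokenize(source_a), k)
    grams_b = kgrams(tokenize(source_b), k)
    if not grams_a and not grams_b:
        return 0.0  # both inputs too short to compare
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    original = "int sum(int a, int b) { return a + b; }"
    renamed = "int add(int x, int y) { return x + y; }"
    print(matching_coefficient(original, renamed))  # prints 1.0
```

Because each pair of submissions is scored independently of every other pair, a pairwise comparison of this kind can be distributed across workers, which is one plausible reading of the parallelization the abstract mentions.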

Keywords: plagiarism, source code, program code, token, semantics, semantic algorithms, matching coefficient, coefficient of commonality, metric, combined algorithm

DOI: 10.7256/2305-6061.2013.3.9602

This article can be downloaded freely in PDF format.

Bibliography:
Wise M.J. String similarity via greedy string tiling and running Karp-Rabin matching. // Dept. of CS, University of Sydney. December 1993.
Baxter I., Yahin A., Moura L., Anna M.S., Bier L. Clone Detection Using Abstract Syntax Trees. // Proceedings of ICSM. IEEE. 1998.
Prechelt L., Malpohl G., Philippsen M. JPlag: Finding plagiarisms among a set of programs. // Technical Report No. 1/00, University of Karlsruhe, Department of Informatics. March 2000.
Moussiades L.M., Vakali A. PDetect: A Clustering Approach for Detecting Plagiarism in Source Code Datasets. // The Computer Journal Advance Access. June 24, 2005.
Manber U. Finding similar files in a large filesystem. // Proceedings of the USENIX Winter 1994 Technical Conference. San Francisco. 1994. P. 1-10.
Huang X., Hardison R.C., Miller W. A space-efficient algorithm for local similarities. // Computer Applications in the Biosciences 6. 1990. P. 373-381.
