European Journal of Business Science and Technology 2017, 3(2):106-117 | DOI: 10.11118/ejobsat.v3i2.100
Source Code Plagiarism Detection for PHP Language
- 1 Mendel University in Brno, Czech Republic
This paper introduces a system for detecting plagiarism in source code written in PHP, implemented as part of the plagiarism detection tool Anton. The system combines tokenization and hash calculation with the greedy string tiling algorithm. Its efficiency was tested on both an artificial dataset and real data from a course taught at our university. We compare our results with those of similar systems and conclude that Anton detects all examined types of plagiarism with higher accuracy than the other systems.
Keywords: source-code plagiarism, anti-plagiarism system, PHP, Anton
JEL classification: C88, I23
Prepublished online: November 30, 2017; Published: December 31, 2017
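The combination of tokenization and greedy string tiling named in the abstract can be illustrated with a minimal sketch. It is not Anton's implementation: the regex tokenizer, the keyword set, the `min_match` threshold and the similarity formula (shared coverage divided by total token count) are assumptions chosen for illustration, and the hash calculation is omitted for brevity, so candidate matches are found by direct comparison.

```python
import re

# Assumption: a crude regex tokenizer stands in for a real PHP tokenizer.
TOKEN_RE = re.compile(r"\$\w+|\d+|\w+|[^\s\w]")
KEYWORDS = {"function", "return", "if", "else", "while", "for", "foreach", "echo"}


def tokenize(source):
    """Map PHP-like source text to coarse token classes, so that renaming
    variables or functions does not change the token sequence."""
    tokens = []
    for lexeme in TOKEN_RE.findall(source):
        if lexeme.startswith("$"):
            tokens.append("VAR")
        elif lexeme.isdigit():
            tokens.append("NUM")
        elif lexeme in KEYWORDS:
            tokens.append(lexeme.upper())
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            tokens.append("ID")
        else:
            tokens.append(lexeme)  # punctuation and operators are kept as-is
    return tokens


def greedy_string_tiling(a, b, min_match=3):
    """Return tiles (start_a, start_b, length): non-overlapping common runs
    of at least min_match tokens, always taking the longest runs first."""
    marked_a = [False] * len(a)
    marked_b = [False] * len(b)
    tiles = []
    while True:
        max_len = min_match
        matches = []
        for i in range(len(a)):
            if marked_a[i]:
                continue
            for j in range(len(b)):
                if marked_b[j]:
                    continue
                # Extend the match while tokens agree and are still unmarked.
                k = 0
                while (i + k < len(a) and j + k < len(b)
                       and a[i + k] == b[j + k]
                       and not marked_a[i + k] and not marked_b[j + k]):
                    k += 1
                if k > max_len:
                    matches, max_len = [(i, j, k)], k
                elif k == max_len:
                    matches.append((i, j, k))
        if not matches:
            break
        for i, j, k in matches:
            # Skip matches occluded by a tile placed earlier in this round.
            if any(marked_a[i:i + k]) or any(marked_b[j:j + k]):
                continue
            for offset in range(k):
                marked_a[i + offset] = True
                marked_b[j + offset] = True
            tiles.append((i, j, k))
    return tiles


def similarity(a, b, min_match=3):
    """Fraction of tokens in both sequences that are covered by tiles."""
    if not a and not b:
        return 0.0
    covered = sum(length for _, _, length in greedy_string_tiling(a, b, min_match))
    return 2 * covered / (len(a) + len(b))


original = "function sum($a, $b) { return $a + $b; }"
renamed = "function add($x, $y) { return $x + $y; }"
print(similarity(tokenize(original), tokenize(renamed)))
```

Because identifiers and variables collapse to the generic tokens `ID` and `VAR`, the two snippets yield the same token sequence and a single tile covers both of them. Hashing fixed-length token windows, as in the Karp-Rabin variant of greedy string tiling, is a common way to avoid the quadratic pairwise scan used here.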
This is an open access article distributed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0), which permits use, distribution, and reproduction in any medium, provided the original publication is properly cited. No use, distribution or reproduction is permitted which does not comply with these terms.