Hybrid System for Plagiarism Detection on A Scientific Paper

Farah K. AL-Jibory, et al.


Plagiarism detection systems are critical for identifying instances of plagiarism, particularly in the educational sector, where scientific publications and papers are concerned. Plagiarism occurs when material is copied without the author's consent or attribution. Identifying such acts requires thorough knowledge of the types and classes of plagiarism, and current tools and methodologies can detect several of them. With the advancement of information and communication technologies (ICT) and the availability of online scientific publications, access to these publications has become more convenient. Combined with the wide availability of software text editors, this has made plagiarism detection a crucial concern. Numerous scholarly articles have previously examined plagiarism detection and the two most often used resources for it, WordNet and the PAN dataset. Researchers have described verbatim plagiarism as a straightforward case of copying and pasting, and have then shed light on clever plagiarism, which is more difficult to detect because it may involve altering the original text or borrowing ideas from other studies. Other scholars have noted that plagiarism can obscure the scientific content by substituting terms, deleting or introducing material, or rearranging or changing the original publications. The proposed system combines natural language processing (NLP) and machine learning (ML) techniques with an external plagiarism detection strategy based on text mining and similarity analysis. It employs a mix of Jaccard and cosine similarity and was evaluated on the PAN-PC-11 corpus. The findings show that the proposed system outperforms previous systems on PAN-PC-11: it obtains an accuracy of 0.96, a recall of 0.86, an F-measure of 0.86, and a PlagDet score of 0.86 (0.865). The proposed technique is substantiated by a design application that is used to detect plagiarism in scientific publications and generate non-medication notifications.
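The abstract states that the system mixes Jaccard and cosine similarity but does not say how the two scores are combined. The following is a minimal sketch of one possible combination, not the authors' implementation: the word-level tokenizer, the function names, and the equal weighting (`weight=0.5`) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard_similarity(tokens_a, tokens_b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between the two term-frequency vectors."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_similarity(text_a, text_b, weight=0.5):
    """Weighted mix of the two measures; the weighting scheme is an assumption."""
    tokens_a, tokens_b = tokenize(text_a), tokenize(text_b)
    jaccard = jaccard_similarity(tokens_a, tokens_b)
    cosine = cosine_similarity(tokens_a, tokens_b)
    return weight * jaccard + (1 - weight) * cosine


if __name__ == "__main__":
    source = "Plagiarism occurs when material is copied without attribution."
    suspect = "Plagiarism happens when material is copied with no attribution."
    print(round(hybrid_similarity(source, suspect), 3))
```

Jaccard similarity ignores how often a term occurs, while cosine similarity over term frequencies does not, so a mix of the two can be less sensitive to either vocabulary overlap or repetition alone. In a full detector, a score like this would typically be compared against a tuned threshold for each pair of source and suspicious passages.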
