Balanced clustering for efficient detection of scientific plagiarism / A. Ceselli, R. Cordone, M. Cremonini. - In: 8. Cologne-Twente Workshop on Graphs and Combinatorial Optimization : CTW 09 / [edited by] S. Cafieri, A. Mucherino, G. Nannicini, F. Tarissan, L. Liberti. - Paris : Ecole polytechnique ; CNAM, 2009. - pp. 163-170. (Paper presented at the 8th Cologne-Twente Workshop on Graphs and Combinatorial Optimization (CTW), held in Paris in 2009.)

Balanced clustering for efficient detection of scientific plagiarism

A. Ceselli; R. Cordone; M. Cremonini
2009

Abstract

Duplication, co-submission and plagiarism are rising phenomena in modern scientific publishing, as both the number of peer-reviewed journals and the perceived chances of escaping detection are increasing. On the other hand, electronic indexes and new text-searching tools, such as the search engine eTBLAST, might provide an effective deterrent to unethical publication. Though manual inspection is unavoidable in the end, automatic detection might strongly reduce the work required. However, the size of online databases makes a full search impractical even for algorithmic tools. In this paper, we consider the problem of structuring a textual database so as to optimize queries for potential duplicates.
Clustering; Dynamic programming; Tabu search; Branch-and-bound
Academic discipline: INF/01 - Informatica (Computer Science)
2009
http://www.lix.polytechnique.fr/ctw09/
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/73087