Plagiarism is not new, but technology is providing new opportunities to plagiarize, from cutting and pasting directly from another source to soliciting others to complete entire assignments on the student’s behalf. Technology also provides opportunities to identify potential cases of plagiarism. Text-matching software, such as Turnitin®, can generate similarity reports indicating the original source when text has been submitted without appropriate acknowledgement. Enter the latest technology, a mechanism to deceive this text-matching software: online paraphrasing tools.
Recently, in our Health Sciences course, we have encountered student essays exhibiting language that is unidiomatic at best and incomprehensible at worst. As many of these students have English as an Additional Language (EAL), we assumed they were preparing the essays in their first language and then using online language translation tools to convert the text to English.
The serendipitous discovery of the existence of free online English-to-English paraphrasing tools exposed our naivety. It became apparent that students were taking text from other sources and ‘spinning’ it through software which automatically substituted synonyms, to the extent that no similarity to the original source could be established.
For example, the sentence:
One day while Doug was out walking, he felt lightheaded and then lost consciousness and fell to the ground.
when spun by an online paraphrasing tool would become:
One sidereal day, while Doug was out walk, he felt lightheaded and then lost knowingness and downslope to the pulverization.
Our naïve assumption, that students were converting their own text through language translation software, at least implied that the work was the product of their own intellectual effort. The use of paraphrasing tools, however, suggested a source which they wished to conceal.
This was highlighted when we encountered an essay describing Computerized Axial Tomography (CAT) scans as x-rays taken from various different angels (a misspelling of the word angles). Another essay described CAT scan images taken from various different blessed messengers. It became apparent that the first essay was the source from which the second essay had been spun through paraphrasing tools.
Absence of discipline-based terminology as a marker of plagiarism
The question then arose: can we identify markers which will differentiate text that has been subjected to a language translation tool from text that has been altered through an English-to-English synonym substitution tool?
To explore this, we selected a corpus of text which featured a significant amount of medical terminology. Standardized medical terminology is used throughout healthcare to reduce ambiguity and promote clear communication. Students are required to become immersed in this discourse, and the expectation is that these medical terms will be used by students unaltered and without synonym substitution. This text was subjected to iterative translation through Google Translate in six languages common to our EAL students, and was also spun through six free online paraphrasing tools.
The results showed that, for the 21 standard medical terms included in the text, the paraphrasing tools generated and substituted 73 synonyms, whereas Google Translate produced only 7 alternative terms. For example, when the term hospital was subjected to the paraphrasing tools, the following synonyms emerged: healing facility, doctor’s facility, healing center, mending office and sanatorium. Google Translate provided no alternative text for this term.
This study demonstrates that while online paraphrasing tools may create content that deceives text matching software such as Turnitin®, the linguistic output is, at times, unintelligible. More specifically, where expected discipline-based terminology is replaced by contextually inappropriate synonyms, this is more indicative of the use of online paraphrasing tools than language translation tools.
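As an illustration only, the marker described above could be checked with a short script along the following lines. This is a minimal sketch in Python, not the procedure used in the study: the function names are hypothetical, and the variant list holds only the five substitutes for hospital reported above. A practical list would need to cover every expected term in the discipline.

```python
import re

# Spun variants of "hospital" as reported in the text. Variant lists for
# any other expected term are not given in the source and would be an
# assumption.
SPUN_VARIANTS = {
    "hospital": [
        "healing facility",
        "doctor's facility",
        "healing center",
        "mending office",
        "sanatorium",
    ],
}


def normalize(text):
    """Lowercase the text and straighten curly apostrophes before matching."""
    return text.lower().replace("\u2019", "'")


def contains_phrase(clean_text, phrase):
    """True if the phrase occurs as whole words in already-normalized text."""
    pattern = r"\b" + re.escape(normalize(phrase)) + r"\b"
    return re.search(pattern, clean_text) is not None


def flag_spun_terms(text, variants=SPUN_VARIANTS):
    """Map each expected term to the spun variants found in the text."""
    clean = normalize(text)
    hits = {}
    for term, candidates in variants.items():
        found = [c for c in candidates if contains_phrase(clean, c)]
        if found:
            hits[term] = found
    return hits


def missing_expected_terms(text, expected):
    """List the expected discipline terms that never appear in the text."""
    clean = normalize(text)
    return [t for t in expected if not contains_phrase(clean, t)]


sample = "The patient was taken to the doctor\u2019s facility and then a sanatorium."
print(flag_spun_terms(sample))
print(missing_expected_terms(sample, ["hospital"]))
```

A hit from such a check would only be a prompt for closer reading, not proof of misconduct. The absence of an expected term together with the presence of an inappropriate substitute is the combination of interest.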