Wednesday, November 29, 2023

AI Use in Manuscript Preparation versus Originality.AI Plagiarism detector


[2311.14720] AI Use in Manuscript Preparation for Academic Journals

Nir Chemaya, Daniel Martin
The emergent abilities of Large Language Models (LLMs), which power tools like ChatGPT and Bard, have produced both excitement and worry about how AI will impact academic writing. In response to rising concerns about AI use, authors of academic publications may decide to voluntarily disclose any AI tools they use to revise their manuscripts, and journals and conferences could begin mandating disclosure and/or turn to using detection services, as many teachers have done with student writing in class settings. Given these looming possibilities, we investigate whether academics view it as necessary to report AI use in manuscript preparation and how detectors react to the use of AI in academic writing.
Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); General Economics (econ.GN)
Cite as: arXiv:2311.14720 [cs.CY]
  (or arXiv:2311.14720v1 [cs.CY] for this version)
  https://doi.org/10.48550/arXiv.2311.14720

Submission history

From: Daniel Martin
[v1] Sun, 19 Nov 2023 06:04:46 UTC (962 KB)

Summary

My initial reaction to this paper was that it was more a test of Originality.ai than a real research project. Grammar checking is an author housekeeping chore that has been automated for some time, and a lot of us don't have research assistants to do it. Using AI tools to generate verbiage from a fuzzy query is something that should be checked; grammar correction, IMHO, is not. The stigma of a false accusation of plagiarism means the charge should be made carefully. Originality.ai is aware of this and seems to be adopting the approach used in law and art to detect fakes and forgeries, i.e., verified provenance of the creation process, as described below. I used Claude.ai and ChatGPT to help evaluate the paper.

Based on the document, here is a summary of the key points about AI use in manuscript preparation for academic journals:

  1. There is excitement and concern about how emerging AI tools like ChatGPT and Bard could impact academic writing. These tools can help with editing text for things like grammar, style, clarity, etc. But there are concerns about potential mistakes, bias, and plagiarism issues.
  2. Given these concerns, academics may voluntarily disclose AI use or journals/conferences may mandate disclosure and use detection services. The authors investigate whether academics think disclosure is necessary and how detectors react to AI use in manuscripts.
  3. Key findings:
  • Academics were less likely to think fixing grammar with AI needs disclosure than rewriting text, but a detector often flagged grammar fixes as AI-written.
  • There was little difference in reporting preferences between ChatGPT and RA help, but differences with services like proofreading.
  • There were disagreements on whether rewriting with ChatGPT needs disclosure, related to perceptions of ethics, academic role, and English ability.
  4. Overall, there are open questions around what assistance needs disclosure, what types require it, and how policies can account for the limitations of detection tools. Perceptions also differed based on factors like views on ethics.

Authors

Unfortunately, the paper does not provide much background information about the authors Nir Chemaya and Daniel Martin or their previous work and institutional affiliations. Here are the few details given:

Author Details:

That is the full extent of author information included in the paper's content. The paper does not include CV-style background on the authors, their areas of expertise, or descriptions of institutions beyond the name.

There are also no references or citations to any of the authors' prior published works. So there is no information provided on what research they have conducted in the past on related or other topics.

In the acknowledgements, the authors thank various members of the UCSB Economics Department, the Economic Science Association (ESA), and the Decision Theory Forum. This suggests the authors likely have backgrounds in economics, decision science, experimental methods, and related fields.

 

What ChatGPT does to correct Grammar 

I asked ChatGPT the following: If asked to check an academic paper for grammatical errors, what would you check for?

When checking an academic paper for grammatical errors, I would focus on:

  1. Grammar and Syntax: Ensure proper sentence structure and grammar usage.

  2. Punctuation: Check for correct usage of commas, periods, semicolons, and other punctuation marks.

  3. Spelling: Verify the correctness of all words and ensure they are spelled properly.

  4. Verb Tense Consistency: Ensure consistency in the use of verb tenses throughout the paper.

  5. Subject-Verb Agreement: Confirm that subjects and verbs agree in number (singular/plural).

  6. Article Usage: Check the proper use of articles (a, an, the).

  7. Concord: Ensure agreement between different elements in a sentence (e.g., pronouns and antecedents).

  8. Sentence Fragments and Run-On Sentences: Identify and correct any incomplete sentences or run-on sentences.

  9. Word Choice: Verify that the chosen words convey the intended meaning accurately.

  10. Capitalization: Ensure proper capitalization of names, titles, and the beginning of sentences.

  11. Quotation Marks: Confirm the correct use of quotation marks for direct speech and citations.

  12. Consistent Style: Check for consistent use of formatting, citation styles, and other academic writing conventions.

Remember, thorough proofreading involves multiple passes to catch different types of errors.

What did the Authors of this paper do?

The authors tested an AI detection service's ability to flag text that was revised by ChatGPT. Here are the key details on how they evaluated ChatGPT usage:

  1. Collected 2,716 abstracts of papers published in the journal Management Science over the past 10 years.
  2. Used GPT-3.5 (very similar to ChatGPT) to revise the abstracts using prompts from a GitHub guide on using ChatGPT for academic writing. For example, the prompts asked GPT-3.5 to "correct the grammar" or "rewrite this paragraph in an academic language".
  3. Ran the original and GPT-3.5 revised abstracts through a commercial AI detector called Originality.ai. This tool gives each text an "AI score" estimating the likelihood it was AI-generated.
  4. Compared the AI scores for original vs revised abstracts. The revised versions scored much higher, indicating the detector successfully identified changes made by GPT-3.5.
  5. Examined whether detection scores differed between grammar corrections vs full rewrites. Grammar fixes actually scored higher than rewrites in some cases.

So in summary, they didn't check for actual ChatGPT usage in the wild. Instead, they tested whether an AI detector would flag text that was edited by an LLM very similar to ChatGPT. The high AI scores for the revised text indicate that the detector picked up the changes made by AI.
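The comparison in steps 3 and 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores below are made-up values, not data from the paper, and the 0.5 flagging threshold and the `summarize` helper are hypothetical choices rather than anything Originality.ai or the authors specify.

```python
from statistics import mean

# Hypothetical AI scores in [0, 1]; illustrative values only, NOT data from the paper.
original_scores = [0.02, 0.10, 0.05, 0.30, 0.01]
revised_scores = [0.85, 0.99, 0.60, 0.97, 0.40]


def summarize(scores, threshold=0.5):
    """Return (mean score, share of texts at or above the threshold).

    The 0.5 threshold is an assumption made for this sketch.
    """
    flagged = sum(1 for s in scores if s >= threshold)
    return mean(scores), flagged / len(scores)


orig_mean, orig_flagged = summarize(original_scores)
rev_mean, rev_flagged = summarize(revised_scores)

print(f"original: mean={orig_mean:.3f}, flagged={orig_flagged:.0%}")
print(f"revised:  mean={rev_mean:.3f}, flagged={rev_flagged:.0%}")
```

A gap between the two summaries is what "the revised versions scored much higher" means in practice; it says nothing by itself about whether a given flag is fair to the author.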

What did Writers do before ChatGPT?

The paper did not directly compare the detection of AI use between ChatGPT and Microsoft Word or any of the third party grammar correctors. The detection analysis focused solely on using GPT-3.5 to either fix grammar or rewrite abstract text from academic papers.

Specifically, the authors found that when GPT-3.5 was used to fix grammar in abstracts, the resulting text was often flagged by the AI detector (Originality.ai) as having a high probability of being AI-generated. The 75th percentile AI score was very close to 100%, indicating the detector strongly believed many of those grammar-corrected abstracts were written by AI.
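To make the "75th percentile" statement concrete, here is a minimal sketch using Python's standard library. The scores are hypothetical and chosen only to show the calculation; they are not the paper's data, and the paper may compute percentiles with a different interpolation rule.

```python
from statistics import quantiles

# Hypothetical AI scores for grammar-corrected abstracts (illustrative only).
grammar_fix_scores = [0.12, 0.35, 0.58, 0.80, 0.95, 0.99, 1.00, 1.00]

# quantiles(..., n=4) returns the three quartile cut points; the last is the 75th percentile.
q1, median, q3 = quantiles(grammar_fix_scores, n=4, method="inclusive")

print(f"75th percentile AI score: {q3:.2%}")
```

A 75th percentile near 100% means that at least a quarter of the texts received a score at or above that level.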

The authors did not test Microsoft Word's grammar correction feature. So unfortunately we don't know from this paper whether abstracts with grammar fixes from Word would be detected at the same rate as those corrected by GPT-3.5. The paper does show academics believed Word usage needed less disclosure than ChatGPT, but no direct detection comparison was made.

In summary, the key result was that an AI detector often flagged text that used GPT-3.5 for grammar fixes as likely AI-written, but we don't know if Microsoft Word would have produced similar detection rates or not. The paper did not analyze Word usage through the AI detector.


Microsoft Word had Grammar Checking

Microsoft Word has a built-in grammar checker that can help you identify and correct errors in your writing. To use the grammar checker, click the Review tab and then click Spelling & Grammar. Word will check your document for spelling and grammar errors and will display any errors that it finds in the Spelling & Grammar dialog box.

Here are some of the things that the grammar checker can check for:

  • Subject-verb agreement
  • Verb tense consistency
  • Pronoun agreement
  • Adjective and adverb placement
  • Preposition usage
  • Sentence structure

The grammar checker can also provide suggestions for improving the clarity and conciseness of your writing.

To fix an error, click the suggested correction in the Spelling & Grammar dialog box. You can also click Ignore to skip an error, or click Add to Dictionary to add a word to the dictionary so that Word will not flag it as an error in the future.

In addition to the built-in grammar checker, there are a number of third-party grammar checkers available for Microsoft Word. These checkers can provide additional features, such as the ability to check for style consistency and to suggest alternative phrasing.

Here are some of the most popular third-party grammar checkers for Microsoft Word:

  • Grammarly: Grammarly is a popular grammar checker that is available as a free web app and as a paid add-in for Microsoft Word. Grammarly can check for a wide range of grammar and style errors, and it can also suggest alternative phrasing.
  • ProWritingAid: ProWritingAid is another popular grammar checker that is available as a paid add-in for Microsoft Word. ProWritingAid is a more powerful grammar checker than Grammarly, and it offers a number of features that are not available in Grammarly, such as the ability to check for plagiarism and to identify overused words.
  • WhiteSmoke: WhiteSmoke is a comprehensive grammar checker that is available as a paid standalone program. WhiteSmoke is a powerful grammar checker that offers a number of features that are not available in other grammar checkers, such as the ability to translate your writing into other languages and to check for plagiarism.

Whether you use the built-in grammar checker or a third-party grammar checker, using a grammar checker can help you to improve the quality of your writing.

What does Originality.AI do? 

This tool has been independently reviewed favorably. Originality.AI is a no-frills, user-friendly option for AI and plagiarism detection. And it’s good at it. Since it was created and weaned to detect GPT-3, it might be biased to think that no mere human could write as perfectly as a machine. Somewhat like the Wuhan Institute of Virology, since they invented the disease of false AI accusation, they are now selling the cure to authors.

[following from the Originality.AI website]

Jon Gillham founded Originality.AI after selling a content marketing agency he had previously founded.

Originality.AI launched in November 2022 to be able to specifically detect GPT-3 generated content (before ChatGPT had launched).

Having been one of the earliest adopters of generative AI content for SEO purposes at scale through his agency, he understood the wave that was coming, which Chat GPT and GPT-4 have fully unleashed.

The world needs reliable AI detection tools, but no AI detection tool is ever going to be 100% perfect. Users should understand the individual limitations of these tools so that they can wield them responsibly, which means the developers of AI detectors should be as transparent as possible about the capabilities and limitations of their detectors.

That’s why we here at Originality have made this guide, which aims to provide you with an answer to the question of: What AI content detector is the most accurate? Additionally, we are proposing a standard for testing AI detector effectiveness, along with the release of an Open Source tool to help increase the transparency and accountability with all AI content detectors.

We hope to achieve this idealistic goal by…

  1. Open-sourcing a benchmark dataset to help researchers identify AI detection effectiveness
  2. Open-sourcing a research tool we developed to assist anyone (researcher, journalist, customer or other AI detector) in testing multiple AI detectors on their own (or our) benchmark dataset. 
  3. Providing detailed instructions and including the calculation in the tool to help identify the most important AI vs Original Human classifier efficacy metrics.
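For readers who want to see what "classifier efficacy metrics" usually means, the sketch below computes the standard confusion-matrix metrics for an AI-vs-human detector. It is not Originality.AI's open-source tool; the `detector_metrics` function and the toy labels are hypothetical and exist only to illustrate the arithmetic.

```python
def detector_metrics(labels, predictions):
    """Compute basic metrics; True means AI-written, False means human-written."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)          # AI text flagged as AI
    tn = sum(1 for y, p in pairs if not y and not p)  # human text passed as human
    fp = sum(1 for y, p in pairs if not y and p)      # human text wrongly flagged
    fn = sum(1 for y, p in pairs if y and not p)      # AI text missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Toy example: four AI-written and four human-written texts (made-up labels).
labels = [True, True, True, True, False, False, False, False]
predictions = [True, True, True, False, True, False, False, False]

metrics = detector_metrics(labels, predictions)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

For writers, the false positive rate is the number that matters most, since it is the share of genuinely human text that gets flagged as AI.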

If you have been asked or want to evaluate an AI content detector's potential use case for your organization this article is for you. 

This guide will help you understand AI detectors and their limitations by showing you…

  • How AI detectors work
  • How to calculate AI detector's effectiveness
  • How to complete your own tests (using one of the open-sourced tools we provide)
  • What we think should and should not be considered AI content
  • How accurate our AI content detector is based on the testing we have done
  • If you can trust our AI detector's effectiveness
  • How all AI detectors stack up in terms of effectiveness by type of content

If you have any questions, suggestions, research questions or potential commercial use cases please contact us.

TLDR:

●    Originality.AI Launches Version 2.0 improving from the previous Version 1.4

  • 4.3% Improvement in Detection Accuracy
  • 14.1% Decrease in False Positives (2.97% to 2.55%)
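Note that the 14.1% figure is a relative decrease, not a drop of 14.1 percentage points. A quick check of the arithmetic, using only the two rates quoted above (the `relative_change` helper is just a name chosen for this sketch):

```python
def relative_change(old, new):
    """Relative change from old to new, as a fraction of old."""
    return (new - old) / old


old_fp_rate = 2.97  # false positive rate of Version 1.4, in percent
new_fp_rate = 2.55  # false positive rate of Version 2.0, in percent

fp_change = relative_change(old_fp_rate, new_fp_rate)
point_drop = old_fp_rate - new_fp_rate

print(f"relative change: {fp_change:.1%}")          # -14.1%
print(f"absolute drop: {point_drop:.2f} points")    # 0.42 points
```

In absolute terms the claimed improvement is 0.42 percentage points, so roughly 2.5 of every 100 human-written texts would still be flagged under the quoted rate.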

●    Across 4 Datasets Originality’s Latest Version Was the Most Accurate & Effective Detector in Each Test

  • 6 AI Content Detectors Were Tested on a new Challenging Benchmark Dataset
  • 4 AI Content Detectors (who have an API) were Tested on another Open Source Dataset
  • Originality.AI 2.0 was tested against all available Open Source Benchmark datasets

●    Open Source Tool and Benchmark Dataset for Efficient Detector Testing Developed and Released 

Originality.ai for the Writer - No More False Positives

As a writer, generative AI models like Chat GPT are both a blessing and a curse. At Originality.AI we believe in the transparent use of both AI writing tools and AI content detectors. The use of AI content detectors needs to be balanced with tools that can help ensure everyone (including writers, editors, agencies and clients) avoids AI content detector false positives.

  • Prove False Positives are Wrong - With our free Google Chrome extension you can prove (with no extra effort) that you truly wrote the article by allowing clients to visualize the creation process. 
  • Shared Confidence With Clients - Originality.AI is the most accurate AI detector and you can easily share (via a link) the results of an AI content detection scan.
  • More Time Writing and Less Time Arguing - All of the features of Originality.AI are geared to ensuring there is trust (accurate AI content detection, report sharing, visualize the creation process) with you and your clients getting you paid faster.
  • Use the results of our in depth readability study to optimize every piece of content you create to the ideal readability score.
  • Fact Checking Aid to make your fact checking process quicker and more accurate.

 
