Teachers and educational institutions have racked their brains trying to determine how to identify AI text in their students’ assignments, but we have yet to arrive at a reasonable solution. From research papers to news articles, embarrassing generative AI faux pas have come to light; however, those are only the most obvious instances. It’s likely that many more AI-generated texts have gone unnoticed and unverified. Some tools for identifying AI-generated text are available online, but feed them your own writing and you’ll see how unreliable they can sometimes be.

Regardless of how smart artificial intelligence becomes, detecting AI text and separating it from the products of human thought and skill is essential if the general public is to become more accepting of AI. Certain sources have hypothesized that they’ve been able to track common clues that set AI-generated text apart, but is the approach foolproof? Not yet.

Detecting AI text

Image: Pexels

Can We Identify AI Text? Researchers Delve Into the Issue

An article by Ars Technica took on the topic of detecting AI text by exploring a scientific paper from researchers at the University of Tübingen and Northwestern University. The paper found that since 2023, when LLMs truly became a popular tool, the use of certain words skyrocketed in a way that didn’t follow the natural growth patterns observed in the years before. The word “delves” appeared 25 times more often in 2024 paper abstracts than in pre-LLM research reports. “Showcasing” and “underscores” were other words that stood out in the study.
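To get a feel for the kind of frequency analysis the paper describes, here is a minimal sketch. The marker words are taken from the study’s examples, but the function name, the per-1,000-words metric, and the toy abstracts are illustrative assumptions, not the researchers’ actual methodology or data:

```python
from collections import Counter
import re

# Candidate marker words drawn from the study's examples.
MARKERS = {"delves", "showcasing", "underscores"}

def marker_rate(text: str) -> float:
    """Return marker-word occurrences per 1,000 words (toy metric)."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    hits = sum(counts[m] for m in MARKERS)
    return 1000 * hits / len(words)

# Two invented abstracts, for illustration only:
pre_llm = "We study the effect of noise on training."
post_llm = ("This paper delves into noise, showcasing results "
            "that underscores robustness.")
```

Comparing the rates across large pre- and post-LLM corpora, rather than single documents, is what lets the researchers spot a statistically unusual spike.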

If you’re suddenly caught by the fear that you write like an AI bot, you’re not alone. The words highlighted in the study are not “unusual” by any means, especially not in the context of a research paper; however, if they come to stand out as markers of AI, you might be forced to update your vocabulary. The risk of false positives is too great to ignore in a world that’s growing increasingly enraged about the use of AI.

Many online ChatGPT detection tools claim to be able to identify AI text, but they’re not accurate 100 percent of the time. ZDNET conducted its own investigation into some of these AI detectors, and while the tools performed better than expected, their accuracy wasn’t consistent enough to rely on them to separate human-written content from AI.

Another advancement in detecting AI text is Columbia Engineering’s Raidar (geneRative AI Detection via Rewriting) tool. According to the researchers, AI models are more willing to rewrite human writing than text produced by another AI. By running a piece of writing through the tool, one can predict that it was written by AI when the model makes only minimal attempts to edit it. This is a fascinating approach to detecting AI text, and we’ll have to keep watching to see how the theory evolves.

Google Takes on the Challenge of AI-generated Text and Video Identification

Google recently announced its revitalized SynthID watermarking system, which can mark AI-generated text and video with an invisible watermark. Hunting for clues within AI-generated text is risky business, as these AI tools will only get more refined in the kind of content they are able to generate. They may come to imitate human writing well enough that looking for marker words within the text could leave us with misleading results.

Instead, if watermarks or identifiers are embedded into the content in some way, it becomes much easier to tell when content is created by AI. Transparency is essential going forward, and we need more ways to set clear boundaries between what is and isn’t AI.

Should We Create More Tools for Identifying AI-Generated Text?

The more serious we grow about detecting AI text and picking up on AI-generated text clues, the more clever AI users will become about evading detection, slowly eliminating those telltale words from their generated material. Similarly, the more aware we are of AI’s flaws, the faster AI developers will work to sprinkle their models with the “human touch.” Despite the circular nature of the problem, we do have to get more serious about determining how to identify AI texts, videos, and images.

Generative AI text models can be useful tools for those who need language support to get their ideas across more precisely, but the problem arises when the tools are used with abandon. Users who don’t verify or fact-check the generated material are likely to spread false information and take credit for work they never did. This could become a serious problem down the line. Writers and researchers with genuine material may feel forced to turn to AI to generate content at a much quicker pace, impeding real creativity and the desire for knowledge.

We do need to be able to identify AI-generated text as accurately as possible, but until we have a model that works correctly nearly 100 percent of the time, we should use these detectors sparingly so that members of the public are not wrongfully accused of being overly reliant on AI. At the same time, those who rely on AI-generated text should exercise the utmost caution in verifying the veracity of the content and adding their own touch to the material every time.