Most people would like to think that they can easily distinguish between human-written and AI-generated text.
Surely there should be a notable difference between writing that a human has put time, care, and thought into and something that was generated in seconds by a chatbot.
However, research suggests that people often overestimate their ability to identify AI-generated text. A 2024 study found that teachers felt confident in their ability to detect AI-generated work, yet accurately identified it less than half the time. Furthermore, research from Cornell University indicates that human detection accuracy declines as newer models are introduced, suggesting the problem will only worsen as more advanced AI systems are released. Even more concerning, in one Turing test study, GPT-4 was judged to be human 54% of the time.
One way of checking whether a piece of writing is AI-generated is to use AI detectors, which are increasingly common in academic settings.
However, AI detectors often produce high rates of false positives and can be biased, disproportionately misidentifying the work of non-native English speakers as AI-generated. The underlying issue is perplexity, a measure of how predictable a piece of text is to a language model. Large language models typically produce low-perplexity text, so writers who favour common vocabulary and conventional sentence structures, as many non-native speakers do, are at greater risk of being flagged for AI use.
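To make the perplexity idea concrete, the following sketch scores text with the openly available GPT-2 model via the Hugging Face transformers library. It illustrates the general principle only, not the method used by any commercial detector, and the example sentences are invented:

```python
# A minimal sketch of perplexity scoring, for illustration only.
# Lower perplexity means the model found the text more predictable,
# the property that puts formulaic (but entirely human) prose at risk.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Return GPT-2's perplexity for a piece of text."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels == input_ids, the model returns the average
        # cross-entropy loss over the sequence; exp(loss) is perplexity.
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

# Stock phrasing scores lower (more "AI-like") than idiosyncratic prose.
print(perplexity("It is important to note that AI is rapidly developing."))
print(perplexity("My grandmother salts her porridge with anchovy brine."))
```

A detector built on this signal will tend to penalise plain, conventional writing regardless of who produced it.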
False positives of this kind could have serious consequences for students and academics, and suggest that we should not rely too heavily on AI detectors to verify whether text is human-written. Moreover, detectors can be bypassed: editing AI-generated content or prompting AI models to use more literary language can reduce the likelihood of detection.

Why is this a problem?
Because people struggle to identify AI-generated work, it easily passes unnoticed by average readers. Originality.ai found that approximately 11% of articles posted on the official blogs of Fortune 500 companies are potentially AI-generated.
Additionally, in August 2023, NewsGuard identified 37 websites that used chatbots to rewrite articles originally published by news outlets such as The New York Times, CNN, and Reuters, without crediting the original sources. NewsGuard has also uncovered 1,254 AI-generated news and information sites that appear to operate with little to no human oversight. An even more worrying report from Europol suggested that “as much as 90 percent of online content may be synthetically generated by 2026.”
This has various implications. Generative AI models are trained using existing works, which raises serious copyright concerns. Several parties, including the Authors Guild and notable authors such as George R. R. Martin, are currently suing OpenAI, the creator of ChatGPT, for copyright infringement. A group of eight newspapers is also taking legal action against OpenAI and Microsoft for using millions of copyrighted news articles to train their AI models without permission or compensation.
Readers browsing these websites may not realise that the content they are consuming is AI-generated or stolen from another source. These articles may lack proper fact-checking, and given that AI frequently hallucinates, there is a high chance that the information provided in such articles could be misleading or incorrect. This will only exacerbate the spread of misinformation online.
The issue extends beyond just readers; writers must also be cautious about their use of AI. Overreliance on AI tools can hinder the development of critical thinking, memory, and language skills. A study by researchers at MIT found that people who relied on ChatGPT to write their essays exhibited lower brain activity than those who wrote essays without assistance. Furthermore, the group that used AI performed worse and struggled when they were required to complete tasks without using AI. This raises concerns about the long-term effects of AI usage, not just in schools and the workplace, but also in everyday life. There are many tasks that we can offload onto AI, but thinking should not be one of them.
Signs of AI-generated writing
As the internet becomes increasingly saturated with covertly AI-generated content, critical reading skills will be essential for determining what has been produced artificially. Beyond a generally bland tone, there are several recognisable signs that a piece of writing may be AI-generated.
One indication of AI-generated work is an overuse of the em dash. While the em dash is a versatile punctuation mark, it should be used sparingly for maximum effect. However, AI chatbots tend to use the em dash prolifically—to the point where it has been dubbed the “ChatGPT dash.”
In fact, readers have become so wary of this punctuation mark that people have been accused of passing off AI-generated content as their own based solely on the presence of a single em dash.
AI models also overuse tricolons, a rhetorical device consisting of a series of three parallel words, phrases, or clauses. For example, the sentence “Detecting AI writing is like searching for a needle in a haystack, where clarity, originality, and truth often get lost in the chaos” is itself AI-generated and demonstrates a structure that AI tends to employ.
This structure becomes quite obvious once you have noticed it, as AI models favour lists of three in general. A single tricolon does not guarantee that something was AI-written, but a proliferation of the structure strongly suggests AI involvement. Furthermore, while the metaphors in these constructions may seem coherent at first glance, they often lack any real meaning or nuance.
Certain words and phrases are also heavily favoured by generative AI models. Commonly used words include “delve,” “dive,” “discover,” “realm,” “tapestry,” “robust,” “crucial,” “ultimately,” “utilise,” “leverage,” and “quinoa.” Frequently used phrases include “in today’s world,” “it is important to note that,” “only time will tell,” “rapidly developing,” and “significantly enhances.” Excessive use of these words and phrases can suggest that text was generated by AI.
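As an illustration of how crude these surface-level checks are, here is a naive flag counter that tallies suspect terms, em dashes, and simple “X, Y, and Z” tricolon patterns. The term list and scoring are invented for demonstration; a heuristic like this would misfire on plenty of genuinely human writing:

```python
import re

# Terms commonly associated with AI-generated text, drawn from the
# lists above. The selection and the scoring are illustrative only.
FLAGGED_TERMS = [
    "delve", "tapestry", "realm", "robust", "crucial",
    "in today's world", "it is important to note that",
    "only time will tell", "rapidly developing",
]
TRICOLON = re.compile(r"\b\w+, \w+, and \w+\b")  # crude "X, Y, and Z" match

def flag_score(text: str) -> float:
    """Count flagged features per 1,000 words of text."""
    lowered = text.lower()
    hits = sum(lowered.count(term) for term in FLAGGED_TERMS)
    hits += text.count("\u2014")          # em dash, the "ChatGPT dash"
    hits += len(TRICOLON.findall(text))   # lists of three
    n_words = max(len(re.findall(r"\w+", text)), 1)
    return 1000 * hits / n_words

sample = ("In today's world, AI is rapidly developing. It is important to "
          "note that clarity, originality, and truth often get lost "
          "\u2014 only time will tell.")
print(f"{flag_score(sample):.0f} flagged features per 1,000 words")
```

Any writer who simply happens to like these words and devices would trip the same tripwires.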
Indeed, awareness of these markers has led to a tendency for people to write off any piece of work containing these words (especially “delve”) as AI-generated. This can be counterproductive, as some writers genuinely just use these words. Moreover, as people become more aware of these words and phrases, AI users can simply remove them after generating text, rendering this sign essentially useless.
Can we ever be truly sure whether something is AI-generated?
It is easy to spot some of the above signs in a piece of text and automatically assume that it was generated using AI. However, it is not always that simple. AI models are trained on existing work, which means their repetition of certain words, phrases, structures, and punctuation is partially a reflection of writers’ preferences for them in the first place.
Reducing written work to various indicators of AI-generated content, such as the so-called “ChatGPT dash,” undermines the author’s efforts and intentions.
Writers have always used these devices—that is why AI models regurgitate them. Consequently, genuine human-written work may be mistakenly accused of being AI-generated. This could foster a general mistrust of writing, making it difficult for people to discern whether a piece was created by a human or AI, especially online, where so many articles are published anonymously.
Therefore, while it is important to be aware that some of the content we encounter may be AI-generated and potentially less trustworthy, we cannot just accuse everyone of using AI based on the presence of certain words or punctuation. Similarly, writers and students should try to avoid overly relying on AI, as this may negatively impact their writing skills in the long term.