
Last updated: December 3, 2024

DeepL vs SurferSEO: Best AI Humanizers Compared | Do They Really Work?

Radu Diaconu

Article at-a-glance

– Humanizers primarily work by making content worse by adding minor mistakes.

– SurferSEO reduces AI detection significantly: 16% AI probability on GPTZero, 56% on ZeroGPT (compared to 66% and 84% for DeepL);

– DeepL has higher AI detection scores: 66% on GPTZero and 84% on ZeroGPT, but produces cleaner, more professional writing than SurferSEO;

– Correcting SurferSEO copy increases AI scores: 52% on GPTZero and 69% on ZeroGPT after fixing language issues;

– ABBA AI-ism removal from DeepL lowers GPTZero scores: from 63% to 41%, but has minimal impact on ZeroGPT;

– If you’re blogging and want to beat AI checkers without quality concerns, use SurferSEO unedited; for cleaner copy, improve SurferSEO with basic editing (it’ll still give you lower AI scores);

– Human copy has the highest chance of being identified as non-AI by both testers, but AI testing is still highly unreliable.

– A good compromise is to use a mix of AI drafting and human editing and rewriting – that will give you lower overall AI scores and decent copy quality. But aiming to lower AI scores for the sake of it is not a good idea due to their inherent inaccuracy.

One day robots may want to become humans, but right now companies are trying to fool AI detectors with tools that make your AI-created text look like it was written by a human.

AI humanizers claim to make your AI copy pass AI tests – essentially cheating the AI checkers.

But are they successful? Do they really fool the testers? Is the resulting copy any good? Is there any way to further improve the copy?

Answers to those questions (and more) incoming… brace yourselves.

Here’s what we did to discover the effectiveness of content ‘humanizers’:

Generated outrageously AI-ish blog copy and had our team rewrite it from scratch (to create two control groups – one 100% AI, the other 0% AI); our GPTZero vs ZeroGPT test describes how we did that in a bit more detail;
Ran each AI sample through DeepL and SurferSEO to produce two “humanized” variants;
Tested both control samples and “humanized” copies against GPTZero and ZeroGPT (two of the most popular GPT testers out there);
Averaged out the scores and compared;
Analyzed the humanized output, and ran it through Grammarly and our own ABBA AI detection tool to see if it’s any better.
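The averaging step in the list above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the process behind the spreadsheets: the function name `average_scores` and the record layout are hypothetical, and the numbers in the usage example are placeholders rather than values from our tests.

```python
from collections import defaultdict
from statistics import mean


def average_scores(records):
    """Average AI-probability scores per (variant, detector) pair.

    `records` is an iterable of (variant, detector, score) tuples, where
    score is the detector's AI probability in percent. This layout is a
    hypothetical one chosen for the sketch.
    """
    grouped = defaultdict(list)
    for variant, detector, score in records:
        grouped[(variant, detector)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


# Placeholder scores for illustration only (not real test results).
example = [
    ("humanizer_a", "detector_x", 10.0),
    ("humanizer_a", "detector_x", 30.0),
    ("humanizer_b", "detector_x", 60.0),
    ("humanizer_b", "detector_x", 80.0),
]
averages = average_scores(example)
```

Grouping by both variant and detector keeps the two testers separate, which matters here because GPTZero and ZeroGPT score the same copy very differently.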

Here’s the raw data, open access so anyone can play with it on their own:

Casual blog tests: https://docs.google.com/spreadsheets/d/1d2f0dVIVivRNFdKrpD6P9jEXEg5HWEvEs7HsdKMaqHo/edit?usp=sharing
Non-blog human content tests: https://docs.google.com/spreadsheets/d/1N5-gICyW6tstau5ivY6mtLSFINu3Q0TjovI2yWszciY/edit?gid=0#gid=0

And here’s what we found out…

GPTZero and ZeroGPT scores for AI-humanized copy with SurferSEO and DeepL.

“To bypass AI detection, making the content worse with minor mistakes is the go-to strategy. Using AI detectors & “humanizing” content is generally a bad idea if your goal is quality.”

1. Both humanizers lower AI scores, but SurferSEO is better at that than DeepL

SurferSEO scored just 16% AI on GPTZero, against DeepL’s 66%, which is lower than the control but still relatively high.
Against the stricter, more accurate ZeroGPT, SurferSEO still managed a respectable 56% AI probability on average (vs a much poorer 84% for DeepL).
To put things into perspective, SurferSEO scored lower on ZeroGPT than DeepL did on GPTZero, which is the more lenient tester in general. Fricking nice work, SurferSEO!

2. SurferSEO copy quality is slightly worse, with intentional basic mistakes

Our editors looked at the two versions of the copy and concluded that SurferSEO performs worse than DeepL in terms of sheer writing quality, but the issues are mostly minor. Here are some examples and a breakdown:

Awkward wording: “Today AI algorithms can diagnose cancer and heart disease with a precision that often beats the human.” – say what? Let’s keep this a violence-free space.
Confusing sentences: “Interlink your posts to guide readers to related content on your blog and search engines to understand your site structure” – it’s difficult to understand what “and search engines to understand your site structure” refers to (the entire sentence is painful to read overall); “and help search engines” would have been clearer.

Some frequent issues:

Useless repetitions: “After cleansing apply a moisturiser to lock in moisture especially in dry or cold weather”;
- Or how about this one: “Artificial intelligence is making healthcare better, more efficient, more personal and more accessible. [2 sentences later:] And AI powered platforms are making treatments more personal.” – I kid you not, it takes some smart prompting to get ChatGPT to be that repetitive
Sheer nonsense: “Sun protection is year round; a daily SPF of 30 or more will protect your skin from UV rays” (the idea was that sun protection should be worn year-round… not that you’re protected by default against the sun, year-round!)
Lack of essential commas: “Schedule in learning whether through reading or educational games and then breaks for physical activity to burn off energy and focus.” This might seem like a small thing, but leaving a sentence that long without any commas is a beginner-level, amateur mistake. Just try reading it aloud… if you’re not breathless, your fitness is at least above average.
- Another nice one: “It can be hard to fit in fitness when you’re busy but there are a few simple exercises you can do. First try bodyweight squats.” If you read “First try” as in “the first try”, you’re not alone – that’s a perfectly normal response to the lack of comma after “First”.

Complete disregard for essential hyphens: “AI driven wearables can monitor patients at home and send alerts to healthcare providers if something is wrong.”; “Cash based spending on groceries and entertainment can also keep you aware of what you’re spending.”; “Assign age appropriate chores to teach responsibility.”
- Sure, it’s perfectly fine to omit them – if you want the copy to be a bit more confusing and break minor grammar rules to pass AI detection.

Ungrammatical sentences. Like this one: “Then a bedtime routine with winding down activities like story time or gentle music”.

If that feels incomplete, it’s because it is incomplete – nothing is happening in that sentence; there are no verbs. Verbs turn strings of words into sentences, and sentences without verbs are either the mark of a literary genius (not the case here), or of a poor writer (not the case here either). Here it’s simply a matter of SurferSEO apparently introducing intentional mistakes to fool AI testers… not cool!

More ungrammatical stuff: “Push-ups are another good one that works your chest, shoulders and arms and engages your core.” – Push-ups are ONE that does this and that… and the bad parallel structure is just the icing on the cake (it should be “works your chest, shoulders, and arms, and engages your core”, but even that’s not good – keeping it simpler with “a good workout for the core, chest, shoulders, and arms” is much better).

When you fix the mistakes, it worsens the AI score!

We believe the mistakes are built into the platform. We’re fairly certain of that because we corrected them and ran the copy through the checkers once more, and what do you know, the AI scores jumped significantly:

When SurferSEO copy was corrected, GPTZero rated its AI probability at 52% (vs 16% for the uncorrected version), while ZeroGPT showed a smaller increase: 69% AI probability for the corrected copy, up from an initial 56% (keep in mind ZeroGPT was more accurate in the first place);
Notably, these scores are still better than DeepL’s (52% and 69% for SurferSEO vs 66% and 84% for DeepL).
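To make the comparison concrete, the arithmetic behind those two points can be worked through in a short Python sketch. The averages are the ones reported in this article; the dictionary layout and the helper name `point_gap` are assumptions made for illustration.

```python
# Average AI-probability scores (percent) as reported in this article.
reported = {
    "SurferSEO (unedited)": {"GPTZero": 16, "ZeroGPT": 56},
    "SurferSEO (corrected)": {"GPTZero": 52, "ZeroGPT": 69},
    "DeepL": {"GPTZero": 66, "ZeroGPT": 84},
}


def point_gap(a, b, detector):
    """Percentage-point difference: score of `a` minus score of `b`."""
    return reported[a][detector] - reported[b][detector]


for detector in ("GPTZero", "ZeroGPT"):
    # How much the AI score rises once the language issues are fixed.
    rise = point_gap("SurferSEO (corrected)", "SurferSEO (unedited)", detector)
    # How far the corrected copy still sits below DeepL.
    margin = point_gap("DeepL", "SurferSEO (corrected)", detector)
    print(f"{detector}: correcting adds {rise} points; "
          f"corrected copy is still {margin} points below DeepL")
```

In other words, correcting the copy costs 36 points on GPTZero and 13 on ZeroGPT, yet the corrected SurferSEO copy remains 14 and 15 points below DeepL respectively.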

3. SurferSEO copy is free of AI-isms overall, and DeepL copy can be made less AI-ish with AmpiFire’s ABBA

Running the copy through AmpiFire’s ABBA showed that the articles were overall free of AI cliches, but the few AI-isms that remained did make a difference.

SurferSEO only had one sample with cliches, where fixing them had absolutely zero impact.

For 5 of the DeepL samples where we found red-flag AI-isms (stuff like “explore”, “discover”, “not only… but also” and the like), removing those and replacing them with alternatives dropped the GPTZero score from 63% in the pre-ABBA phase to 41%, which is quite impressive.

ZeroGPT was harder to trick though, and removing the AI-isms didn’t help (there was an increase of 3% after optimizing the copy… which is probably not significant considering the small sample, but it does suggest that ZeroGPT is more robust overall).

Running DeepL copy through AmpiFire’s ABBA AI detector can further lower the AI score when the copy is tested on GPTZero.
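For readers curious what flagging red-flag AI-isms might look like in practice, here is a minimal Python sketch of a phrase checker. It is not how ABBA works internally (that is not described in this article): the pattern list contains only the three examples named in this section, and `find_ai_isms` is a hypothetical helper.

```python
import re

# Only the examples named in this article; a real list would be longer.
AI_ISM_PATTERNS = {
    "explore": re.compile(r"\bexplor(?:e|es|ed|ing)\b", re.IGNORECASE),
    "discover": re.compile(r"\bdiscover(?:s|ed|ing)?\b", re.IGNORECASE),
    "not only... but also": re.compile(
        r"\bnot only\b.*?\bbut also\b", re.IGNORECASE | re.DOTALL
    ),
}


def find_ai_isms(text):
    """Return the labels of all red-flag patterns found in `text`."""
    return [
        label
        for label, pattern in AI_ISM_PATTERNS.items()
        if pattern.search(text)
    ]


sample = "Discover how AI not only saves time but also cuts costs."
flags = find_ai_isms(sample)
```

A checker like this only points at candidate phrases; replacing them with natural alternatives is still a job for a human editor.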

What does this all mean?

First of all, the biggest conclusion here is that AI humanizers do work; both DeepL and SurferSEO did at least something to reduce AI scores, and SurferSEO was actually impressively effective in that regard, BUT – and this is a huge but, so it deserves its own paragraph:

SurferSEO relies on intentionally sloppier writing to bypass AI checkers (AI copy being notoriously clean, just as clean as it is boring). If you fix the copy to make it usable, you’re going to have higher AI scores… but still lower than if you used DeepL.

SurferSEO copy is pretty much free of AI-isms when tested against AmpiFire’s ABBA, while DeepL does have them (50% of DeepL samples had at least one red-flag AI word or phrase). Fixing those helps reduce AI scores when checked with GPTZero, but not with ZeroGPT.

Here are some use cases for these tools:

If your main goal is beating the AI checkers and you don’t really care for quality in your casual blogs, then just go with unedited SurferSEO copy.
If you’re looking for cleaner copy, you can always improve SurferSEO output with a simple Grammarly check plus basic human editing – it’ll still score lower on AI checkers than pure AI copy.
If you want cleaner copy right off the bat, then DeepL is the better choice. You won’t need to fix basic writing errors, but the penalty for that convenience is a higher AI score. You can lower it to some extent by running it through ABBA, and it’ll be better than pure AI, so it’s not all bad.
Beating AI scores is not a good goal in itself, as the scores are unreliable. Google doesn’t care; quality is what counts, and clean informational writing is often detected as some level of AI content.
If you’re looking for 0% AI copy, then your best bet is simply not to use AI tools like Shortly AI or Outwrite AI to write your copy. Just bear in mind that even copy written by humans can be detected as AI if it is well structured and mostly informational.

Until then, AmpiFire can help you drive more visibility to your content with quality content development and distribution – get in touch today to see what we can do for you!

Authors

Radu Diaconu

Chris Munch

CEO and Co-Founder at AmpiFire.
