Article at-a-glance
– Humanizers primarily work by making the content a bit worse and adding some minor mistakes.
– SurferSEO reduces AI detection significantly: 16% AI probability on GPTZero, 56% on ZeroGPT (compared to 66% and 84% for DeepL);
– DeepL has higher AI detection scores: 66% on GPTZero and 84% on ZeroGPT, but produces cleaner, more professional writing than SurferSEO;
– Correcting SurferSEO copy increases AI scores: 52% on GPTZero and 69% on ZeroGPT after fixing language issues;
– Removing AI-isms from DeepL copy with ABBA lowers GPTZero scores from 63% to 41%, but has minimal impact on ZeroGPT;
– If you’re blogging and want to beat AI checkers without quality concerns, use SurferSEO unedited; for cleaner copy, improve SurferSEO with basic editing (it’ll still give you lower AI scores);
– Human copy remains bulletproof and has the highest chance of being identified as non-AI by both testers;
– A good compromise is to use a mix of AI drafting and human editing and rewriting – that will give you lower overall AI scores and decent copy quality.
One day robots may want to become humans, but right now companies are trying to fool AI detectors with tools that make your AI-created text look like it was written by a human.
AI humanizers claim to make your AI copy pass AI tests – essentially, to cheat the AI checkers.
But are they successful? Do they really fool the testers? Is the resulting copy any good? Is there any way to further improve the copy?
Answers to those questions (and more) incoming… brace yourselves.
Here’s what we did to discover the effectiveness of content ‘humanizers’:
- Generated outrageously AI-ish blog copy and had our team rewrite it from scratch (to create two control groups – one 100% AI, the other 0% AI) – for more on this, see our GPTZero vs ZeroGPT test, which describes how we did that in a bit more detail;
- Ran each AI sample through DeepL and SurferSEO to produce two “humanized” variants;
- Tested both control samples and “humanized” copies against GPTZero and ZeroGPT (two of the most popular GPT testers out there);
- Averaged out the scores and compared;
- Analyzed the humanized output, and ran it through Grammarly and our own ABBA AI detection tool to see if it’s any better.
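The averaging step in the list above can be sketched in a few lines of Python. The per-sample scores here are made-up placeholders, not values from our spreadsheets, and the names (`scores`, `average_scores`, `ToolA`) are hypothetical; treat it as an illustration of the arithmetic rather than the exact process we followed.

```python
from statistics import mean

# Placeholder per-sample AI probabilities (percent). These numbers are
# illustrative only and are NOT taken from the test spreadsheets.
scores = {
    ("ToolA", "GPTZero"): [10.0, 20.0, 30.0],
    ("ToolA", "ZeroGPT"): [50.0, 60.0, 70.0],
}

def average_scores(samples):
    """Return the mean AI probability for each (tool, detector) pair."""
    return {key: round(mean(values), 1) for key, values in samples.items()}

averages = average_scores(scores)
for (tool, detector), avg in averages.items():
    print(f"{tool} on {detector}: {avg}% AI on average")
```

Swapping the placeholder lists for the per-sample columns in the spreadsheets should give one average per humanizer and checker.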
Here’s the blunt data, open access so anyone can play with it on their own:
- Casual blog tests: https://docs.google.com/spreadsheets/d/1d2f0dVIVivRNFdKrpD6P9jEXEg5HWEvEs7HsdKMaqHo/edit?usp=sharing
- Non-blog human content tests: https://docs.google.com/spreadsheets/d/1N5-gICyW6tstau5ivY6mtLSFINu3Q0TjovI2yWszciY/edit?gid=0#gid=0
And here’s what we found out…
1. Both humanizers lower AI scores, but SurferSEO is better at that than DeepL
- SurferSEO managed to score a whopping 16% AI on GPTZero, against a lower-than-control but still relatively high 66% for DeepL.
- Against the stricter, more accurate ZeroGPT, SurferSEO still managed a respectable 56% AI probability on average (vs a much worse 84% for DeepL).
- To put things into perspective, SurferSEO scored lower on ZeroGPT (56%) than DeepL did on GPTZero (66%), which is the much more lenient checker in general. Fricking nice work, SurferSEO!
2. SurferSEO copy quality is slightly worse, with intentional basic mistakes
Our editors looked at the two versions of the copy and concluded that SurferSEO performs worse than DeepL in terms of sheer writing quality, though the issues are mostly minor. Here are some examples and a breakdown:
- Awkward wording: “Today AI algorithms can diagnose cancer and heart disease with a precision that often beats the human.” – say what? Let’s keep this a violence-free space.
- Confusing sentences: “Interlink your posts to guide readers to related content on your blog and search engines to understand your site structure” – it’s difficult to understand what “and search engines to understand your site structure” refers to (the entire sentence is painful to read overall); “and help search engines” would have been clearer.
Some frequent issues:
- Useless repetitions: “After cleansing apply a moisturiser to lock in moisture especially in dry or cold weather”;
- Or how about this one: “Artificial intelligence is making healthcare better, more efficient, more personal and more accessible. [2 sentences later:] And AI powered platforms are making treatments more personal.” – I kid you not, it takes some smart prompting to get ChatGPT to be that repetitive.
- Sheer nonsense: “Sun protection is year round; a daily SPF of 30 or more will protect your skin from UV rays” (the idea was that sun protection should be worn year-round… not that you’re protected by default against the sun, year-round!)
- Lack of essential commas: “Schedule in learning whether through reading or educational games and then breaks for physical activity to burn off energy and focus.” This might seem like a small thing, but writing a sentence that long without any commas is a beginner-level mistake. Just try reading it aloud… if you’re not breathless, your fitness is at least above average.
- Another nice one: “It can be hard to fit in fitness when you’re busy but there are a few simple exercises you can do. First try bodyweight squats.” If you read “First try” as in “the first try”, you’re not alone – that’s a perfectly normal response to the lack of comma after “First”.
- Complete disregard for essential hyphens: “AI driven wearables can monitor patients at home and send alerts to healthcare providers if something is wrong.”; “Cash based spending on groceries and entertainment can also keep you aware of what you’re spending.”; “Assign age appropriate chores to teach responsibility.”
- Sure, it’s perfectly fine to omit them – if you want the copy to be a bit more confusing and break minor grammar rules to pass AI detection.
- Ungrammatical sentences. Like this one: “Then a bedtime routine with winding down activities like story time or gentle music”.
If that feels incomplete it’s because it is incomplete – nothing is happening in that sentence, there are no verbs. Verbs turn strings of words into sentences, and sentences without verbs are either the mark of a literary genius (not the case here), or of a poor writer (not the case here either). Here it’s simply an issue with SurferSEO seeming to implement intentional mistakes to fool AI testers… not cool!
- More ungrammatical stuff: “Push-ups are another good one that works your chest, shoulders and arms and engages your core.” – Push-ups are ONE that does this and that… and the bad parallel structure is just the icing on the cake (it should be “works your chest, shoulders, and arms, and engages your core”, but even that’s not good – keeping it simpler with “a good workout for the core, chest, shoulders, and arms” is much better).
When you fix the mistakes, it worsens the AI score!
We believe the mistakes are built into the platform. We’re fairly certain that’s the case because we corrected them and ran the copy through the checkers once more, and what do you know, the AI scores jumped up significantly:
- When SurferSEO copy was corrected, GPTZero rated its AI probability as 52% (vs 16% in the uncorrected version), with ZeroGPT seeing a smaller increase: 69% AI probability for the corrected copy, up from an initial 56% (keep in mind ZeroGPT was more accurate in the first place);
- Notably, these scores are still better than DeepL’s (52% and 69% for SurferSEO vs 66% and 84% for DeepL).
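To make the comparison above easier to follow, here is a small Python sketch that works out the percentage-point gaps from the averages reported in this article. The dictionary layout and the `point_gap` helper are our own illustrative choices, not part of any tool.

```python
# Average AI probabilities (percent) as reported in this article.
reported = {
    "SurferSEO (unedited)":  {"GPTZero": 16, "ZeroGPT": 56},
    "SurferSEO (corrected)": {"GPTZero": 52, "ZeroGPT": 69},
    "DeepL":                 {"GPTZero": 66, "ZeroGPT": 84},
}

def point_gap(a, b):
    """Percentage-point difference (a minus b) for each detector."""
    return {d: reported[a][d] - reported[b][d] for d in reported[a]}

# How much fixing the mistakes raised SurferSEO's AI scores.
correction_cost = point_gap("SurferSEO (corrected)", "SurferSEO (unedited)")
# How far corrected SurferSEO copy still sits below DeepL.
remaining_lead = point_gap("DeepL", "SurferSEO (corrected)")

print(correction_cost)  # {'GPTZero': 36, 'ZeroGPT': 13}
print(remaining_lead)   # {'GPTZero': 14, 'ZeroGPT': 15}
```

In other words, correcting the copy costs 36 points on GPTZero and 13 on ZeroGPT, yet the corrected copy still scores 14 and 15 points lower than DeepL.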
3. SurferSEO copy is free of AI-isms overall, and DeepL copy can be made less AI-ish with AmpiFire’s ABBA
Running the copy through AmpiFire’s ABBA showed that the articles were overall free of AI cliches, but the few AI-isms that remained did make a difference.
SurferSEO only had one sample with cliches, where fixing them had absolutely zero impact.
For 5 of the DeepL samples where we found red-flag AI-isms (stuff like “explore”, “discover”, “not only… but also” and the like), removing those and replacing them with alternatives dropped the GPTZero score from 63% in the pre-ABBA phase to 41%, which is quite impressive.
ZeroGPT was harder to trick though, and removing the AI-isms didn’t help (there was an increase of 3% after optimizing the copy… which is probably not significant considering the small sample, but does show that ZeroGPT is overall more robust).
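For a rough idea of what flagging AI-isms can look like, here is a minimal Python sketch that scans text for the three example phrases mentioned above. This is not how ABBA works internally; the pattern list and the `find_ai_isms` function are hypothetical, and a real checker would need a far longer list and smarter matching.

```python
import re

# Example phrases taken from this article; a real list would be much longer.
# Each pattern matches the exact word or phrase only (not "explores", etc.).
AI_ISM_PATTERNS = [
    r"\bexplore\b",
    r"\bdiscover\b",
    r"\bnot only\b.*?\bbut also\b",
]

def find_ai_isms(text):
    """Return matched phrases, grouped in pattern order, ignoring case."""
    hits = []
    for pattern in AI_ISM_PATTERNS:
        hits.extend(m.group(0) for m in re.finditer(pattern, text, re.IGNORECASE))
    return hits

sample = "Discover how AI not only saves time but also cuts costs."
print(find_ai_isms(sample))  # ['Discover', 'not only saves time but also']
```

Flagging is the easy part; the score drop we saw came from a human replacing each flagged phrase with a natural alternative.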
What does this all mean?
First of all, the biggest conclusion here is that AI humanizers do work; both DeepL and SurferSEO did at least something to reduce AI scores, and SurferSEO was actually impressively effective in that regard, BUT – and this is a huge but, so it deserves its own paragraph:
SurferSEO relies on intentionally sloppier writing to bypass AI checkers (AI copy being notoriously clean, just as clean as it is boring). If you fix the copy to make it usable, you’re going to have higher AI scores… but still lower than if you used DeepL.
SurferSEO copy is pretty much free of AI-isms when tested against AmpiFire’s ABBA, while DeepL does have them (50% of DeepL samples had at least one red-flag AI word or phrase). Fixing those helps reduce AI scores when checked with GPTZero, but not with ZeroGPT.
Here are some use cases for these tools:
- If your main goal is beating the AI checkers and you don’t really care for quality in your casual blogs, then just go with SurferSEO unedited copy.
- If you’re looking for cleaner copy, you can always improve SurferSEO with a simple Grammarly check plus basic human editing – it’ll still be better than pure AI copy;
- If you want cleaner copy right off the bat, then DeepL is the better choice. You won’t need to fix basic writing errors, but the penalty for that convenience is a higher AI score. You can lower it to some extent by running it through ABBA, and it’ll be better than pure AI, so it’s not all bad.
- If you’re looking for 0% AI copy, then your best bet is simply not to use AI tools like Shortly AI or Outwrite AI to write your copy. You can use them to generate ideas and organize your drafts, but do the final writing yourself – GPTZero might raise an eyebrow slightly with a 2-3% AI score, but ZeroGPT will know it’s fully human content.
Until then, AmpiFire can help you drive more visibility to your content with quality content development and distribution – get in touch today to see what we can do for you!
Authors
CEO and Co-Founder at AmpiFire. Book a call with the team by clicking the link below.