Should Publishers Still Be Scared of Duplicate Content?

Posted on May 26, 2019 by in WordPress | 12 comments

Duplicate content is the boogeyman of the SEO world. Depending on who you ask and what you read, the definition and scope of what constitutes duplicate content varies wildly. That’s why in this article we will break down what duplicate content really is and the misconceptions about what search engines (Google) will and won’t abide by.

What is Duplicate Content?

Well, it’s as simple as it sounds: duplicate content is content that’s in more than one place. That means content that matches another article verbatim (or close to it), or at least content that is similar in structure and verbiage to another.

What most people think of as duplicate content is a copy/paste of the same article to multiple places around the internet. You can see this kind of thing happen when content is scraped illegally to be posted on less-than-reputable websites, when content is syndicated from one place to another, or when you own multiple websites and post the same content for added reach.

But duplicate content is also the use of the same phrases and sections on a site. If you have a template for guest posts, for instance, that reads exactly the same except for the writer’s name and website, that’s duplicate content.
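To make "verbatim (or close to it)" a little more concrete, here is a minimal sketch that scores how much two texts overlap using word shingles and Jaccard similarity. This is only an illustration under our own assumptions: it is not how Google measures duplication, and the template text, names, and domains are hypothetical.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Too short to shingle: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b, size=3):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical guest-post boilerplate that differs only in name and website.
template = "This is a guest post by {name}. You can read more of their work at {site}."
post_a = template.format(name="Alex Smith", site="alex.example")
post_b = template.format(name="Jo Lee", site="jo.example")
unrelated = "Our plugin roundup covers caching, backups, and image compression."
```

Comparing `post_a` with `post_b` gives a score well above the score for `post_a` against `unrelated`, which is the "same except for the writer’s name and website" situation in miniature.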

The thing is, between a quarter and a third of the internet is duplicate content. If there’s a website, then its content is somewhere else. No doubt it’s been scraped, its content stolen and later reposted elsewhere. It has blurbs and snippets that have been reused.

But the question of the hour is whether or not any of that has caused Google to penalize it. Has the scraped and plagiarized site lost rankings because of the duplication?

The answer is probably not.

Despite popular belief, it’s pretty hard for duplicate content to get you into trouble. The topic is full of anecdotes, myths, urban legends, and folklore passed down from marketer to marketer over the years. And like any story or tale, it gets taller and more exaggerated as it’s told. Let’s see if we can find the kernels of truth in these urban legends and misconceptions.

Will Google Blacklist Your Site for Duplicate Content?

There are variations of this floating around everywhere, all claiming that having even one instance of duplicate content will put you on Google’s bad side.

Maybe it’s a blacklisting from Google, or maybe it’s a penalty and the site ranks lower in various query results. But what if it’s something out of your control? Scrapers take your content against your wishes. Or what if it’s something you do purposefully? Like re-posting guest articles or fleshing out a secondary site or even syndicating content. Perhaps you have a template you use for interviews with the same questions repeated week after week after week.

Is that duplicate content? 100% absolutely yes. Is Google going to blacklist/penalize your site for it? Probably not.

You see, it takes a lot for Google to blacklist a site. If you’re not hosting malware, phishing scams, or just straight-up spam, the likelihood of your being blacklisted is close to nil. And as for penalizing your site, Google has said numerous times that it does not penalize for duplicate content (or as Google’s Matt Cutts puts it in a video on the topic, “duplicate content is not really treated as spam”).

This means that, as Matt says in that video, if two websites have the same content, Google’s search algorithms will determine which website is the most relevant and provides the most value to users, and then display that result.

In cases like this, Google can recognize scraped content. Those websites are easy for Google and its algorithms to find. In fact, you’ve probably run across content you knew was stolen and seen how horrible the website was: full of ads, badly formatted, poorly designed, and just a heinous experience altogether. And worst of all? Nothing else on the site helped you with what you were searching for except this one tiny excerpt you found.

That’s why Google takes search intent into account so much. Even if you have duplicate content, if it’s valuable content (and the rest of the site is valuable to users, too), you will be displayed in search results over websites with the exact same article.

Google Penalizes Thin Content, Not Duplicate

The reason that your site would be prioritized in search rankings over the duplicates is that their websites are full of what is known as thin content. That means that articles on these sites are short, the site itself is an unfocused mishmash of topics across many niches and industries, and it probably has an incredibly high bounce rate.

Or, in other words, they’re nearly useless articles on nearly useless sites.

However, it’s not just copy/paste scraper sites that create thin content. No, you can create plenty of thin content of your own without much trouble. So you need to be careful.

Keyword stuffing is the first way you fall into the thin-content hole. Your article sounds like it solves a problem or answers a question, but instead, it just awkwardly works in the keyphrase multiple times while tip-toeing around the subject itself in the name of length and word count.

On the other hand, if you write too-short articles, you’re once again proliferating thin content. You want to answer the question of the searcher, and you also want to go into detail about it and provide as much value as possible. You want to have internal links to other articles you’ve written on the topic, as well as external references. These show Google that you’ve done your research and care about providing your readers value, and they also make it so that when you do get scraped, you get a handful of links back to your site that might one day work as referral traffic. (You would get next to no link juice from those sites.)

Take this article, for instance. We hope that it ranks higher for the question in the title, “Should Publishers Still Be Scared of Duplicate Content?”, than a site that has a couple of paragraphs that rephrase “no, 30% of the web is duplicate content. Just don’t spam and you’ll be fine.” That’s thin content. We are trying to provide value and expand on the idea, rather than leaving it as “nah, don’t worry too much about it.”

Canonical Links and Other Ways to Do Duplicate Content Right

The thing is, we know you’re going to worry about it. At least a little. We do, too. Everyone does. That’s why we want to give you a couple of options for handling the duplicate content that you will inevitably have out there in the wild. These do, however, address only full duplication. For snippet and excerpt and incidental duplication, as long as the content itself is sound, and you’re using the boilerplate as a vehicle for quality, you will be fine.

Canonical Links

Using a canonical link tag is probably the best bet you have for keeping your duplicate content in check. While a lot goes on under the hood with a rel=’canonical’ tag, what it boils down to is you’re telling Google that whatever link you provide after it is the real deal and the one they should index.

For instance, if you have an article published at example.com/your-article, but you want to re-post that content on your own site, you’d include a tag on the reposted one that looks like this:

<link rel='canonical' href='https://example.com/your-article' />

Keep in mind, however, that this is a request for Google to honor, not a demand. Google reserves the right to determine which is the better source to rank based on its internal metrics and algorithms, though it rarely ignores the request.
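If you want to double-check that a reposted page really carries the tag, a small script can do it. The sketch below uses only Python’s standard library to pull the canonical URL out of a page’s HTML. The names `CanonicalFinder` and `find_canonical` are ours, invented for illustration, and the rules it enforces (exactly one tag, with an absolute http(s) URL) are our own conservative assumptions rather than a statement of how Google evaluates the tag. An href with no scheme is rejected because a browser or crawler would read it as a relative path.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags are also routed here by HTMLParser.
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_canonical(html):
    """Return the canonical URL, or None if it is missing, declared more
    than once, or not an absolute http(s) URL."""
    finder = CanonicalFinder()
    finder.feed(html)
    if len(finder.hrefs) != 1:
        return None
    parsed = urlparse(finder.hrefs[0])
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return None
    return finder.hrefs[0]


# Hypothetical page markup for a quick check.
page = "<html><head><link rel='canonical' href='https://example.com/your-article' /></head></html>"
canonical = find_canonical(page)
```

Here `canonical` holds the URL from the tag. A page with no tag, two conflicting tags, or a scheme-less href returns `None`, which is your cue to take a closer look.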

On WordPress, adding the canonical tag can be tricky, so you can use a plugin to easily do it. The aptly named Canonical SEO Content Syndication plugin works very well for this.

Content Redirection

Another way you can handle duplicate content — at least in terms of whole articles at a time — is to simply redirect the URL from one to another. If you have reposted or updated an article on your site, you don’t want the old one hanging out, vying for Google’s attention. So you throw a 301 redirect to tell Google and other engines where the new content lives, and that one gets most of the link juice passed its way.

The same applies to articles on other sites, too. If you move domains or have the post across multiple sites, you can choose the primary home by simply redirecting the duplicates. You retain link juice, and Google eventually sorts out that it’s been redirected and begins indexing the target site instead.
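On a WordPress site you would normally set up redirects with a plugin or in your server configuration. Still, it can help to see how little is involved. The sketch below is a bare-bones Python WSGI app, not WordPress code: it answers retired paths with a 301 status and a Location header pointing at the new URL. The `REDIRECTS` map and both URLs are hypothetical.

```python
# Hypothetical mapping of retired paths to the URL that replaces each one.
REDIRECTS = {
    "/old-article": "https://example.com/new-article",
}


def app(environ, start_response):
    """Answer retired paths with a 301; everything else gets a 404."""
    target = REDIRECTS.get(environ.get("PATH_INFO", ""))
    if target is not None:
        # 301 tells browsers and crawlers the move is permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]
```

The design choice that matters is the status code. A 301 says the move is permanent, which is the signal described above, while a temporary redirect such as a 302 says the old URL may come back.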

So…Should You Worry About Duplicate Content?

No. Not really. The chances that you will be penalized for it are minimal, and there are easy-enough ways to protect yourself just in case (canonical links being the primary defense). As long as you create content mindfully and look at why your audience is coming to your site and what answers they need, you won’t have to worry about duplicating content. Unless you’re a content scraper. But you’re not. So you’re safe.

What is your strategy regarding duplicate content on your websites?

Article featured image by Chonnajak / shutterstock.com

12 Comments

  1. You may not need to worry about duplicate content penalties from Google, but you better be concerned that the originators of that content don’t take legal action against you. Even if you provide a link back to the originator at the end of the article stating this is the source, they may still take exception with you posting it.

  2. In regards to duplicate content, I have a client that subscribes to a program that generates a newsletter. This newsletter is also provided to other clients in the same “field”. The only thing that changes is the clients’ name and link for visitors to sign up. So the question is, this content, which is syndicated, does it count as duplicate content? It is valuable content of high quality, but it appears on many sites. What I normally do is take a short excerpt for the newsletter and post it, along with a link to the online version of the newsletter. I feel if I copy the content to the client’s site, it would cause problems.

    • In cases like that, Frank, I would do exactly what you’re doing — posting an excerpt and then link to the online version of the newsletter. While there technically wouldn’t be anything to penalize you for in terms of having that as duplicate content, given the nature of it being a newsletter, best practices tend to indicate links to the original would be the most appropriate.

      Google’s algorithms are pretty smart when it comes to differentiating between newsletters that are hosted like this, blogs that are scraped, and various news outlets posting the same content (in fact, it’s so good that it’s kind of scary). I don’t think you guys would have anything to worry about with that program at all.

  3. Thank you for this article on a topic that most everyone with a web site probably has some concerns about. Your conclusions match what I have learned after reading many, many articles on the topic. But duplicate content can take many forms and the subject can be complex.
    So, here is my question: If I have an article on snorkeling in Hawaii, I might post it in two categories, snorkeling and Hawaii (assuming I have other articles on my site that cover other aspects of those subjects). So, there will be two URLs. website/category/snorkeling/snorkeling_in_Hawaii and /website/category/Hawaii/snorkeling_in_Hawaii.
    So, the same article ostensibly appears in two different sections of my site. What’s more, using Divi, there are links on two different pages (Hawaii page, snorkeling page) to the same blog content. Do you think that Google would see that as duplicate content?
    I’ve never gotten a definitive answer to that question.

    • Neil, I totally see what you’re talking about there. And while that would technically be two separate instances of the same content, thus duplicate (because of having two separate URLs), what I would do is pick which one you want to rank and then use rel=canonical on both pages to point to that one.

  4. Hi BJ,
    I learned something new. I was led to believe that duplication in any form was bad. For example, on a large inventory of packaging products (WooCommerce) I was instructed to fill in the product details, keyword and snippet preview as differently as possible. You can imagine that by the 30th glass jar it starts to get difficult!

    You also mention thin content and also sites that are a mishmash. I try to keep sites to the point and don’t like waffle for the sake of it. When I am working on my own site for the fine art prints that I make, visuals take priority. In other words the pictures speak for themselves. This means a site where textual content resides on the home page, and about page, and where to purchase my work. The catalogue is built on a custom post type, and while there is provision for some words on each work, thoughts and descriptions are kept short, so as not to distract from the work. As I said the pictures speak for themselves.

    I see some very sophisticated sites, usually for design agencies that are heavy on the visuals and sparse on text. Not sure how far Google AI has gone with visuals as metrics for SEO and rankings? Then again such a thing could turn out to be my worst critic!

  5. OK, however what about a site or sites that have duplicate content on hundreds of their pages, when all the site does is change the suburb in the headings and in the content, and the content is 90% duplicate and comes from one article? This happens a lot right now on sites. I’m ranking a client in one of these niches, and one business has at least four sites that have duplicate content on all their pages, and they are ranking for that content. They are using backlinks as well. However, I’m not worried about the backlinks (they are spammy backlinks); the content is my main concern, as I rank my clients’ sites with original articles. Would you think that would be an issue using duplicate content that way?

  6. Agree with @Allan, what rattles my cage is almost duplicate content.

    The types of sites that have almost identical content on hundreds of pages, but just change the location name.

    I’ve seen many examples for web design, plumbers, cleaners and loads more. They basically have pages and pages but add the name of a town or village.

    Take the smallest village in the UK, Fordwich. Wikipedia says it has a population of 381, yet for the search ‘plumber Fordwich’ Google has over 21,000 results. Scroll through the results and you will see that many of these pages are just duplicate content with the name Fordwich strategically mentioned.

    Google clearly hasn’t solved this issue.

    • You are very right Hedgehog and when you report the sites Google does nothing about them why does Google have a place to report sites and a browser plugin to report sites if they don’t care and do nothing?

  7. I’m still scared. Why would we not be scared of duplicate content?

    • You can imagine how scared I was when I discovered my site has two versions…the www and non-www…and all the content was duplicated. Couldn’t sleep till I sorted it out.

  8. How about when you duplicate sections in Divi and then select to use them on different media (desktop, tablet and mobile)? I just got a request from the SEO company a client is working with, and they did not like that the content was duplicated.
