A Threat to Your WordPress Blog: Duplicate Content
The full version of this article, including a tutorial, can be found here: Making Your WordPress Blog Duplicate Content Safe
Blogging is extremely popular these days, and the most popular stand-alone blog engine is WordPress. It is flexible, has many useful features, and there are plenty of eye-catching templates for it. But anyone who runs a WordPress blog should be aware of a serious problem that can cause the blog to be removed from Google’s search results. The problem is duplicate content.
The WordPress content management system, when used with its default configuration, is not duplicate content proof. In fact, this CMS is capable of rendering almost 100% of your content duplicate. As usual, the fault of the system has its roots in its advantages. WordPress has many features that facilitate blogging and linking, such as RSS feeds for posts and comments, trackback URLs, monthly archives and so on. At the same time, this variety of URLs returning similar or identical pages is a clear case of duplicate content.
WordPress And Duplicate Content
The first evidence of duplicate content produced by your WordPress CMS can be found in your sidebar: category pages and monthly/daily archives. A category page lists the articles you have posted under the same topic (a category). Such pages have no unique content; they are just a collection of your previous posts. Monthly and daily archives likewise simply group your previous articles by posting date. When you have only one post on a given day, the archive page for that date and the post itself can be completely identical.
The next case of duplicate content is even more prominent: it can be your home page itself. If it shows the full text of your posts rather than excerpts, then it duplicates your post pages. This also applies to the “next/previous entries” pages, those accessible via /page/2, /page/3, /page/4, etc.
Feeds. Search engine spiders crawl all the content they can reach, and that includes RSS feeds. An additional problem is that Google may choose to display your RSS URL in the search results instead of the link to the original post. A user who clicks that result will then see an XML-formatted page, which is not “human-friendly”.
Trackback URLs. Many WordPress templates add trackback links after posts. These links enable authors to track who links to their posts. Usually, if your post URL looks like “www.yoursite.com/2006-11-30/yourpost/”, its trackback URL will be “www.yoursite.com/2006-11-30/yourpost/trackback/”.
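To make these URL patterns concrete, here is a minimal Python sketch that labels a URL as one of the duplicate-prone types described above. It is only an illustration: the function name classify_url is hypothetical, the “/trackback/” and “/page/N” suffixes follow the examples in this article, and the “/feed/” suffix is an assumption about a typical permalink setup. Your own blog's URL structure may differ.

```python
import re
from urllib.parse import urlparse

# Path patterns are assumptions for illustration; the real permalink
# structure depends on the blog's settings.
_PATTERNS = [
    ("trackback", re.compile(r"/trackback/?$")),
    ("feed", re.compile(r"/feed/?$")),
    ("paged", re.compile(r"/page/\d+/?$")),
]


def classify_url(url):
    """Return a label for a duplicate-prone URL type, or 'other'."""
    path = urlparse(url).path
    for label, pattern in _PATTERNS:
        if pattern.search(path):
            return label
    return "other"


if __name__ == "__main__":
    for url in (
        "http://www.yoursite.com/2006-11-30/yourpost/",
        "http://www.yoursite.com/2006-11-30/yourpost/trackback/",
        "http://www.yoursite.com/page/2/",
    ):
        print(url, "->", classify_url(url))
```

Running a list of crawled URLs through a check like this gives a rough idea of how many addresses on a blog point at content that already exists elsewhere.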
Identical meta description. By default WordPress doesn’t provide a tool for adding unique meta description tags to your posts, so they either have none or share a single site-wide description. Having no meta description at all is a disadvantage, as a properly written one can make your snippet stand out in a SERP. Having an identical description on all your pages is a threat, as Google might filter them out as too similar.
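A quick way to see whether your pages share a description is to group them by the description text. The Python sketch below assumes you have already collected a mapping of page URL to meta description by some other means (the collection step is not shown); the function name find_shared_descriptions and the sample data are hypothetical.

```python
from collections import defaultdict


def find_shared_descriptions(descriptions):
    """Group URLs by meta description and return groups with 2+ URLs.

    `descriptions` maps URL -> meta description text ('' if missing).
    Comparison ignores letter case and surrounding whitespace.
    """
    groups = defaultdict(list)
    for url, text in descriptions.items():
        groups[text.strip().lower()].append(url)
    return {text: sorted(urls) for text, urls in groups.items() if len(urls) > 1}


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real blog.
    pages = {
        "/2006-11-30/yourpost/": "Just another WordPress weblog",
        "/2006-12-01/otherpost/": "Just another WordPress weblog",
        "/about/": "Who writes this blog and why",
    }
    for text, urls in find_shared_descriptions(pages).items():
        print(repr(text), "is shared by", urls)
```

Pages with no description at all end up grouped under the empty string, so the same check also lists the pages that are missing one.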
Because of duplicate content, a Google search can return less desirable URLs (such as feeds or archives instead of the original posts), and your pages can be dropped from Google’s index or placed into the supplemental results, which are rarely displayed to users.
For tips on how to get rid of duplicate content in WordPress, please refer to my tutorial: WordPress vs. Duplicate Content