Tuesday, January 22, 2013

How to Protect Yourself from Duplicate Content

Duplicate content occurs when the same or slightly different content (text and/or images) appears on more than one web page on the Internet. It can be internal, when it appears within a single website (different pages share the same content, or the same article lives at several different URLs), or external, when other websites copy or steal your original content, with or without your consent.

Duplicate content can hurt the experience of search engine users looking for information on the Internet. They won’t be satisfied if, instead of the interesting articles they hoped to find, the results are full of links to different websites carrying the same or similar information. To provide the best possible service, the search engines constantly improve their algorithms to better detect duplicates on the Web, and even though they won’t directly ban plagiarist websites, they will certainly penalize such activity by lowering page rankings and overall website authority. Here’s how you can protect your content:

Copyright your content. One of the first steps in preventing others from stealing your website’s original content is letting them know that your work is under copyright protection and must be properly cited if used. You can add a simple statement in the web page footer, but it is better to be clear and up front, using a disclaimer or adding a “Protected by …” badge, like those available on various duplicate content detection sites.

Perform internal audits. Since duplicate content can also appear within the same website, intentionally or not, a great way to make sure your website is properly optimized is to audit it internally for duplicates. Category pages and full article pages often contain very similar, if not identical, information, and the same is true for product description pages that differ only in a few product characteristics. Before you start fighting external content scrapers, make sure your website is safe from within. Use robots.txt to manually select which pages should be indexed, and Google Webmaster Tools or Yahoo! Site Explorer to stay up to date with any additional problems.
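As a rough sketch of such an internal audit, you can fetch your own pages and hash their normalized text; any two URLs that end up with the same hash are candidates for consolidation or a robots.txt exclusion. The URLs and page texts below are hypothetical, and a real audit would fetch live HTML and strip the markup first:

```python
import hashlib
import re
from collections import defaultdict

def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting
    differences don't hide duplicate body text."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def find_internal_duplicates(pages):
    """pages: dict mapping URL -> raw page text.
    Returns groups of URLs whose normalized text is identical."""
    groups = defaultdict(list)
    for url, text in pages.items():
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [urls for urls in groups.values() if len(urls) > 1]

# Hypothetical site: the category page repeats the article verbatim.
pages = {
    "/articles/seo-tips": "Ten SEO Tips\nWrite unique content.",
    "/category/seo": "Ten seo tips  write unique content.",
    "/about": "About this site.",
}
print(find_internal_duplicates(pages))
```

Exact hashing only catches verbatim copies; near-duplicates (product pages differing in one attribute) need a fuzzier comparison, but this is enough to flag the worst internal offenders.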

Actively check and monitor for duplicates. The second a new article is published on your website, particularly if it is an authoritative site with unique, creative articles people want to read and share, plenty of other blogs and websites will try to copy or rework it so they can republish it as their own original content. Because of this, it is essential to constantly check and monitor the popular search engines’ results for duplicates that may harm your website’s SEO efforts. Search Google for unique strings of your original text or set Google Alerts for them, or check and monitor for copies of your web page URLs on the Internet using tools like PlagSpotter or CopyScape.
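Picking a good "unique string" for such a quoted search or Google Alert can itself be automated. A minimal sketch, assuming a fixed phrase length of ten words (an arbitrary choice) and using the middle of the article, since opening sentences are more likely to match unrelated pages:

```python
import re

def alert_phrase(article, length=10):
    """Pick a length-word phrase from the middle of the article to use
    as a quoted search string or a Google Alert query. Mid-article text
    is less likely to be a common opening phrase."""
    words = re.findall(r"\w+", article)
    if len(words) <= length:
        return " ".join(words)
    start = (len(words) - length) // 2
    return " ".join(words[start:start + length])

article = ("Duplicate content happens when the same or slightly different "
           "content appears on more than one web page on the Internet and "
           "it can be internal or external depending on where it appears.")
print('"%s"' % alert_phrase(article))
```

Wrap the result in quotation marks when searching so the engine looks for the exact phrase rather than the individual words.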

Tuesday, January 15, 2013

What Is Duplicate Content and How It Affects SEO

Duplicate content is a major issue on the Internet today, a problem most websites face. Every time the same or very similar content appears in more than one place (web page, URL) on the Internet, on the same or on different websites, a duplicate is created. These duplicates ruin the user experience of searching the Web for information. When looking up certain keywords, users are only interested in relevant and useful links – they don’t want to see a list of websites with the same text reappearing over and over. Since the search engines are there to provide the best service for their users, it’s only logical that duplicate content negatively affects a website’s SEO.

One of the most common ways duplicate content is created on the Internet is by copying and pasting other websites’ original content, or by taking parts of several different texts and slightly reworking them to present them as unique and original. If you submit your own article to several different websites, you’ll also create duplicates. In addition, duplicate content can result from poor web development techniques, bad link structure, and bad SEO decisions.

As mentioned before, the search engines want to deliver quality results to their users, and they want to do it fast. But due to the complexity of the algorithms involved, crawling, analyzing and indexing all those pages takes time and valuable resources, and when you add in the duplicates it becomes clear why Google and the other popular search engines pass the pressure on to webmasters and SEOs, motivating them to eliminate their duplicate pages.

So how does duplicate content affect SEO?

Google and other search engines probably won’t ban your website just because of a few duplicates, but there are other ways the duplicate content issue will affect your website’s SEO. When the search engines identify duplicates, they’ll only list the original page in their search results, and the decision about which version is the original depends on the age of the page, its PageRank, the website’s authority, the number of incoming links, etc. This means that your website won’t be listed in their results if you publish copy-pasted texts, and since the purpose of SEO is getting listed and ranked higher in the result pages, the negative effect of duplicate content on SEO is more than obvious.

Duplicates also hurt PageRank distribution, because incoming links get diffused across the duplicate URLs instead of concentrating on one page. Lower PageRank means lower website ranking and bad SEO. In addition, the search engines find and index only a certain number of pages from every website, depending on the website’s authority, and if there are lots of duplicates, the additional pages you publish will be indexed much more slowly.

Duplicate content has the same negative effect on commercial websites’ SEO. They often have several different pages for the same product, differing only by a single characteristic, and if the descriptions on all those product pages are exactly the same or only slightly different, they are creating duplicates that will hurt their SEO efforts too.

Wednesday, January 9, 2013

Plagiarism Detection: PlagScan vs. PlagSpotter

Plagiarism, or stealing other people’s language and ideas, is a problem that has existed practically forever, but the development of the Internet over the last two decades has given it a completely new dimension, providing quick access to thousands of free resources, available to everyone, on any given subject. Instead of learning, developing critical thinking and creating greater value, many students and authors now choose to copy, or often just slightly rework, others’ original works, mixing several different sources. Plagiarism is a fact of life today – the question is how to stay protected.

There are different tools and software people use these days to detect plagiarism and protect their original work. School and university professors can, for example, use PlagScan’s professional and academic plagiarism detection service to analyze their students’ essays, theses and dissertations and see if parts of the work have been plagiarized. They can copy and paste the text, upload a whole paper, or add as many documents as they need, and PlagScan will check the content and deliver a detailed report via email, in PDF, plain text and docx formats. The report shows the percentage of duplicated work and the sources of the original content, and highlights potentially plagiarized phrases, so during regular proofreading the professor can quickly assess whether a particular match is plagiarism or just an acceptable citation.

Webmasters who often buy content for their websites and want to make sure their authors deliver only unique writing can also benefit from this plagiarism detection software. There are different types of accounts – single and power user, depending on the volume of documents checked – and also a special version for registered organizations, where each user has an individual sub-account through which he or she can upload documents directly to the organization’s account.

PlagScan is a paid service with an internal credit point system, but first-time users can also register for a free test account if they want to evaluate the quality on offer. Since PlagScan is certain its users will be satisfied with the product, it additionally offers a full refund within two weeks of the initial purchase.

PlagScan checks the uploaded content by comparing it to the user’s own database and its own global document pool, and for Web documents it uses Yahoo’s search index. But authors who write for the Internet, and bloggers and webmasters who want to protect their website content from theft and automatically monitor their web pages for plagiarism, may benefit more from the PlagSpotter duplicate content checking and monitoring tool.

This online tool instantly scans the entered URLs for duplicate content on the Internet and reports the percentage of matched content, together with a list of the external URLs that contain the matching text. Based on this data, SEOs and bloggers can take the necessary actions to inform the popular search engines of the detected plagiarism; the search engines can penalize and even block the offending websites if they don’t remove the duplicates.
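A "percentage of matched content" figure of this kind can be approximated with word shingles: overlapping windows of consecutive words, counted in both texts. PlagSpotter's actual algorithm is not public, so this is only an illustrative sketch with a hypothetical window size of four words and made-up sample texts:

```python
def shingles(text, k=4):
    """Set of overlapping k-word windows from the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def percent_matched(original, suspect, k=4):
    """Rough percentage of the original's shingles that reappear
    in the suspect text."""
    orig = shingles(original, k)
    if not orig:
        return 0.0
    return 100.0 * len(orig & shingles(suspect, k)) / len(orig)

original = "the quick brown fox jumps over the lazy dog near the river bank"
copy = "the quick brown fox jumps over the lazy dog near a shady tree"
print(round(percent_matched(original, copy)))  # most of the text matches
```

Shingling catches lightly reworked copies as well as verbatim ones, since every stolen run of four or more words contributes matching windows.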

There’s a free version of PlagSpotter where you are only allowed to enter one URL at a time, and different paid packages that let you automatically monitor the URLs you’ve selected for duplicate content, on a daily or weekly basis, and get email notifications about the percentage of duplication. For additional protection, site owners can embed the "Protected by PlagSpotter" badge on their website to warn others against stealing their content. The paid plans also come with a free 7-day trial.

Monday, December 24, 2012

What Content Can Be Plagiarized?

The word plagiarism has existed since the 1620s. It has always been considered wrong to claim someone else’s work as your own, no matter what form it is in – written, visual, or conceptual. But these days it seems harder for people to understand what plagiarism is and how it happens, simply because a large part of creative work is now published on the Internet. A different medium doesn’t mean the rules have changed. Those who plagiarize try to explain that they just ‘borrowed’ or ‘copied’ a small part of a text or an image, but in most cases they won’t admit they’ve done wrong. In order to prevent plagiarism on the Web, it is important to clarify what plagiarism is and how to avoid plagiarizing other people’s work.

How does Plagiarism Happen?

You are plagiarizing every time you closely imitate other people’s ideas or words and use them as your own without permission; when you present an idea or work as new when it is actually taken from another source; or when you don’t credit the original source. If you copy content from another website and publish it on your own as your work, without crediting the source or providing information about it – even if you change the words a bit – you are not only damaging the quality of the content on the Internet, you are committing plagiarism. There are also websites on which words and ideas taken from other sources make up the majority of the content, and whether or not they credit those sources, that is plagiarism too.
Almost everything on the Internet can be plagiarized, intentionally or not, and most people aren’t completely sure whether their actions count as plagiarism, particularly when sharing content on the popular social networks or downloading music, movies or games from torrent-type websites. It’s confusing for the younger generations too, who have difficulty differentiating between sharing and plagiarizing.

How to Avoid Plagiarism?

You can avoid plagiarizing other people’s work by familiarizing yourself with the terms of fair dealing/fair use and copyright, and by always assuming copyright is in place unless the author notes otherwise. To stay on the safe side of copyright, limit longer quotes to a maximum of 250 words and get permission from the author if you want to republish a larger portion. It is also very important to always cite sources properly to credit their offline or online works, linking directly to the sources. In addition, you can ensure that the content you use on your website is completely unique and original by scanning it with plagiarism checking tools like PlagTracker, PlagSpotter or CopyScape.
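A quick local sanity check before running text through those tools can be done with Python's standard-library difflib, which scores how much of one text reappears in another. This is only a crude stand-in for a real plagiarism checker, and the sample sentences are invented for illustration:

```python
import difflib

def similarity(a, b):
    """Similarity ratio (0..1) between two texts, via difflib's
    longest-matching-block heuristic. Values near 1.0 indicate
    near-verbatim copying; low values indicate independent text."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()

source = "Always cite your sources and link directly to them."
rewrite = "Always cite your sources, and link directly to them!"
print(similarity(source, rewrite) > 0.9)  # punctuation changes barely help
```

Character-level matching like this flags texts that differ only in punctuation or casing; word substitutions lower the score gradually, so any threshold you pick (0.9 here is arbitrary) trades false positives against misses.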

Thursday, December 20, 2012

The Consequences of Online Plagiarism

How successful your online marketing will be depends greatly on the content you publish on your blog or website. In its effort to provide relevant, quality links to its users, Google keeps improving its algorithms to better detect and penalize plagiarized or duplicate content on the Web – that is, the same content appearing on more than one web page on the Internet, or even within the same website. If you want to keep your website listed in the results of the world’s most popular search engine, you had better not steal content from any website, no matter how harmless it might seem – for example, copying and pasting a small part of a Wikipedia article to fill a section of your website about the area you live in.

What is Plagiarism on the Internet?

Online plagiarism, or content scraping, refers to copying text and/or images from a source without properly citing it. These methods of building search-indexable content were used extensively in the past, but today they are not only ineffective – they can also have serious legal and search ranking consequences for site owners.

Legal Outcomes: There are many tools on the market today designed to identify content scraping. If the content they detect is plagiarized or isn’t properly cited, the content owners can file a Digital Millennium Copyright Act (DMCA) takedown notice, and if the plagiarists don’t remove the scraped content, they can then file a copyright infringement lawsuit. As you can see, online plagiarism carries serious legal consequences.

Search Rank Outcomes: When the search engines discover duplicate content on a website, they lower its rank in their result pages, and if the content scraping is severe enough, they may even remove the website from their results. The Panda update is Google’s way of saying No! to scraper websites with low-quality content and a high ad-to-content ratio, and Bing and Yahoo are fighting plagiarism as well. Without these search engines, your website has no real chance of seeing the light of day. An article from some popular website might be an interesting source for your readers, but it won’t be of much help if, thanks to it, your website drops out of Google, Yahoo and/or Bing. Still, despite the high risk of penalization and even removal from the search engine results, many website owners choose to plagiarize content. This further highlights how important it is to protect our original content.

How to Protect your Content?

There are many tools webmasters can use to detect potential plagiarism, like the plagiarism checker PlagTracker, the duplicate content checking and monitoring tool PlagSpotter, or CopyScape. When they are certain their content has been plagiarized, they can file a DMCA complaint and ask for the duplicate content to be removed.

Sunday, October 19, 2008

Plagiarism Today's review of CopyrightSpot

Jonathan Bailey from PlagiarismToday recently wrote a review of CopyrightSpot. Jonathan is an expert on methods and tools for protecting your original content online, and he is always insightful and thorough in his reviews. As he stated in the review:
I put the service (CopyrightSpot) to the test to see how well it performs against Google and some of its competitors. The results were surprising.
Even though CopyrightSpot is still in alpha and has more work ahead of it to become one of the top online content protection tools, Jonathan didn't cut it any slack during his testing. This is what he said:
Since CopyrightSpot is in an alpha release, I feel the need to go very easy on it. However, since many have already started relying upon it as their primary copy detection tool, I did want to put it through a few tests to see how it stacked up against both Copyscape and Google itself.
We're glad he didn't take it easy on CopyrightSpot; his review has given us a great resource to help shape the product. We've already begun to incorporate some of his suggestions, and we'll make an announcement here when they're ready to use.

It's exciting that a solid base for plagiarism detection has been built into CopyrightSpot; attention can now turn to tweaking and enhancing the detection algorithm, and to building out more tools to help bloggers, authors and poets protect and manage their original content online.

Copyright News 2008-10-19

These are links I find useful and insightful about copyright and the Web: