Copying, plagiarism, and how to avoid duplication and copyright infringement

I recently had a prospective customer contact me and ask about my writing services. His organization wants to create some content and they have a few pieces from competitors that they really like. This prospect wondered if I could do something similar for his company but was quick to clarify that he didn’t want me copying the competitors’ work.

This is a common request among people hiring writers, and copying, duplication, and copyright are huge issues in the writing and marketing industry. They matter not just for legal reasons (i.e. copyright infringement) but also for search engine optimization reasons: search engines prefer original content to duplicate content.

In this blog post, I’m going to give you my thoughts and ideas but let me just make this disclaimer up-front: This blog post does not constitute legal advice. If you have any concerns about copyright and copyright infringement, you should talk to an attorney. This is by no means comprehensive or exhaustively authoritative. Instead, I’m simply writing about some of my observations and rules of thumb I’ve developed over the years, and I’m trying to portray the spectrum of the issue — both good and bad. I do not endorse all of the information I am writing about here!

Let’s start with a scenario: A customer has a document (we’ll call it a “source document”) and they want their own document (an “end document”). That document could be an ebook or a website or an article or whatever. It doesn’t matter. And they may do the work themselves or with a staff writer or a freelancer or whatever. Again, that doesn’t matter. The key point I want to cover in this scenario is how the source document influences the end document.

When you think of a document, it’s not just a collection of words. It has scope, tone, appearance, a message, and it is written for a specific audience. The more of those qualities that your source document and your end document share, the more likely it is that you are risking duplicate content and copyright infringement. (Again, that is just my own opinion, drawn from nearly 20 years as a writer, but the word choice, scope, tone, appearance, message, and audience characteristics have served me well as markers of content originality.)

So let’s look at the ways that the source document could possibly influence the end document…

WORD-FOR-WORD COPY

This is where someone basically opens the source document, presses Control+A to highlight all of the text, then copies the text and pastes it into the end document. Then they sign their own name as the author.

This is plagiarism and, if the source is protected by copyright, it is almost certainly copyright infringement too. There is very little argument here. The word choice, scope, tone, appearance, message, and audience are identical.

Unfortunately, it happens a lot because businesses produce so much content and it can be hard to police the issue (or, once you’ve found a culprit, to do anything about it). Trying to put myself in the shoes of a plagiarizer for a moment, I’m sure it’s tempting to copy and paste when you find something great, and especially when you compare the price of hiring a writer to write something original versus just spending 30 seconds to grab the content yourself. Plus, we live in an era where there is a lot of free stuff online anyway so the rules can seem a little blurry. I’m definitely not condoning it as a practice!

You can use tools like Copyscape to help you identify some of the times when it happens but no solution is perfect.

As someone who writes for clients for a living, and adheres to strict standards of originality, it drives me absolutely nuts that plagiarism is even an issue. I hate looking at requests for proposals from prospective clients and seeing “we will check your work against Copyscape”. I understand why they put that in their RFP and I hate that they have to do it. And I hate that there are people out there who call themselves writers but really only know how to press Control+A, Control+C, and Control+V. (Rant over)

WRAPPERS AND QUOTATIONS

This one has some good qualities to it and some bad qualities to it. So first I’ll describe it and then tell you what I think about it. Let’s consider our original scenario again: a business that has a source document and wants an end document. One way they can draw from the source document for their end document is to copy some of the source document content, paste it into their end document, and then “wrap” that content with original content.

If you ever wrote a paper in school, you probably did just that: You wrote your own content and then backed it up with research that you quoted from others. In essence, your content copying was “wrapped” with your own original content. But it happens outside of academia as well. I see it in blogs a lot — where a business will use some info grabbed from somewhere else and then write their own introduction and conclusion.

Whether this is a good practice or a bad practice depends on a couple of things:

  • Attribution: When I was studying for my MBA, we had to review a classmate’s paper, and I noticed that the tone and word choice in the paper switched back and forth a lot. So I did a bit of research and found that his source document was written by someone else… unfortunately, instead of quoting from the source document and attributing it appropriately, he tried to pass it off as his own work. (I reported him and he vanished from class; no big loss.)
  • Amount of content: The copyright and disclaimer sections of some books will list the amount of content you can copy, and when they do, they’ll often spell out the purposes for which you’re allowed to copy it. When it’s not clear how much content you’re allowed to copy, there may very well be laws that dictate the limit, but I don’t know what they are. In general, though, it’s hard to go wrong with some smallish quotes that, again, are properly attributed.
  • Access: This one gets overlooked a lot but I think it’s important. I think the amount of access that people have to a specific piece of content can also determine how much you can copy. If you are quoting from a book that is for sale, you shouldn’t use too much of it. If you are quoting from a popular article that is posted online, you may be able to use more. (Again, always attribute appropriately and check copyright restrictions.)

The word choice, scope, tone, appearance, message, and audience are going to be mostly different for your original content and obviously the same for the copied content.

I think the key idea here is whether or not you are passing stuff off as your own or revealing a ton of stuff contained in a source document that most people have to pay for.

IDEA-FOR-IDEA COPY

Word-for-word copy is bad. No question. One way to circumvent the copyright problem is the slightly cloudier method of copying idea-for-idea. I see this in a lot of requests for proposals from people who want an end document that is almost identical to the source document but who also want to avoid the legal hassles of copyright infringement. Their reasoning is that you can’t copyright ideas; you can only copyright how those ideas are expressed (i.e. the words).

Idea-for-idea copy is where you simply restate the idea of the source document so that the information remains almost exactly the same but the words are different. This can be done at various levels — you can do it at the word level, at the sentence level, at the paragraph level, or at the section level.

At the word level: Let’s say your source document uses a phrase like “financial representative” and you just search and replace every mention of financial representative with “investment advisor”. You find all the keywords and simply swap them out for synonyms.

At the sentence level: You just rewrite a sentence so the same information is communicated with different words. For example: “Buying your first home can be hard” can be switched to “It can be difficult to purchase a home if you have never done so before.”

… it’s the same no matter what “granularity” you use — whether restating sentences, paragraphs, or entire sections of a document.

Is this plagiarism? Well now it’s getting murky. Swapping out a couple of words for other words really is plagiarism even with the minor changes (and even if you don’t use all of the content). Remember: Word choice, scope, tone, appearance, message, and audience are going to be very similar.

It gets more complicated the more you change. At some point (and frankly I’m not sure what that point is), you move out of the realm of plagiarism and into territory where it is legal.

But is it ethical?

I’m not convinced.

Sure, your appearance might be slightly different (since you’re using different words and perhaps using different graphics and images) but everything else is nearly the same. Swapping out ideas for synonymous ideas doesn’t automatically change the scope or the tone. The message doesn’t change. And the audience hasn’t changed either.

The more granular your synonym swapping, the easier it is to spot. For example, if you are only swapping out the keywords for synonyms, it’s much easier to spot because many of the connecting words will still match the source document exactly. If you are restating larger portions (like paragraphs), it’s harder to use technology to find it but someone can probably do a visual side-by-side comparison. It’s still plagiarism because you are still stealing the fundamental concepts of the original document even if you are changing the text.
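To make that point concrete, here is a rough sketch in Python of why keyword swapping is easy for software to catch. It is only an illustration of the general idea (comparing overlapping three-word sequences), not a description of how Copyscape or any other real tool works. The function names and the sample sentences are my own inventions for this example.

```python
import re

def shingles(text, n=3):
    """Return the set of n-word sequences ("shingles") in text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap(source, candidate, n=3):
    """Share of word sequences the two texts have in common (0.0 to 1.0)."""
    a, b = shingles(source, n), shingles(candidate, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical sample text, made up for this illustration.
source = ("Talk to your financial representative before buying your first home "
          "because buying your first home can be hard.")

# Word-level swap: only the keyword changes, the connecting words stay put.
keyword_swap = ("Talk to your investment advisor before buying your first home "
                "because buying your first home can be hard.")

# Sentence-level restatement: same idea, almost no shared phrasing.
restated = ("It can be difficult to purchase a home if you have never done so "
            "before, so speak with an investment advisor early.")

print("keyword swap:", round(overlap(source, keyword_swap), 2))
print("restated:    ", round(overlap(source, restated), 2))
```

The keyword-swapped version still shares most of its three-word sequences with the source, so it scores high. The restated version shares almost none, which is why it takes a human reading the two side by side to notice that the ideas are the same.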

SOURCE DOCUMENTS AS RESEARCH

Another way that you can use source documents is to use them as research. That is, you review the content from your source document, along with other content, and you create something original. Yes, your end document probably covers some of the similar pieces that your source document covered but the information is yours.

When you use source documents as research, you have control over the word choice, scope, tone, appearance, message, and audience and you can make adjustments to those things as you go.

In my opinion, this is the very best way to create content and it is the way that I am paid by my clients to draw from source material. (Occasionally I will use wrappers and especially quotations but that’s really not how I get paid.) It’s the way that causes the fewest headaches and worries: you won’t be kept up at night wondering if an angry attorney is waiting to slap a lawsuit on you.

As much as possible, I urge you to use your source documents in this way.

SOME FINAL THOUGHTS

Want to know why Biblical quotes in television shows or movies are almost always from the (very dated) King James Version of the Bible? It’s because that document is copyright-free while most other Bibles are copyrighted translations owned by a publisher who will require permission before they can be used. So copyright-free documents (and the relatively new concept of “uncopyrighted” documents) make this even more challenging. There is also a growing “Creative Commons” licensing movement that is creating new guidelines around how to use different kinds of content. The issue gets even trickier when you consider PLR (Private Label Rights) content that you can purchase, which allows you to use the source material straight-up or with a specified amount of changes. It’s so complicated!

This is a spectrum — on the one side you have blatant copying; on the other side you have pure, original work. One is clearly wrong; the other is clearly right. But those aren’t your only two options. In the middle, it’s harder to navigate the murky waters. In my opinion, you can’t go wrong creating your own content and, when appropriate, quoting your source document and accurately attributing the quotes.