When I complain about plagiarism, I hear the same thing over and over from other bloggers: “Nobody ever plagiarizes me. I guess I’m not that important.”
Really? So you’ve been checking to see if your stuff has been copied?
Exactly. It’s time to find out how.
Watch Your Trackbacks and Incoming Links
Odds are your blog posts will include links back to your own site at some point, like when you refer to your other posts. The quickest way to stay on top of this is to glance at the “Incoming Links” module in your WordPress dashboard:
In that screenshot, I can see that Steve Jones, Tom LaRock, Stacia Misner, Ted Kreuger, and “unknown” have all linked to my site recently. By glancing at that list, I can see that most of those are completely okay, but the “unknown” one gives me pause, so I’d click on that to make sure it’s a legit blog. On a side note, you should always monitor these anyway, click on all of the links, and read what people are saying about you.
Another built-in WordPress tool is the list of pingbacks. When people copy your work verbatim and publish it, their blog may try to send a pingback link alerting you. Go into your Comments list and filter it by pings only:
In that screenshot, I can see that Sean Gallardy has linked to my SQL Server checklist. I would want to click on that link to make sure it’s not an exact word-for-word copy of my own checklist, or another one of my blog posts that happened to link to my own checklist.
Set Up Free Google Alerts
Even if the plagiarist is smart enough to disable pingbacks, they probably won’t strip the links out of your blog posts. To catch those, I set up Google Alerts for real-time notifications; whenever Google runs across the word “BrentOzar.com” anywhere on the web, they send me an email. I can tell at a glance if it’s a plagiarized post, a forum question pointing to one of my articles, or a blog comment. I’ve set up similar alerts for sites I manage, my name, companies I work for, and so on.
When I’ve built a blog post I’m particularly proud of, I even set up Google Alerts for key phrases in the post. For example, in my SQL Server 2008 DAC Pack blog post, I used the phrase “Bringing Sexy DAC.” I can be fairly certain that phrase will not come up often, and if it shows up on the intertubes, somebody’s stealin’ my work. That phrase is a little down the page, beyond the first paragraph, so it shouldn’t show up if someone’s only showing the first few sentences of my post (which would be okay). I set up a Google Alert for that, and if anybody is automatically reposting my work, I get notified.
(Yes, I’ve deleted that Google Alert now because I know by saying this, I’m going to get a bunch of tweets saying “I’m Bringing Sexy DAC!” Heh. I love you people.)
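If you want to sanity-check a phrase before you set up an alert for it, here is a minimal Python sketch. It is not how Google Alerts works internally; it only illustrates the idea above: pick a phrase that sits beyond the first paragraph, then look for it in the text of a suspect page. The function names and the assumption that paragraphs are separated by blank lines are mine, purely for illustration.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line wrapping doesn't hide a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def is_good_watch_phrase(post: str, phrase: str) -> bool:
    """True if the phrase appears in the post, but not in its first paragraph.

    Assumes paragraphs are separated by blank lines (a hypothetical convention).
    """
    paragraphs = [p for p in post.split("\n\n") if p.strip()]
    if not paragraphs:
        return False
    first = normalize(paragraphs[0])
    rest = normalize(" ".join(paragraphs[1:]))
    needle = normalize(phrase)
    return needle in rest and needle not in first


def page_contains_phrase(page_text: str, phrase: str) -> bool:
    """True if the suspect page's text contains the watch phrase."""
    return normalize(phrase) in normalize(page_text)


post = (
    "Intro paragraph about DAC packs.\n\n"
    "Later on, we are Bringing Sexy DAC to the masses."
)
print(is_good_watch_phrase(post, "Bringing Sexy DAC"))
print(page_contains_phrase("blah  bringing\nsexy dac blah", "Bringing Sexy DAC"))
```

A phrase that shows up in the first paragraph fails the check, because a legitimate excerpt of the opening sentences would trigger the alert too.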
Monitor Your Referrers
If you’re using web analytics tracking to see how (un)popular your site is, it probably has a screen to show which sites are linking to you. In my favorite free web analytics tool, Google Analytics, it’s under Traffic Sources, Referring Sites:
Because the plagiarist may not be popular yet, you need to go through ALL of the referring sites, not just the top ten. The more popular you get, the more painful this gets, but on the plus side, you get a warm, fuzzy feeling seeing everybody linking to you in a good way.
I go through this list looking for sites I don’t recognize, then I drill into the analytics to find out exactly where in the site they’re coming from, and I click on it. Hopefully it’s not an exact copy of one of my posts that links to another one of my posts.
Use Tynt.com to Tweak Copy/Paste
This has to be one of the coolest tools I’ve ever seen. The easiest way to understand how it works is to see it in action. Go to any page on BrentOzar.com, select some text, copy it, and then paste it into a text editor:
SHAZAM. It doesn’t get much more obvious than that. I used to use more polite wording, but after being repeatedly plagiarized, I’m going with the big guns now.
Tynt even gives you a slick dashboard to show where your content is being pasted:
Search Manually with Copyscape.com
Finally, every now and then I go searching for copies of my recent posts with Copyscape.com. I put in a URL to a recent post (30-60 days old), and Copyscape goes hunting for similar copies. Their logic is pretty fuzzy, and it gives me a lot more misses than hits, but when it hits, it hits big time. It catches plagiarists who are smart enough to disable trackbacks, strip out your links, and even futz with your wording to try to make it look different.
This is how I caught CrazySQL initially, and how I found that BugoSQL was trying to hide some of my posts in disguised PDF files.
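I don’t know what Copyscape actually does under the hood, so treat the following as a generic illustration of fuzzy matching rather than their algorithm. This Python sketch breaks two texts into overlapping three-word "shingles" and compares the sets with Jaccard similarity, which still scores a copy fairly high even when somebody futzes with a few words. The function names and the shingle size of 3 are my own assumptions.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = "the quick brown fox jumps over the lazy dog"
tweaked = "the quick brown fox leaps over the lazy dog"
print(similarity(original, original))
print(similarity(original, tweaked))
print(similarity(original, "an entirely unrelated sentence about databases"))
```

An identical copy scores 1.0, unrelated text scores 0.0, and a lightly reworded copy lands in between, so in practice you would pick a threshold and review anything above it by hand.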
It’s a lot of work catching these diabolical bastards, and it’s like a never-ending game of Whack-a-Mole. I have to keep playing because I make a living off my content – it’s my marketing tool to bring in new consulting customers. This is especially important to me now that I’ve become a full-time consultant; I don’t get paid unless I’m working for a client. I’m not getting paid to write this, either, but I do it because I’m passionate about helping the community and helping bloggers protect their content.
Very useful information there Brent.
Tynt doesn’t work when you copy/paste from your email either. I get emails when you put out a new article and that means I don’t normally read your new articles on your site.
I don’t write blogs, so I don’t know how hard or time-consuming this would be, but maybe you could write an abstract that goes to people’s emails and then link to the full article. That would limit the information that is going out of your security measures.
Patrick – thanks, sir!
Michael – no, I don’t know if Tynt works with RSS, but I would doubt it. Probably depends on the reader.
Always learning something new here, Mr. Ozar. Thanks for the tip. Trying Tynt now.
I’m glad all the hard work tracking these evil plagiarists down is paying off.
Thanks too for showing us the tools everyday bloggers can use to protect their work and ideas. There is nothing worse than having somebody pass off your hard work as their own.
Very useful post. I have been using Tynt for some time and it is really good.
I could probably look in the fine manual, but off the top of your head, do you know if Tynt works with RSS feeds? Or with syndicated columns (e.g. inside SQLServerPedia)?
Excellent post. I always enjoy reading your stuff. Thank you for this information.
See! That’s why I don’t blog, LOL
Thanks for the very useful info.
Wow. Lots of good info here, I had no idea.
I hooked up my WordPress 3.0 blog with the Tynt plugin using your copyright idea, and started monitoring with Google Analytics.
I found one site that seems to copy my (and other people’s) entire posts and just add a header link to the original. Unsure what to do: oakleafblog.blogspot.com
You did some great work here, Brent. Thanks.
Mike – ouch. I used to read Oakleaf a lot when I was digging into cloud stuff, and I get the feeling they’re trying to do the right thing. I didn’t know they’d switched to publishing whole posts. I would recommend emailing him and asking him not to include your entire post.
This week’s perusing of your blog has been very interesting and eye-opening for me. I now appreciate how much the activity you’re railing against affects you (and me!). As one who has used the information you post to great and very positive effect in my work life, I want you to keep doing what you’re doing! You’re also very witty and I do enjoy your attempts at humor. Bringing Sexy DAC… wow, that was a good one 🙂
Will you be looking over your shoulder for Justin Timberlake to accuse you of plagiarism now? ;-P
I bow to thee!
– an ardent devotee
BugoSQL has copied a couple of PowerShell disk space scripts, the top one of which is mine!
The “Copy Right BugoSQL.Net” notice at the bottom made me chuckle.
Hi Brent –
I don’t think I’ve seen you comment on fair use copying of your content. I use a rule of 10% or less for quoting in a blog post. Have you discussed what you think is fair for a single post?
Karen – good question. In the footer of my site, I’ve got a link to my page on how to use material from my blog:
I’m okay with as much as 25% of my post, but only because I tend to write shorter posts. If I consistently wrote longer stuff, I’d crank that back to 10%.
Hmmm, I have just seen a blog by someone who got quite annoyed when they got extra text appended to the stuff they copied. I suppose it can be quite a shock if people are pasting handy code for their own use.
However, people have the right to protect their own stuff. I might just tone down the message a bit myself if I start using Tynt.