Vox "Blogger" Copies and Pastes

Another blatant case of copyright infringement.

I use Google Alerts to find articles that might interest me. Today, while going through a list of articles that came in earlier in the week, I found an article titled “Mac OS X Vs Windows Vista.” I clicked the link and was taken to a page on Vox, yet another blog-based social networking site. The blog entry began with the following brief introduction:

Doing my daily read of the news papers today and I came across a story asking which is the better OS, Windows Vista or Apple’s OS X. me I’m a mac users so I already know which is the better OS lol. Anyhow I’m sure you don’t want to read my one sided thoughts lol.

What followed that was a sloppy paraphrasing of the entire text of an article called “Vista versus Mac OS X” on Blogger.com. The Vox “author” had obviously copied and pasted the entire piece into the Vox-hosted blog, then edited selected sentences and added paragraph breaks to come up with a lengthy summary.

For example, the original says this:

On features alone it’s easy to conclude that Vista and Mac OS X are now on par but this overlooks two important elements. Firstly, the feel of both products is very different. In my opinion Mac OS X is unobtrusive and its interface intuitive and clean. Vista on the other hand makes you work for it. Take for example another new feature for Vista called User Account Control (UAC). UAC presents an intrusive dialogue box that warns you whenever you try to make a system wide change or install a new application. This will annoy most users however and you can just switch it off. But doing so overrides all of the new security measures Microsoft have built into Vista and makes the threat of infection from viruses or malware more likely. In contrast Mac OS X generally still remains virus and malware free.

And the Vox copy says this:

ON FEATURES alone it is easy to conclude that Vista and Mac OSX are on par, but this overlooks two important elements.

First, the feel of both products is different.

In my opinion Mac OSX is unobtrusive and its interface intuitive and clean. Vista, on the other hand, makes you work for it.

Take, for example, another new feature for Vista called User Account Control (UAC).

This presents an intrusive dialogue box that warns you whenever you try to make a system-wide change or install a new application.

This will annoy most users, however, and you can just switch it off. But doing so overrides all of the new security measures Microsoft has built into Vista and makes the threat of infection from viruses or malware more likely.

In contrast, Mac OSX generally still remains virus and malware free.

This is just one example. The entire piece was used this way.
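If you want to put a number on this kind of copying, a word-level comparison does the job. Here's a minimal Python sketch using the standard library's difflib. The function names and the choice to ignore case and punctuation are assumptions made for illustration, and the two strings are just the opening sentences of the excerpts above.

```python
import re
from difflib import SequenceMatcher


def words(text: str) -> list[str]:
    """Lowercase the text and keep only runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_ratio(a: str, b: str) -> float:
    """Return a 0.0-1.0 similarity score based on matching runs of words."""
    return SequenceMatcher(None, words(a), words(b)).ratio()


# Opening sentences of the two excerpts quoted above.
ORIGINAL = ("On features alone it's easy to conclude that Vista and Mac OS X "
            "are now on par but this overlooks two important elements. "
            "Firstly, the feel of both products is very different.")
COPY = ("ON FEATURES alone it is easy to conclude that Vista and Mac OSX "
        "are on par, but this overlooks two important elements. "
        "First, the feel of both products is different.")

if __name__ == "__main__":
    print(f"word overlap: {overlap_ratio(ORIGINAL, COPY):.2f}")
```

A score close to 1.0 means the second text is mostly the first one with light edits. Where to draw the line is still a judgment call; the score only tells you how much of the wording survived.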

Yes, the Vox blogger did link back to the original article. But why bother going there? All of the important points were already available on Vox.

And yes, the Vox blogger did include the name of the original post’s author. But did he have permission to use the entire article? I seriously doubt it. Was this “fair use”? I don’t think so.

As a writer, I find that copyright infringement pisses me off to no end. A writer takes time to think about and compose an original, well-thought-out work. Who knows? It may have taken the article’s author hours to write the piece. How long did it take this lazy blogger to copy and paste its text into his blog? 15 seconds?

Obviously, I reported it to Vox. And I reported it to the author of the original piece. And then I left a comment for the blogger to think about.

Maybe (lol) he just doesn’t know any better (lol). Maybe (lol) Vox will set things right and teach him a little lesson about copyrights (lol).

It’ll probably put him out of business. As the sample of his writing shown at the beginning of this entry indicates, he obviously doesn’t know how to write anything worth reading.

By the way, the original article, by Danny Gorog, is pretty good. If you’re interested in these matters, I highly recommend it. You can find it here.

May 28 Update: The copy-and-paste blogger has deleted the comment I left on his offending blog post. If he cared about writers’ rights, he would have deleted the entire post. I’m curious to see what Vox will do about this. Probably nothing.

Fighting Spam — All Kinds

How I deal with comment and pingback spam.

I start each morning pretty much the same way. I make myself a cup of coffee, make a scrambled egg for my parrot, and then sit down at the kitchen table and check the comments that came into my blog overnight.

About Spam

The main thing I’m checking for each morning is comment and pingback spam. These are similar but different.

  • Comment spam is a comment that exists solely to provide one or more links to another Web site, usually to promote that site or its services, but possibly to just get links to that site to improve Google rankings. Comment spam adds nothing to the site’s value. Sometimes it’s disguised as a guest book entry or general positive comment (for example, “Great blog! I’ll be back!” accompanied by a link or two), but it simply isn’t something the average blogger should want on his or her site.
  • Pingback spam is a comment that appears as a result of a link on another blog pinging your blog. Although many pingbacks are legitimate (as many comments are legitimate), there appears to be a rise in pingbacks as a result of feed scraping, which I’ve discussed here and here. Pingback spam is usually pretty easy to spot; the software that scrapes the feeds isn’t very creative, so the excerpt is usually an exact quote from what’s been scraped. Sometimes, oddly enough, the quote is from the copyright notice that appears at the bottom of every feed item originating from this site. Pingbacks automate the linking of your site to someone else’s; in the case of pingback spam, that someone is likely to be a splogger.

Lucky me: I get both.
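The "exact quote" giveaway described above is easy to check mechanically. The following is a minimal Python sketch, not a description of how Spam Karma or Bad Behavior work. The function names, the 40-character minimum, and the sample strings are all assumptions for illustration. It flags a pingback when its excerpt, with the surrounding [...] markers removed, appears word for word in text you wrote.

```python
import re

# Matches a leading or trailing "[...]" (or "[…]") marker on an excerpt.
ELLIPSIS = re.compile(r"^\[(?:\.{3}|…)\]\s*|\s*\[(?:\.{3}|…)\]$")


def normalize(text: str) -> str:
    """Collapse whitespace, lowercase, and drop [...] markers at the ends."""
    text = re.sub(r"\s+", " ", text).strip().lower()
    return ELLIPSIS.sub("", text)


def looks_scraped(excerpt: str, own_texts: list[str], min_chars: int = 40) -> bool:
    """True if the excerpt is a verbatim quote from any of your own texts.

    Very short excerpts are ignored because they match too easily.
    """
    needle = normalize(excerpt)
    if len(needle) < min_chars:
        return False
    return any(needle in normalize(text) for text in own_texts)


if __name__ == "__main__":
    my_posts = [
        "I start each morning pretty much the same way. I make myself a cup "
        "of coffee, make a scrambled egg for my parrot, and then sit down.",
    ]
    scraped = "[...] I make myself a cup of coffee, make a scrambled egg for my parrot [...]"
    honest = "A thoughtful response with my own take on fighting spam in blog comments."
    print(looks_scraped(scraped, my_posts), looks_scraped(honest, my_posts))
```

A legitimate blogger who quotes a full sentence would trip this check too, so it works better as a way to sort pingbacks for a closer look than as an automatic delete button.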

Tools to Fight Comment Spam

Fortunately, I use both Bad Behavior and Spam Karma 2 (many thanks again to Miraz for suggesting both of these), so the spam comments that get through their filters and are actually posted to the site are minimized. On a typical day, I might just have 3 to 5 of them. Compare that to 3,400 potential spam messages stopped by Bad Behavior in the past week and the 51,000 spam messages deleted after posting by Spam Karma in the past year since its installation. Without these two forms of protection, I’d be spending all day cleaning up spam.

Anyone who doesn’t use some kind of spam protection on a blog with open comments is, well, an idiot.

Neither program is very effective against pingback spam, although Spam Karma seems to be catching a few of them these days. Although I’m pretty sure I can set up WordPress to reject pingbacks, I like the idea of getting legitimate links from other blogs. It helps form a community. And it provides a service to my readers. For example, if I wrote an article about something and another blogger quoted my work and added his insight to it, his article might interest my readers. Having a link in my comments right to his related post is a good thing.

My Routine

So my morning routine consists of checking Spam Karma’s “Approved Comments” and marking the comments that are spam as spam. Then I go into WordPress’s Comments screen (Dashboard > Manage > Comments), mark the pingback spam as spam, and delete it.

Why do it both ways? Well, I’m concerned that if I keep telling Spam Karma that pingback spam is spam, it’ll think all pingbacks are spam. I don’t want it to do that. So I manually delete them. It only takes a minute or two, so it isn’t a big deal. If I had hundreds of these a day, I might do things differently.

The other reason I delete the pingbacks manually is that I want to check each site that’s pinging mine. I collect URLs of splogging sites and submit them periodically to Google. These sites violate Google’s Terms of Service and I’m hoping Google will either cancel their AdSense accounts or remove them from Google’s search indexing (or, preferably, both). So I send the links to Google and Google supposedly looks at them.
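Keeping a list like that tidy is simple to script. Here's a small Python sketch that groups collected URLs by host so each site shows up once. The function name and the example.com and example.net addresses are hypothetical, and the sketch assumes nothing about the format Google expects for reports.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_host(urls: list[str]) -> dict[str, list[str]]:
    """Group URLs by host name, treating "www.example.com" as "example.com".

    Entries without a recognizable host (for example, stray notes) are skipped.
    """
    hosts: dict[str, set[str]] = defaultdict(set)
    for url in urls:
        url = url.strip()
        host = urlsplit(url).hostname  # already lowercased; None if absent
        if host:
            hosts[host.removeprefix("www.")].add(url)
    return {host: sorted(found) for host, found in sorted(hosts.items())}


if __name__ == "__main__":
    collected = [
        "http://www.example.com/stolen-post-1",
        "http://example.com/stolen-post-2",
        "https://Example.net/feed-copy",
        "note to self: check this one later",
    ]
    for host, pages in group_by_host(collected).items():
        print(host, len(pages))
```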

I’m working on a project to make creating a DMCA notice easier — almost automated — and would love to hear from anyone working on a project like that.

This morning was quiet. Only three spams to kill: one comment spam and two pingback spams. I’ll get a few more spams during the day and kill them as they arrive; WordPress notifies me via e-mail of all comments and pingbacks as they are received. (I don’t check my e-mail at the breakfast table anymore.)

Do you have a special way to deal with comment or pingback spam? Don’t keep it a secret. Leave a Comment below.

Google, Adsense, and Splogging

Reports of cancelled accounts while sploggers earn money by scraping honest bloggers’ content are troubling.

Jim Mitchell lost his AdSense account and Google won’t tell him why. He’s bitter about it. But what makes him more bitter is that he’s discovered that sploggers with AdSense accounts have been using his content to earn revenue.

From Is Google AdSense Really Fair? on JimMitchell.org:

Today, I found four different sites that have scraped my content to use as their own with AdSense ads on the page. This, according to the Google AdSense Terms of Service, is a huge violation. I promptly reported the abuse with hopes the sploggers who lifted my content get their income generating plug pulled pronto.

One of the commenters to Jim’s post claims his AdSense account was also cancelled for no reason.

Now I’ve had no trouble with Google or AdSense and hope I never do. My earnings are meager, but they do cover the cost of hosting, which is my primary goal for including AdSense ads on this site. (That’s one of the reasons I don’t plaster the site with advertising like so many other bloggers do.)

But I do have a serious problem with sploggers, especially if they’re using AdSense or other advertising programs to earn money by illegally using the content written by other bloggers.

I know my content is scraped. Every once in a while, I’ll get a pingback from a sloppy splogger that directs me to his site. The site is full of scraped content and not much else. Most of the ones I’ve seen seem to be link farms for some other purpose. I don’t know enough about this stuff to understand why my content is being scraped when there don’t appear to be any ads on the site my content is appearing on. (Perhaps someone reading this can explain or include a link to a good explanation.) But if these sloppy sploggers are stealing content in a way that can be easily traced, how many other sploggers are stealing content in a way that can’t be easily traced?

And do they all have Google AdSense accounts?

Which brings up a good question: how does Google determine who qualifies for an AdSense account? Is there a human who actually looks at the sites? I seriously doubt that. So that makes me wonder how effective their software is at determining whether a site is legitimate — full of fresh, legally obtained content — or a ripoff of other bloggers’ hard work.

And that also brings up the question of the effectiveness of an Adwords account. I was using Adwords for Flying M Air in an effort to sell my multi-day excursions. While I’m no Adwords expert, I think I had it set up well. I know I was paying for a ton of hits. But I also know that my phone didn’t ring. While this might mean that people don’t want the service I’m offering (chances are, they get sticker shock when they see the price), it also might mean that the clicks aren’t being made by serious customers, or even by humans.

But it also means that my Adwords payments might be going to sploggers who have built sites to draw in visitors who then click on my link. I probably wouldn’t mind so much if they were buying — one sale would pay my Adwords bill for a year — but they’re not. So I could be paying, through my Adwords account, for sploggers to steal content from honest bloggers, some of whom, according to Jim Mitchell, have had their AdSense accounts yanked for reasons never explained.

I guess what I want to know is this:

  • Why does Google cancel the AdSense accounts for certain bloggers who claim they have done nothing wrong, then refuse to explain why they were cancelled?
  • How does Google ensure that AdSense accounts are given only to legitimate sites — and not to sploggers or other copyright violators?
  • How can Google Adwords customers be assured that their ads are appearing on legitimate sites and are being clicked by humans who are genuinely interested in the products or services advertised?

I hope Jim gets his AdSense account back. And I hope that other bloggers do their best to report feed scraping and splogging activities to Google or other ad sourcers whenever they find them.

Being a Responsible Blogger

With regular readers comes responsibility.

This morning, I noted that the feed for this blog has exceeded 100 subscribers. The 100 mark is a milestone for any blogger, and it’s no different for me — even though I’ve been at it for some time now.

I’ve been blogging for over three years and my blog doesn’t exactly follow all of the “rules” of blogging. I’m talking about the “stick to one topic” rule and “blog multiple times a day” rule. People say rules are meant to be broken, but that’s not why I break these rules. I just blog the way I want to blog and don’t really pay attention to the rules.

My Original Blog as a Separate Entity

My blog started out as a separate entity from my personal Web site, a way to share whatever I was thinking about or doing with people who might be interested. It was a personal journal, slightly filtered for the public. It was a way for me to record my life so I’d have something to look back on in the distant future. I didn’t care if anyone read it and was often surprised when someone I knew commented about something I’d written in my blog.

Back in those days, my blog wasn’t something I worked hard at; the entries just came out of me, like one-sided conversations with friends. Perhaps it has something to do with my solitary work habits — many people gather around the “water cooler” at work to trade stories about their weekends or opinions about world affairs. There’s no water cooler in my office and no co-workers to chat with. My blog may have been my outlet for all these pent-up stories.

Blog + Site = ?

A little over a year ago, I combined my blog with my personal Web site. I did it to make my life a little easier. I’d already decided to use WordPress as my Web site building tool. Why not just make my personal blog part of the site?

My Web site has been around in one form or another since 1994. I built it to experiment with Web publishing and soon expanded it to provide a sort of online résumé and support for my books. Support for my books often meant additional tips and longer articles about some of the software I’ve written about. This is fresh content of interest to people who use that software, even if they don’t buy or read my corresponding books. Since writing this content is relatively easy for me, I have no problem offering it free to anyone who wants it (as long as they don’t steal it and pass it off as their own; see my © page).

One of the great things about blogging software is that it automatically displays the newest content on the Home page and archives older content by category and date. In the old days, I’d have to manually create new pages for every article I wanted to put on my Web site and then add links to them. It was time consuming, to say the least. Sometimes too time consuming to share even the quickest little tip with visitors. So I didn’t publish very many articles. But the time-consuming, hand-coding aspect of my site is gone, and it takes just minutes to put any content online, whether it’s a link to an interesting podcast I just listened to about iPod microphones or a multi-part series of articles explaining how to use WordPress as a content management system.

What’s odd about the merging of the two sites is that my personal blog entries now commingle on the Home page with my book support entries. So these 100+ subscribers are seeing (and possibly reading) all kinds of stuff coming out of my head. (Now that’s a scary thought!)

My Responsibility

As my blog/site audience grows, my responsibility to provide good content for readers also grows.

The way I see it, when only a half-dozen people read my blog regularly, it was okay to bore them with stories about my horse eating corn cob stuff out of the bottom of my bird’s cage or rants about the quality of “news” coverage. Now, with over 100 regular readers, I need to think more about what would interest my audience and concentrate on producing the articles they want to read. (You can help me by voting on this poll.)

And that’s when blogging becomes work. That isn’t necessarily a bad thing, but it does take more effort on my part.

And it may push me far from the original purpose of my blog: a journal of my life. That’s something to think about, too.

The Other Blogs

I just want to take a moment here to comment on some of the other blogs I’ve seen out there. The vast majority of them are a complete waste of bandwidth. Some exist to echo the sentiments of others and show very little original thought. Others are complete blather, written in a style that makes me mourn for the failure of our educational system. Like chat room comments. Ugh. I don’t see why people waste their time writing this crap and really can’t see why people waste their time reading it.

But there is a small percentage of blogs that provide good, informative, or at least interesting content, written in a way that’s easy to read and understand. Those are the blogs that serious bloggers should be reading and learning from. Those are the blogs we should try to emulate, not by simply copying or linking to content, but by adding our own original material to the blogosphere.

That’s my goal and my responsibility as a blogger. If you’re a blogger, is it yours, too?