Using Creative Commons to Stop Scraping

An excellent article on PlagiarismToday.

As a blogger, I count feed scraping among my pet peeves. It irks me to no end that sploggers use automated tools to copy my copyrighted content from my site to sites that exist solely to attract clicks on AdSense and other ads.

Jonathan Bailey likely feels the same way. He writes about the topic regularly in his blog, providing well-researched and insightful commentary to help understand and fight the problem.

Here’s an excerpt from his recent article, “Using Creative Commons to Stop Scraping,” on PlagiarismToday:

Many sites, including this one, have expressed concerns that CC licenses may be encouraging or enabling scraping.

The problem seems to be straightforward. If a blog licenses all of their content under a CC license, then a scraper that follows the terms of said license is just as protected as a human copying one or two works….

However, after talking with Mike Linksvayer, the Vice President of Creative Commons, I’m relieved to say that is not the case. CC licenses have several built-in mechanisms that can prevent such abuse.

In fact, when one looks at the future of RSS, it is quite possible that using a CC license might provide better protection than using no license at all.

The article then goes on to explain what a Creative Commons license is and what it requires of the licensee. As Jonathan explains, the automation tools that sploggers use simply cannot meet all of the requirements of a CC license, thus putting the sploggers in clear violation of the license terms.

If you’ve been wondering about copyright as it applies to your blog or Web site, be sure to check out this article. While you’re at PlagiarismToday, poke around a bit. I think you’ll find plenty of other good material to help you understand copyright and what you can do when your rights are violated.

Digg and the HD DVD Key

A few thoughts about the recent goings on at Digg and elsewhere.

Last week, the hexadecimal key code that is used for copy protection on HD DVDs appeared in a blog. The key code is a string of 16 two-character hexadecimal values (digits and letters). If you spend more than an hour a day on the Web, you must have seen it by now. I won’t repeat it here because, frankly, I don’t have to. It’s easy enough to find online. Just Google HD DVD Key.
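For the curious, here is a minimal Python sketch of what a key of this shape looks like: 16 byte values, each written as two hexadecimal characters. The value below is a made-up placeholder (the bytes 0 through 15), not the key itself, and the variable names are mine.

```python
# Illustrative only: a made-up 16-byte value, NOT the actual HD DVD key.
sample = bytes(range(16))

# Write each byte as a two-character hexadecimal pair, separated by spaces.
pairs = " ".join(f"{b:02X}" for b in sample)

# Each byte is 8 bits, so 16 bytes make up a 128-bit number.
bits = len(sample) * 8

print(pairs)
print(bits)
```

Written this way, the value takes 32 hexadecimal characters, which is why it is short enough to fit in a post title.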

And that brings up the main point of this post: the so-called Streisand Effect. In 2003, Barbra Streisand sued a photographer who included a photo of her Malibu estate on the Web. He was doing an aerial photography research project about coastal erosion and the photo was one of hundreds that were published on the Web. In the publicity that followed, the photo was copied and reproduced thousands of times all over the Web. If Ms. Streisand had just kept quiet about the whole thing, it probably would have gone unnoticed. Instead, the information she wanted removed spread like a virus and received a huge amount of publicity, thus becoming far more widely known than she wanted.

And, of course, she had this effect named after her, which further brings up the subject (and photo links) every time someone else tries to suppress information on the Web.

That’s what happened with this HD DVD key. It appeared on a blog and someone dugg it. It soon got lots of diggs. The folks at Digg, acting on a cease and desist order (or a rumor that they were about to get one), decided to be proactive and remove the references on Digg. Digg users saw this as censorship and immediately went nuts, posting more blog articles and references to the offending key code, many of which used the code in the post title. When the Streisand Effect entry was updated on Wikipedia (yesterday, perhaps), the updater noted that there were currently more than 280,000 references to the code, as well as a song and multiple domain names with variations on the code.

Grant Robertson’s post on DownloadSquad.com, “HD DVD Key Fiasco is an Example of 21st Century Digital Revolt,” said it best:

As Joe Rogan’s character on Newsradio once quite accurately quipped, “Dude, you can’t take something off the Internet.. that’s like trying to take pee out of a swimming pool.” The content providers have attempted to do exactly that, remove pee from the proverbial swimming pool that is the Internet and, as we’ve witnessed so many times before, they’ve failed miserably.

If the AACS Licensing Authority had kept out of this, the code probably would have come and gone like most material on the Web: within a few days. Instead, the 16-byte number has become “the most famous number on the Web” and is everywhere. What’s worse is that while a week ago, only a few hackers might have known what to do with it to unlock or remove protection from HD DVDs, now it’s likely that someone will go through the bother of writing a software program that does the work for everyone. If that software isn’t already out, I expect it to appear any day now. And I’m sure its location will be dugg so everyone knows about it.

What can we, and others, learn from this? With the Web, nothing is private. If information can be known, it will be known on the Web. But it can remain obscure if, and only if, the owner of the information does nothing to hide it.

What should the AACS Licensing Authority have done? Quietly recall the key code and start using a new one. Or, better yet, just ignore the whole thing. Millions of people would not have known about it at all if AACS had done nothing.

But what this also brings to light is the public’s feelings about DRM. Consumers don’t want it. And now consumers are starting to fight back.

Web Site vs. Blog

What’s the Difference?

Today I got a phone call from our local newspaper’s “business advocate,” the guy who writes stories about business. He was researching an article about blogging and figured that I was the most active blogger (if not the only blogger) in town, so I might be able to provide some information about it. He wasn’t aware that I’d co-authored a book about WordPress blogging software (WordPress 2: Visual QuickStart Guide) in 2006 and he probably wasn’t aware that I wrote Putting Your Small Business on the Web back in 2000. He probably also doesn’t know that I’ve written four books about Web authoring software (various versions of PageMill, now defunct) and that I’ve been building and managing Web sites since 1994 (although I’m not crazy enough to do it for a living).

We stumbled a bit in our conversation. He referred to my Web site, wickenburg-az.com, as a blog. (The site has been around since 1999, predating the blogging phenomenon by at least 3 to 4 years.) I responded that it wasn’t a blog, that it was a Web site built with blogging software. And then he asked me what the difference was.

I had to think about it. What is the difference between a Web site and a blog?

They’re very much alike.

Let’s take a look at the similarities.

  • Web sites and blogs are both published on the Web and can be read with any Web browser. This gives them the same basic look and feel and similar user experiences. Web sites built with blogging software can look and feel just like a blog, even if that’s not what they are intended to be.
  • They depend on good, useful content. Web site visitors and blog readers come to read content. If the content is good and meets their needs, they’ll be back for more. If the content sucks, they won’t.

But they are different.

Of course, I needed to explain how they were different, not how they were the same. The response I came up with centered around the purpose of visitors coming to the site, but there are more differences.

  • Web site visitors come to a site to look for specific information. That information does not need to be new. It just needs to be what the visitor is looking for. For example, I visit the HP Web site when I need a new driver for one of my printers. I know it’ll be there and I don’t care if it’s been there for five years. People visit wickenburg-az.com to get basic information about Wickenburg: what it’s like, what to do there, etc. But blog readers visit or subscribe to blogs to get fresh information or insight on topics that are important to them. I read ProBlogger, for example, because it has timely articles that can help me understand how to be a better blogger. People visit aneclecticmind.com to read articles like this one about blogging, or other articles about flying, or even other articles about what it’s like to live in a place like Wickenburg — all from my point of view.
  • Blogs tend to be more opinion-based than Web sites. Sure, HP is going to tell you on their Web site that their printers are the best, but what would you expect? On my blog, I’ll tell you what I think about my HP printer and compare it to other printers I might own or have experience with. I’ll also tell you what I think of Apple Geniuses or local restaurants or life revolving around the Internet. (Although some locals might find this hard to believe, I keep most of my negative opinions of Wickenburg out of wickenburg-az.com. Most.) The opinion aspect makes blogs more personal than a Web site.
  • Blogs rely on fresh content. It’s commonly accepted that a blogger should post at least 3 to 5 new entries a week. Web sites, on the other hand, are more static and don’t require as much updating. Their visitors don’t expect it, either.

Does it matter?

Who knows? But it’s made me think about blogging a bit more than usual lately. And I’m sure it will lead to a few more articles here about what makes a blog a blog in the near future.

Computer Wait Speed

Maria Speaks Episode 34: Computer Wait Speed

My current computer woes remind me of something I heard long ago.

A long time ago (ten or more years, which is the Middle Ages in terms of the computing industry), computers were being marketed primarily on the basis of processor speed. Every time Intel or Motorola would come out with a new processor chip, members of the geeky set hurried to the stores to buy a new computer or upgrade that would bring their machines up to speed. It was then that I heard this rather curious statement:

All computers wait at the same speed.

The statement, of course, was meant to poke fun at computer users. At least that’s how I read it. Your computer could be the fastest in the world, but if you weren’t up to speed, all that extra fast processing power would be wasted. After all, each time a computer completes an instruction — whether it’s opening a dialog box, applying a font style change to some text, or matching e-mail addresses in your address book when you type into a field in a new e-mail message form — the computer faithfully waits…for you. As long as it has to. And while computer processors are getting ever faster, computer users are simply not keeping up.

Let Me Tell You About My Mom

All this reminds me of a sort of funny story. My mother, who has been using computers for nearly as long as I have, is not what you’d call a “power user.” She pretty much knows what her computer can do for her and she can usually make it do it. But she’s not the kind of person who pushes against the boundaries of what she knows very often. And when she’s working with her computer, she spends a lot of time making the computer wait while she thinks about what’s onscreen and how she needs to proceed. That isn’t a big deal — I’d say that 95% of computer users are like her. People react to what the computer does rather than anticipate what’ll come up next and have the next task prepared in their minds when the computer is ready to accept it. And all these computers are waiting at the same speed.

Anyway, for years, my Mom used dial-up Internet services. Most of us did. But as better alternatives came around and Web sites got ever more graphic-intensive, most of us updated our Internet connection technology to take advantage of cable or DSL or some other higher bandwidth connection. (I was literally the first (and only) kid on the block to get ISDN at my home. This was back in the days before cable and DSL Internet service. It cost me a fortune — heck, they had to dig a trench to lay new telephone lines to my house — but I simply could not tolerate busy signals, dropped carriers, and slow download speeds for my work. It operated at a whopping 128 Kbps and cost me $150/month. Ouchie!) My Mom, on the other hand, didn’t upgrade. She continued to surf the Internet through AOL on a dial-up connection, right into late 2006. Worse yet, she refused to get a second phone line, so she limited her Internet access or was impossible to get on the phone.

Let me take a little side trip here to discuss why her attitude wasn’t a bad thing at all. Personally, I believe we have too much dependence on the Internet. I recently read “I Survived My Internet Vacation” by Lore Sjöberg on wired.com, which takes a comic but all-too-real look at Internet withdrawal. If you’re the kind of person who uses the Internet to check the weather, look up vocabulary words, and find obscure information throughout each day without really needing that information, you owe it to yourself to read the piece. It really hit home for me. So in the case of my Mom, the fact that her Internet use was minimal wasn’t such a bad thing. At least not as far as I was concerned.

But it had gotten to the point with my Mom that she was spending more time waiting for her computer than her computer was waiting for her. And it had nothing to do with processor speed. It was her dial-up Internet connection that made it slow.

At first, I don’t think she understood this. I think that when she replaced her aging Macintosh with a PC about 2 years ago, she really expected everything to get faster. But the Internet got slower and slower for her, primarily because Web designers don’t design sites for dial-up connections. (Shame on them!) The Internet had become a tedious, frustrating place for her and she couldn’t understand why so many people were spending so much time using it.

In November 2006, I came for a visit. I had to look up something on the Internet and within 15 minutes, I was about to go mad. I asked her why she didn’t upgrade to a different service. Then she showed me a flyer that had come with her cable bill. We sat down with her phone bill and AOL bill and realized that she could upgrade to cable Internet service and actually save money. A little more research with her local phone company saved even more money.

So she was paying a premium to connect at 56Kbps or less.

I made a few phone calls and talked to people in the United States and India for her. I’ll be honest with you — the price difference between cable Internet and her local phone company’s Internet was minimal, but we went with the phone company because the person who answered the phone spoke English as her first language. (Subsequently, my Mom needed some tech support after I was gone and that person was in India. Sheesh.) The installation would happen the day after I left to go back to Arizona, but I was pretty confident that they would make everything work. And although it didn’t go as smoothly as we’d hoped, my Mom was soon cruising the ‘Net at normal DSL speeds.

In other words, wicked fast.

My Mom was floored by the difference. I’d told her it was much faster, but I didn’t tell her it was 100 times faster. And it’s always on — all she has to do is turn on the computer and she’s online! And she can even get phone calls while she’s on the Internet! Imagine all that!
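To put rough numbers on that difference, here is a back-of-the-envelope sketch in Python. The 56 Kbps and 128 Kbps figures come from this post; the 1 MB page size and the DSL rate (simply 100 times the dial-up rate) are assumptions for illustration, and real connections rarely reach their rated speeds.

```python
# Rough download-time comparison. The page size and the DSL rate are
# assumptions for illustration; only 56 Kbps and 128 Kbps come from the post.
PAGE_BYTES = 1_000_000  # assumed: a 1 MB graphics-heavy page

speeds_kbps = {
    "dial-up": 56,
    "ISDN": 128,
    "DSL (assumed, 100x dial-up)": 5600,
}


def seconds_to_download(size_bytes: int, kbps: float) -> float:
    """Ideal transfer time: bits to send divided by bits per second."""
    return (size_bytes * 8) / (kbps * 1000)


times = {
    name: seconds_to_download(PAGE_BYTES, kbps)
    for name, kbps in speeds_kbps.items()
}

for name, secs in times.items():
    print(f"{name}: {secs:.1f} seconds")
```

Under those assumptions, the page takes well over two minutes on dial-up, about a minute on ISDN, and under two seconds on DSL.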

The happy ending of this story is that my mother now spends a lot more time on the Internet. (I’m not sure how happy that is.) And of course, she’s now back to the situation where the computer is waiting for her.

Who’s Waiting for What in My Office

I reported a hard disk crash here about 9 days ago. I know it was 9 days because that’s how long I’ve been waiting for the data recovery software to churn through whatever is left of my hard disk. And although it’s still progressing, it’s slowed to a crawl. I think it’s teasing me. But I’ll get the last laugh — I’m pulling the plug today.

There comes a time when you simply can’t wait anymore. I think 9 days shows a great deal of patience on my part. I know I couldn’t have waited so long if I didn’t have other computers to work with. I did get some work done this past week. I wrote up the outline for my Mac OS X book revision for Leopard. I did a lot of e-mail, fixed up a bunch of Web sites, and wrote and submitted a bid for Flying M Air to dry cherries this summer in Washington State.

But what I did not do outweighed what I did do. I didn’t work on my Excel 2007 Visual QuickStart Guide. (I need the big monitor to do layout.) I did not pay my bills. (The latest versions of my Quicken data files are on the sick drive.) I didn’t update Flying M Air’s brochure. (Original files on the sick disk, need big monitor for layout.) The list does go on and on.

Now it’s time to get back to work. So I’ll pull the plug on the current data recovery attempt, put the hard disk in the freezer for a few hours, then reinstall it and try again by accessing the sick disk via FireWire from another computer. I can try multiple software solutions to fix the problem. And if that doesn’t work, I’ll take the long drive down to the nearest Genius and let them give the computer a checkup to make sure there’s no motherboard damage (again). If the motherboard is still fine, I’ll leave them the disk to play with, get a new disk to replace it, and get the hell back to work.

That’s the plan, anyway.

The Lost Painting

History that reads like a novel.

I saw The Lost Painting by Jonathan Harr in a copy of Bookmarks, a magazine full of book reviews. I added it to my Amazon.com wish list.

The book is an account of the finding of a painting by Caravaggio, a 17th-century Italian artist. The painting, called The Taking of Christ, was found in the 1990s by a restorer.

Evidently, many paintings from that time were lost: they’d be sold by the artist or a dealer to a wealthy patron or art collector to be hung in a home. Over the years, the paintings would be moved around, handed down to descendants, sold, and resold. The records regarding these paintings were not always complete, so paintings would disappear from the records and thus “disappear” from the art world. In some cases, a painting’s value would be understated and the painting, aged, dirty, and possibly damaged, would simply be discarded by an owner. Many masterpieces were lost this way.

The book tells the story of how two art history students stumbled upon some evidence that the painting had been sold to a Scotsman in 1802, who believed the painting was done by a different artist. The painting was then traced to an auction house where the trail went cold. Had it been sold? No one knew. And no one knew what had become of it.

The book is written like a novel, complete with dialog and some characterization. But all the characters are real people, many of whom were interviewed by the author during his research. This keeps the book from being a dry history tome. Instead, it has life and is quite interesting to the average reader.

The book was listed on the New York Times Book Review 10 Best Books of the Year for 2006. The edition I read included an epilogue by the author which covers the discovery of another version of the same painting.

I recommend the book to anyone interested in art, history, art history, or the process of searching for lost artwork.