Monthly Archives: February 2006

CAPTCHA the Internet

CAPTCHA example

CAPTCHA (an acronym for “Completely Automated Public Turing test to tell Computers and Humans Apart”) has been on my mind ever since Phil Windley suggested a graphical CAPTCHA would make a good web service. I thought there might be those willing to pay to use it. Well, it’s been done.

There is a need for this type of test. Yahoo! and Hotmail use a CAPTCHA to stave off spammers when a user requests an email account. I suspect the most common use on other sites is an attempt to block automated comment spam on blogs.

CAPTCHA excludes legitimate users

As the W3C points out, graphical CAPTCHAs are a significant barrier to low-vision and blind users. Those with learning disabilities, such as dyslexia, may also be adversely affected. As visual CAPTCHAs become more sophisticated, their busy, patterned backgrounds become more of an issue for color-blind users.

The U.S. Census Bureau estimated that in 1997 about 7.7 million Americans had difficulty seeing the words and letters in an ordinary newspaper. The American Foundation for the Blind reported that about 5 in 1,000 Americans are legally blind, and gives a low estimate of 1.5 million visually impaired computer users. That’s a fairly significant potential market to ignore.

Requiring users to interpret a visual CAPTCHA may lead to legal challenges. Earlier this month, the National Federation of the Blind filed suit against Target, claiming target.com discriminates by not being accessible to visually impaired users.

Audio CAPTCHA

Some companies are experimenting with audio CAPTCHAs, spelling out random letters with random noise in the background. However, aural disabilities are more common than visual ones, so the approach isn’t really more accessible. Speech recognition software is also more advanced than character recognition, so the purported purpose of differentiating between humans and computers is not fulfilled anyway.

CAPTCHA is broken

Several projects to crack common visual CAPTCHA algorithms, particularly The CAPTCHA Project (by the Carnegie Mellon School of Computer Science), the UC Berkeley Computer Vision Group, and Sam Hocevar’s PWNtcha, have had good success. Howard Yeend demonstrated a vulnerability in several public algorithms where he could reuse a solution several thousand times after manually solving it once.
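The replay problem is the easiest of these to picture in code. Below is a minimal sketch of a single-use challenge store, assuming the flaw is simply that a server keeps accepting the same challenge and solution pair. The names `createChallengeStore`, `issue`, and `verify` are hypothetical and do not come from any of the projects mentioned above.

```javascript
// Hypothetical in-memory store of outstanding challenges.
function createChallengeStore() {
   var pending = {};  // challenge id -> expected solution

   return {
      // Record the solution the server expects for a freshly issued challenge.
      issue: function (id, solution) {
         pending[id] = solution;
      },
      // Check an attempt. The challenge is discarded on the first attempt,
      // right or wrong, so a solved challenge cannot be replayed.
      verify: function (id, attempt) {
         if (!Object.prototype.hasOwnProperty.call(pending, id)) {
            return false;
         }
         var expected = pending[id];
         delete pending[id];
         return attempt === expected;
      }
   };
}
```

With a store like this, a second submission of the same challenge id fails even when the answer is correct, which is the behavior the vulnerable implementations apparently lacked.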

Social engineering is often easier than fancy programming. The first widely recognized social engineering solution was “borrowing” CAPTCHAs from target sites and showing them at entry points to porn sites. Visitors to porn sites would solve the CAPTCHAs, allowing spammers to get essentially free labor. Amazon’s Mechanical Turk (tagline: “Artificial Artificial Intelligence”), which gives micro-payments for simple tasks, is an example of another way CAPTCHAs could be defeated. Even at a few cents per image, the cost may still be too high for spammers, but it is a demonstration that the process can be outsourced. After all, the world is flat.

What is the underlying purpose?

The real reason for CAPTCHA is to screen undesirables. For low traffic sites, it means preventing automated access. This can be accomplished in a relatively simple way: add a single required question to the comment submit form. Something like “What color was George Washington’s white horse?” or “Enter the fourth word in this sentence.” This is enough to make the form non-standard, thus unusable by generic bots. Bypassing this added security would be very easy for spammers; the advantage is the relative obscurity of most blogs. To target multiple blogs, a spammer would need to address each one individually, and individual attention is unlikely. So I suggest this method is the easiest for bloggers with a knowledge of web programming, and it is as accessible as a comment form without a CAPTCHA.
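A minimal sketch of such a check, using the second example question. The `CHALLENGE` object and the function names are hypothetical and not part of any blog package; the same comparison would need to run wherever the comment is actually accepted, since a check that runs only in the browser can be skipped.

```javascript
// Hypothetical challenge; the question and answer are illustrative only.
var CHALLENGE = {
   question: "Enter the fourth word in this sentence.",
   answer: "word"
};

// Normalize input so that "Word", " word " and "word." are all accepted.
function normalizeAnswer(text) {
   return String(text).toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Returns true when the submitted form field matches the expected answer.
function isChallengePassed(submitted) {
   return normalizeAnswer(submitted) === normalizeAnswer(CHALLENGE.answer);
}
```

A comment handler would call `isChallengePassed` with the value of the extra form field and reject the comment when it returns false. Normalizing the input keeps the question forgiving for real people without making it any easier for a generic bot.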

Major sites like Yahoo! and Google have a bigger problem. After all, they are targets because of both the value of their services and their size. When it first launched Gmail, Google limited accounts to those who had been invited by other active users. Initially there was a good bit of commotion in the tech community as gmail.com addresses became a sign of prestige. The invitation system allows Google to track which users may be abusing the service, and which users invited the abusers. Google has gone a step further, and now allows potential users to have an invitation code sent to their mobile phones. The number of accounts requested per phone number can be tracked. The small potential gain from a handful of throw-away email accounts, weighed against the cost of mobile phones (even disposable ones), is enough to deter spammers, because less troublesome alternatives exist.
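Google has not published how it tracks requests, so the following is only a sketch of the general idea of counting invitation codes per phone number. The limit, the `createInviteTracker` name, and the in-memory storage are all assumptions made for illustration.

```javascript
// Hypothetical tracker; a real service would persist the counts.
function createInviteTracker(limit) {
   var counts = {};  // normalized phone number -> codes requested so far

   return {
      // Returns true if a code may be sent, false once the limit is reached.
      request: function (phoneNumber) {
         // Keep digits only so "555-0100" and "(555) 0100" count as one number.
         var key = "p" + String(phoneNumber).replace(/\D/g, "");
         var used = counts[key] || 0;
         if (used >= limit) {
            return false;
         }
         counts[key] = used + 1;
         return true;
      }
   };
}
```

For example, a tracker created with `createInviteTracker(2)` would allow two requests from one number and refuse the third, while still serving a different number.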

If you look at Google’s account request page, you’ll see a CAPTCHA there. Google responsibly offers a way for users with disabilities to bypass the CAPTCHA, although it involves human-to-human interaction (and quite a bit more time) to complete—a costly alternative.

Real solutions

Several solutions to the problems with CAPTCHA have been proposed and debated. Most have major cost or accessibility problems.

It would seem the only good solution is some sort of federated identity system, which is really just offloading the trouble of user validation to someone else.


IE DOM Bugs

I’ve been working on a JavaScript project where it’s necessary to create input elements (radio buttons and checkboxes) dynamically. With a functional DOM, it takes only a couple of lines of code, and works fine in Firefox and Safari. Too bad IE isn’t as DOM-compatible as it claims to be.

After several searches, I discovered IE doesn’t allow the name attribute to be changed after the element is created—and it can’t be set in a DOM compatible way during creation.

Bennett McElwee suggested a solution in his blog that is nicely cross-browser: browsers other than IE throw an exception on the first attempt, and the element then gets created the standard way. (I suspect modifying the parent node’s innerHTML would work as well.)

function createElement(type, name) {
   var element = null;

   try {
      // First try the IE way; if this fails then use the standard way
      element = document.createElement('<'+type+' name="'+name+'">');
   } catch (e) {
      // Probably failed because we're not running on IE
   }
   if (!element) {
      element = document.createElement(type);
      element.name = name;
   }
   return element;
}
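The innerHTML idea mentioned in passing above might look roughly like the sketch below. It is untested on IE, and the helper names (`escapeAttribute`, `buildInputMarkup`, `replaceWithNamedInput`) are hypothetical. The sketch assumes the container holds nothing else worth keeping, because assigning innerHTML discards the existing children.

```javascript
// Escape characters that would break out of a double-quoted attribute value.
function escapeAttribute(value) {
   return String(value)
      .replace(/&/g, "&amp;")
      .replace(/"/g, "&quot;")
      .replace(/</g, "&lt;")
      .replace(/>/g, "&gt;");
}

// Build the markup for an input with its type and name set up front.
function buildInputMarkup(type, name) {
   return '<input type="' + escapeAttribute(type) +
          '" name="' + escapeAttribute(name) + '">';
}

// Replace the container's contents with a single named input and return it.
function replaceWithNamedInput(container, type, name) {
   container.innerHTML = buildInputMarkup(type, name);
   return container.firstChild;
}
```

Because the name is part of the markup the browser parses, nothing has to change the name attribute after creation, which is the step IE refuses.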

And your type attribute too

Another part of the project requires we transform checkboxes to radio buttons and hidden fields. This could be accomplished through a page reload, but it’s overkill for such a small change. Once again, in truly DOM-compliant browsers, this requires only a couple of lines of code:

element.setAttribute("type", "radio");

The MSDN reference for the type attribute says:

As of Microsoft Internet Explorer 5, the type property is read/write-once, but only when an input element is created with the createElement method and before it is added to the document.

QuirksMode has a bug report for this, complete with test page and workaround submitted by Stijn Peeters. Stijn admits the workaround needs a little bit of cleanup.

Essentially, his solution is to always remove the element, and recreate a modified one. (See the bug above!) Here’s my solution:

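try {
   // Standards-compliant browsers allow the type to be changed in place
   element.setAttribute("type", "radio");
} catch (e) {
   // IE: type is read/write-once, so build a replacement element instead
   var newElement = null;
   var nameStr = element.getAttribute("name");
   var valueStr = element.getAttribute("value");
   try {
      // First try the IE way, which sets type and name at creation time
      newElement = document.createElement("<input type=\"radio\" name=\"" + nameStr + "\">");
   } catch (e2) {
      // Not IE after all; fall through to the standard way
   }
   if (!newElement) {
      newElement = document.createElement("input");
      newElement.setAttribute("type", "radio");
      newElement.setAttribute("name", nameStr);
   }
   // Carry the value over only if the original element had one
   if (valueStr) {
      newElement.setAttribute("value", valueStr);
   }
   element.parentNode.replaceChild(newElement, element);
}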

Update:

Aaron over at easy-reader.net encountered the same problem a few months before I did. His solution is similar, and the comments there are good. If only I had found it sooner!


Safari “Debug” menu

While trying to track down a JavaScript bug in Safari, I was lamenting the lack of a JavaScript console and DOM explorer. Then I found a tip on MacOSXHints.com explaining how to enable the Debug menu. From the Terminal command line, enter the following:

% defaults write com.apple.Safari IncludeDebugMenu 1
