Beyond CAPTCHA: No Bots Allowed!

Alternatives to CAPTCHA

The purpose of CAPTCHA systems is to protect resources from bots while allowing access to humans, but they do neither of those things reliably.

On the other hand, anyone who's used such a system on a high-traffic site knows that they do make a difference. Abandoning them increases the volume of unwanted traffic, sometimes to an unmanageable extent.

Clearly there's a need for something. So what are the alternatives to CAPTCHA?

Non-linguistic Visual Tests

Tests that use images other than words may be generally easier for users, since all they have to do is comprehend an undistorted picture, rather than decode distorted language. A prominent (and, I believe, pioneering) example of this is KittenAuth:

KittenAuth test

The system shows you a set of nine images, three of which are kittens. You have to identify the three kittens in order to pass authentication.

Although the failure rate for regular humans may be lower, and comprehension by people with cognitive disabilities may be better, these tests still let down users who are blind or partially sighted. They also require a basic level of knowledge -- you have to know what a kitten looks like. It's easy to take that much for granted, but it remains a highly cultural assumption; you might know, but can you be absolutely sure that all of your users do?
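To get a feel for how a nine-image test like the one described above holds up against blind guessing, here's a rough calculation. It assumes the bot picks three distinct images uniformly at random and that the order of selection doesn't matter; the function name is made up for illustration, and nothing here reflects how KittenAuth itself is implemented.

```python
from math import comb


def blind_guess_odds(total_images: int, targets: int) -> float:
    """Chance of hitting exactly the target images with one random pick.

    Assumes an unordered selection of `targets` distinct images out of
    `total_images`, all equally likely (an illustrative assumption).
    """
    return 1 / comb(total_images, targets)


combinations = comb(9, 3)          # ways to choose 3 images from 9
odds = blind_guess_odds(9, 3)
print(f"1 in {combinations} ({odds:.2%})")  # prints "1 in 84 (1.19%)"
```

One guess in 84 sounds small, but a bot that can retry freely will get through soon enough, so a test like this still depends on limiting repeat attempts.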

This idea has also been taken to more frivolous places, such as a system based on the somewhat dubious "hot or not" tests, shown here.

The Hot-or-Not test

Some may find that version funny, others may find it offensive. Either way, it's no use as a genuine authentication system. The answers are arbitrary, and in any case, they can be mined programmatically from the Hot-or-Not web site!

Audio Tests

An alternative to a visual CAPTCHA test is an audio test, where a series of words or letters is spoken out loud and offered to users as an audio file; this audio is also overlaid with distortion of some kind, in the same attempt to prevent programmatic decoding.

However, such tests have the same kind of issues as visual CAPTCHAs. They solve the visual issue, sure, but they do so by introducing another, equally problematic barrier. People who are deaf-blind, who work in a noisy environment, who lack the necessary hardware for sound output, or who are unable to understand the sound due to a cognitive disability or a language barrier are no better supported than with a conventional visual test.

Also, audio tests are just as vulnerable to being cracked by suitably motivated bot programmers as visual ones.

Logical or Semantic Puzzles

Eric Meyer's Gatekeeper plugin for WordPress works by asking a simple question, framed in such a way as to make it extremely difficult for machines to understand while blatantly obvious to humans. Would you get this one?

A simple question

Other questions might be "What color is an orange?" or "How many sides has a triangle?"
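As a rough illustration of how a question-based check of this kind might work, here's a minimal sketch. It is not Gatekeeper's actual code: the question pool, the accepted answers, and the function names are all assumptions made up for this example.

```python
import random

# Hypothetical question pool: each question maps to the set of answers
# we are prepared to accept (already lower-cased).
QUESTIONS = {
    "What color is an orange?": {"orange"},
    "How many sides has a triangle?": {"3", "three"},
}


def pick_question(rng=random) -> str:
    """Choose a question at random from the pool."""
    return rng.choice(list(QUESTIONS))


def is_correct(question: str, answer: str) -> bool:
    """Compare a freeform answer against the accepted answers.

    Whitespace, letter case, and trailing punctuation are ignored so
    that " Three. " is treated the same as "three".
    """
    normalized = answer.strip().lower().rstrip(".!?")
    return normalized in QUESTIONS.get(question, set())


question = pick_question()
print(question)
print(is_correct("How many sides has a triangle?", " Three. "))  # True
```

Even in this tiny form, the design problem is visible: every accepted spelling of an answer has to be anticipated in advance, and the whole pool is only as strong as its size.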

The Achilles heel of this system is its scope. It has a limited number of questions and answers and is therefore vulnerable to brute-force attack. That problem can be reduced -- but not solved completely -- by using flood control (preventing a single user from making multiple attempts within a certain timeframe) and by ensuring that the selection of questions is large and frequently changed.
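The flood control mentioned above could be sketched along these lines. This is one possible approach (a sliding time window per client), not a description of any particular plugin; the class name, the limits, and the idea of keying on a client identifier such as an IP address are all assumptions for the example.

```python
import time
from collections import defaultdict, deque


class FloodControl:
    """Allow at most `max_attempts` per client within `window_seconds`."""

    def __init__(self, max_attempts=3, window_seconds=60.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._attempts = defaultdict(deque)  # client id -> attempt times

    def allow(self, client_id: str) -> bool:
        """Record an attempt and report whether it should be permitted."""
        now = self.clock()
        recent = self._attempts[client_id]
        # Forget attempts that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_attempts:
            return False
        recent.append(now)
        return True


limiter = FloodControl(max_attempts=3, window_seconds=60.0)
for attempt in range(4):
    print(attempt + 1, limiter.allow("203.0.113.7"))  # fourth attempt is refused
```

Note that this only slows down a single client. A bot that spreads its attempts across many addresses sidesteps the limit, which is why flood control reduces the problem without solving it.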

But the system is also underpinned by assumptions of knowledge. Ideally, the questions should be so simple that a child could answer them easily -- as is certainly the case in this example. But for every question, we still have to assume that any human can answer it, which may not be true, especially when you factor cognitive disability or language barriers into the equation.

And as a system such as this proliferates, it may become increasingly difficult to think of good questions. We might end up resorting to jokes!

Multiple-choice joke test

Unfortunately, a system based on multiple choices like this would be very weak, because with three options simple guesswork would produce a crack rate of 33%. Yet if we allowed freeform answers to a question like that, there's far too much of an assumed-knowledge overhead -- the user would have to recognize the joke, and then give an answer that the system can comprehend as correct.
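The arithmetic behind that figure is easy to check, and it gets worse when a bot is allowed to retry. The sketch below assumes three equally likely options and a fresh, independent guess on each attempt; the function name is invented for the example.

```python
def crack_probability(choices: int, attempts: int) -> float:
    """Chance that random guessing succeeds at least once.

    Assumes each attempt is an independent guess among `choices`
    equally likely options.
    """
    return 1 - (1 - 1 / choices) ** attempts


for attempts in (1, 3, 5):
    print(attempts, f"{crack_probability(3, attempts):.1%}")
# 1 33.3%
# 3 70.4%
# 5 86.8%
```

So even a generous limit of five tries would let random guessing through most of the time.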

Individual Authentication

For the highest level of security, individual authentication is always required. To log in to online banking, pay a credit-card bill, or vote, the system needs to know not just that you're a human, but that you're a specific human.

This kind of authentication could be harnessed to provide a lower level of certainty in more general applications, where your specific identity is not required -- only the fact that you're a person.

The simplest approach here is to require users to register before being able to comment, post, or add content to a site. This certainly reduces the amount of casual spam that a system might get, but it does nothing to put off a determined spammer who's prepared to take the time to create an account.

It's not difficult to find large numbers of people prepared to do this kind of work for next to nothing, given the wide range of living costs across the world economy. It would be trivially cheap for a spammer in a rich country to pay people in a poor country to do this kind of work all day.

Centralized Sign-on

A system of centralized sign-on can mitigate the potential for abuse by putting all the onus on a single system to authenticate users once, and then give them free rein thereafter.

Systems such as Microsoft Passport offer this kind of centralization; however, they also create significant privacy questions, as you have to be prepared to trust your personal data to a single, commercial entity (quite apart from the fact that Passport uses CAPTCHA authentication!).

A more promising alternative has recently begun to gain traction, in the form of OpenID. The OpenID system avoids privacy issues because it isn't limited to a single authentication provider -- you can pick and choose, and change at any time, who you trust to hold your authentication information. This information in turn is not revealed to the site you're visiting; therefore, it offers a convenient means of centralized authentication without the attendant privacy issues.

The weak point of the system is how you obtain an OpenID in the first place, since some form of authentication is going to be required there. Simply having an OpenID is not enough to prove that you're a legitimate user, so the onus would end up being on individual sites or OpenID providers to police the use of OpenID; for example, by banning OpenIDs that are known to belong to spammers. This in itself could end up being a minefield of disputes.

OpenID is a good idea, and is bound to catch on, but in itself does not address the issue at hand any better than individual authentication.
