Of Captchas, Gimpys and BaffleText …

By Paladion

November 15, 2004

Automated computer programs, or bots, can hit your web site repeatedly and execute thousands of requests a minute. These bots can mount brute-force attacks to break passwords, automate registrations, fake a large volume of support queries, and more. If you haven’t protected your site against them yet, you might want to evaluate the options. In this article, we look at the state of the art in foiling bots.

Anti-bot technology bets that bots will stumble at tasks humans do easily. It involves a new breed of Turing tests to distinguish real people from intelligent computer programs. In the traditional Turing test, a human judge puts questions to a machine and to a human being and tries to tell them apart by analyzing their answers. While the traditional Turing test relies on a person to differentiate between a human being and a computer, the anti-bot tests make a computer differentiate between a machine and a human! Such a test is thus known as a Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA).

GIMPYs

Two popular CAPTCHAs developed at Carnegie Mellon University are EZ-GIMPY and GIMPY. Both are based on the difficulty of reading distorted text. EZ-GIMPY presents a single word from its dictionary to the user as a distorted image. GIMPY selects seven words from a dictionary and renders a distorted image containing the words.

GIMPY then presents a test that consists of the distorted image and the directions: “type three words appearing in the image.” Given the types of distortions that both GIMPYs use, most humans can read the words from the distorted image, but current computer programs cannot.
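The pick-seven, type-three logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Carnegie Mellon implementation: the word list, the function names `new_challenge` and `check_answer`, and the rule that the user must type exactly three distinct words are all hypothetical, and rendering the distorted image is left out entirely.

```python
import secrets

# Hypothetical stand-in for the GIMPY dictionary; the article does not give the real word list.
WORDS = ["apple", "bridge", "candle", "dragon", "engine", "forest",
         "garden", "harbor", "island", "jacket", "kettle", "ladder"]

WORDS_SHOWN = 7      # words rendered into the distorted image
WORDS_REQUIRED = 3   # words the user is asked to type back


def new_challenge(words=WORDS, count=WORDS_SHOWN):
    """Pick `count` distinct words using an OS-backed random source."""
    rng = secrets.SystemRandom()
    return rng.sample(words, count)


def check_answer(shown, typed, required=WORDS_REQUIRED):
    """Pass only if the user typed exactly `required` distinct words,
    all of which appear in the image (assumed rule; case-insensitive)."""
    shown_set = {w.lower() for w in shown}
    typed_set = {w.strip().lower() for w in typed if w.strip()}
    return len(typed_set) == required and typed_set <= shown_set
```

The server would keep the list returned by `new_challenge` in the session, send only the rendered image to the browser, and call `check_answer` on the submitted words. Requiring exactly three words is a design choice in this sketch: it stops a bot from passing by submitting the whole dictionary.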

The majority of CAPTCHAs used today, including the ones popularized by Yahoo and Hotmail to prevent automated registrations and password-guessing attacks, are similar to GIMPY. They rely on the difficulty of optical character recognition, that is, on how hard it is for a program to read distorted text. A more complex version of GIMPY superimposes eight words over one another and adds similar distortions.

BaffleText

Scientists at the Palo Alto Research Center have designed a new breed of CAPTCHA called BaffleText that follows the same approach as GIMPY but distorts the image much more than GIMPY does.

To degrade the image, the researchers print it out and scan it back in, or apply a technique called thresholding: converting the image from color to black and white and back again. This changes gray levels and adds random noise to the image. The image deteriorates until pattern recognition systems fail. Further, unlike the earlier GIMPY, BaffleText uses only nonsense words.
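To make the thresholding step concrete, here is a minimal Python sketch that operates on a grayscale image stored as a list of rows of 0-255 values. It is not the Palo Alto Research Center pipeline: the cutoff of 128, the `flip_prob` parameter, and the use of random pixel flips as a stand-in for print-and-scan noise are all assumptions made for illustration.

```python
import random


def threshold(pixels, cutoff=128):
    """Map each gray level (0-255) to pure black (0) or pure white (255)."""
    return [[255 if p >= cutoff else 0 for p in row] for row in pixels]


def add_noise(pixels, flip_prob=0.05, rng=None):
    """Invert each pixel of a black-and-white image with probability `flip_prob`.
    This is an assumed noise model, standing in for print-and-scan degradation."""
    rng = rng or random.Random()
    return [[255 - p if rng.random() < flip_prob else p for p in row]
            for row in pixels]


def degrade(pixels, cutoff=128, flip_prob=0.05, rng=None):
    """Threshold the image, then sprinkle random noise over the result."""
    return add_noise(threshold(pixels, cutoff), flip_prob, rng)
```

Calling `degrade` repeatedly with a higher `flip_prob` deteriorates the image further. How far one can go before humans also fail to read the text is something that would have to be measured with real users.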

On the flip side, GIMPYs and BaffleText make browsing more difficult for the visually impaired. The standard screen-reading programs they use to read a Web site cannot read CAPTCHAs. A work-around could be a CAPTCHA that relies on the difference between humans and computers in their ability to recognize speech. That CAPTCHA could pick a word or a sequence of numbers randomly and render it into a distorted sound clip for the user to recognize. In the coming months you can expect to see more CAPTCHAs that make life tougher for the bots, yet keep the Internet simple for humans.

