CAPTCHA tests are supposed to distinguish humans from bots, but an AI system mastered the problem after training on thousands of images of road scenes.
An artificial intelligence can solve, 100 per cent of the time, the CAPTCHA puzzles that websites use to tell human visitors from bots.
CAPTCHA tests try to sort humans from bots by asking users to identify objects in photos (Image: lilgrapher/Shutterstock)
Andreas Plesner at ETH Zurich in Switzerland and his colleagues fine-tuned an AI model called YOLO (You Only Look Once) to become an expert at solving the image-based challenges used to verify identities on websites. The particular type of CAPTCHA it tackled – reCAPTCHA v2, which was developed by Google – asks users to identify certain types of objects, such as traffic lights, among a set of images.
The limited range of objects, all situated within the context of a road, made it easier for YOLO to be trained to complete the tests. “The categories are very limited, so you have to select all images with a traffic light or with a crosswalk [for example],” says Plesner.
The researchers fed the model around 14,000 images, each paired with a corresponding label, to train it on what various road-based objects look like. Overall, reCAPTCHA v2 focuses on around 13 different types of object, such as cars, buses, bicycles and road crossings.
They then tested YOLO’s performance in a number of different situations, looking at factors such as whether it moved the mouse as a human might and whether the test device had a browser history and cookies. This is because Google’s bot-detection algorithms are believed to look at these factors – known as device fingerprinting – in addition to the answers given in the CAPTCHA challenge.
The AI succeeded at the tests 100 per cent of the time. This doesn’t mean it responded correctly to 100 per cent of the images it was shown. Rather, like a human, it could reject some and be offered alternatives until it passed. “I was fairly surprised that [CAPTCHA] was that vulnerable,” says Plesner.
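The gap between per-image accuracy and the overall pass rate can be illustrated with some simple arithmetic. The sketch below is not taken from the study: it assumes each challenge is passed independently with some fixed probability, and the 0.7 figure and the function name are hypothetical, chosen only to show how repeated attempts push the eventual pass rate towards 100 per cent.

```python
def eventual_success_probability(per_challenge_success: float, max_attempts: int) -> float:
    """Chance of passing at least one of `max_attempts` challenges.

    Assumes every attempt is independent and has the same chance of
    success, which is a simplification made for illustration only.
    """
    if not 0.0 <= per_challenge_success <= 1.0:
        raise ValueError("per_challenge_success must be between 0 and 1")
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # The solver fails overall only if every single attempt fails.
    return 1.0 - (1.0 - per_challenge_success) ** max_attempts


if __name__ == "__main__":
    # Hypothetical per-challenge pass rate; not a figure from the research.
    assumed_rate = 0.7
    for attempts in range(1, 6):
        chance = eventual_success_probability(assumed_rate, attempts)
        print(f"{attempts} attempt(s): {chance:.4f}")
```

Under these assumptions, a solver that is far from perfect on any single challenge will still almost always get through if it is allowed to keep asking for new ones, which is the same allowance human users receive.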
“We have a very large focus on helping our customers protect their users without showing visual challenges, which is why we launched reCAPTCHA v3 in 2018,” a Google Cloud spokesperson said in a statement. “Today, the majority of reCAPTCHA’s protections across 7 [million] sites globally are now completely invisible. The potential issues created by image recognition technology are not new, and we are continuously enhancing reCAPTCHA to deter abuse without creating friction for legitimate users.”
“The research is important not just in pointing out the AI success on the image recognition side, but also in highlighting some ways in which CAPTCHA system builders are seeking to mitigate that,” says Eerke Boiten at De Montfort University, UK.
Boiten worries that other elements, such as the device fingerprinting highlighted in the research, will become more important now it is clear that image recognition can be done by AIs at a level similar to humans – making it harder, “or maybe impossible”, to prove to CAPTCHA systems that we are human through our actions alone, he says.