The study
Published January 2, 2024
Authors
Gamaleldin Elsayed and Michael Mozer
New research shows that even subtle changes to digital images, designed to confuse computer vision systems, can also affect human perception.
Computers and humans see the world differently. Our biological systems and the artificial ones in machines do not always pay attention to the same visual signals. Neural networks trained to classify images can be completely misled by subtle perturbations to those images that humans don't even notice.
That an AI system can be fooled by such adversarial images may point to a fundamental difference between human and machine cognition, but it also led us to explore whether humans might be sensitive to the same perturbations under similar test conditions. In a series of experiments published in Nature Communications, we found evidence that human judgments are indeed systematically influenced by adversarial perturbations.
Our findings highlight a similarity between human and machine vision, but also demonstrate the need for further research to understand how adversarial images affect people, as well as AI systems.
What is an adversarial image?
An adversarial image is one that has been subtly altered by a procedure that causes an AI model to misclassify the image's contents. This intentional deception is known as an adversarial attack. Attacks can be targeted, for example engineered so that an AI model classifies a vase as a cat, or they can be designed so that the model sees anything except a vase.
Left: An artificial neural network (ANN) correctly classifies the image as a vase. When perturbed by a seemingly random pattern across the entire image (middle; the perturbation intensity is magnified for illustration), the resulting image (right) is confidently misclassified as a cat.
And such attacks can be subtle. In a digital RGB image, each pixel value is on a scale of 0 to 255 representing the intensity of that pixel. An adversarial attack can be effective even if no pixel is modulated by more than two levels on that scale.
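To make the scale of such a perturbation concrete, here is a minimal NumPy sketch of an FGSM-style step whose per-pixel change is capped at two levels on the 0-to-255 scale. The gradient used here is a random stand-in for the gradient of a real model's target-class loss, so this illustrates only the magnitude constraint, not the specific attack method used in our paper.

```python
import numpy as np

def apply_linf_perturbation(image, gradient, epsilon=2 / 255):
    """Apply an L-infinity-bounded adversarial step (FGSM-style sketch).

    image:    float array in [0, 1], shape (H, W, 3)
    gradient: gradient of a target-class loss w.r.t. the image (stand-in here)
    epsilon:  maximum per-pixel change; 2/255 corresponds to no pixel
              being modulated by more than two levels on a 0-255 scale
    """
    perturbation = epsilon * np.sign(gradient)              # step in the direction that changes the loss
    adversarial = np.clip(image + perturbation, 0.0, 1.0)   # keep pixel values valid
    return adversarial

# Toy usage with a random "gradient" standing in for a real model's gradient.
rng = np.random.default_rng(0)
image = rng.random((224, 224, 3)).astype(np.float32)
fake_gradient = rng.normal(size=image.shape).astype(np.float32)
adv = apply_linf_perturbation(image, fake_gradient)
print(np.max(np.abs(adv - image)) * 255)  # at most 2 levels on the 0-255 scale
```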
Adversarial attacks can also succeed against physical objects in the real world, such as causing a stop sign to be misread as a speed limit sign. Indeed, security concerns have led researchers to investigate ways to resist adversarial attacks and mitigate their risks.
How is human cognition influenced by adversarial examples?
Previous studies have shown that people can be sensitive to large-magnitude image perturbations that provide clear shape cues. However, the effect of more subtle adversarial attacks is less well understood. Do people dismiss the perturbations in an image as innocuous, random noise, or can those perturbations influence human perception?
To find out, we performed controlled behavioral experiments. We started by taking a set of original images and carried out two adversarial attacks on each, producing many pairs of perturbed images. In the animated example below, the original image is classified as a "vase" by the model. The two images perturbed by adversarial attacks on the original are confidently misclassified by the model as the adversarial targets "cat" and "truck," respectively.
We then showed human participants the pair of pictures and asked a targeted question: "Which image is more cat-like?" Although neither image looks anything like a cat, participants were obliged to make a choice and typically reported feeling that they were choosing arbitrarily. If the brain were insensitive to subtle adversarial attacks, we would expect people to pick each picture 50% of the time on average. However, we found that the choice rate, which we refer to as the perceptual bias, was reliably above chance for a wide variety of perturbed image pairs, even when no pixel was adjusted by more than two levels on the 0-to-255 scale.
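The logic of this chance-level comparison can be illustrated with a simple one-sided binomial test. The counts below are hypothetical, invented only for illustration; they are not data from the paper.

```python
from scipy.stats import binomtest

# Hypothetical counts for illustration only (not the paper's data):
n_trials = 1000          # total forced-choice trials
n_target_choices = 540   # trials where the image perturbed toward "cat" was chosen

# If perception were insensitive to the perturbation, choices would sit at
# chance (p = 0.5). A one-sided binomial test asks whether the observed
# selection rate is reliably above chance.
result = binomtest(n_target_choices, n_trials, p=0.5, alternative="greater")
print(f"selection rate = {n_target_choices / n_trials:.3f}")
print(f"one-sided p-value vs. chance = {result.pvalue:.4f}")
```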
From the participant's perspective, it feels like they are being asked to distinguish between two virtually identical images. Yet the scientific literature is replete with evidence that people leverage weak perceptual signals when making choices, signals too weak for them to express confidence in or awareness of. In our example, we may see a vase, but some activity in the brain tells us there's a hint of a cat about it.
Left: Examples of pairs of adversarial images. The top pair of images are subtly perturbed, at a maximum magnitude of two levels on the 0-to-255 pixel scale, to cause a neural network to misclassify them as a "truck" and a "cat," respectively. A human volunteer is asked, "Which is more cat-like?" The bottom pair of images are more obviously manipulated, at a maximum magnitude of 16 levels, to be misclassified as a "chair" and a "sheep." The question this time is, "Which is more sheep-like?"
In our paper published in Nature Communications, we carried out a series of experiments that ruled out potential artifactual explanations of the phenomenon. In each experiment, participants reliably selected the adversarial image corresponding to the targeted question more than half the time. Although human vision is not as susceptible to adversarial perturbations as machine vision (the machine no longer identifies the original image class, whereas people still see it clearly), our work shows that these perturbations can nevertheless bias humans toward the decisions made by machines.
AI safety and the importance of security research
The key finding that human perception can be affected, albeit subtly, by adversarial images raises critical questions for AI safety and security research. By using formal experiments to explore the similarities and differences in the behavior of AI visual systems and human perception, we can leverage these insights to build safer AI systems.
For example, our findings could inform future research seeking to improve the robustness of computer vision models by better aligning them with human visual representations. Measuring human susceptibility to adversarial perturbations could help judge that alignment for a variety of computer vision architectures.
Our study also points to the need for further research into understanding the broader effects of technologies not only on machines, but also on humans. This, in turn, highlights the continuing importance of cognitive science and neuroscience for better understanding AI systems and their potential impacts as we focus on building safer, more secure systems.