Techniques for reliably fooling AI machine-vision classifiers

The OpenAI researchers were intrigued by a claim that self-driving cars would be intrinsically hard to fool (tricking them into sudden braking maneuvers, say), because "they capture images from multiple scales, angles, perspectives, and the like."

So they created a set of image-perturbation techniques that reliably trick image classifiers, and showed that the resulting adversarial images keep fooling the models from different angles, at different scales, and under other transformations.
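The core trick is to optimize the perturbation against a whole distribution of random transformations rather than a single fixed view, so that on average it fools the classifier from every sampled angle and scale. Here's a minimal PyTorch sketch of that idea, not the authors' actual code: the transformation ranges, the step sizes, and the ImageNet class index 527 for "desktop computer" are illustrative assumptions.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models
import torchvision.transforms.functional as TF

model = models.inception_v3(weights="IMAGENET1K_V1").eval()
MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)  # standard ImageNet stats
STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
target = torch.tensor([527])  # assumed ImageNet index for "desktop computer"

def predict(img):
    """Normalize a [0, 1] image batch and return the model's logits."""
    return model((img - MEAN) / STD)

def random_view(img):
    """A random zoom and rotation standing in for camera viewpoint changes."""
    angle = float(torch.empty(1).uniform_(-15, 15))
    scale = float(torch.empty(1).uniform_(0.9, 1.1))
    return TF.affine(img, angle=angle, translate=[0, 0], scale=scale, shear=[0.0])

def robust_attack(x, steps=100, eps=8 / 255, lr=1 / 255, samples=10):
    """Find a small perturbation whose target label survives random views."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = sum(
            F.cross_entropy(predict(random_view((x + delta).clamp(0, 1))), target)
            for _ in range(samples)  # average the loss over sampled transformations
        ) / samples
        loss.backward()
        with torch.no_grad():
            delta -= lr * delta.grad.sign()  # step toward the target label
            delta.clamp_(-eps, eps)          # keep the change imperceptible
            delta.grad.zero_()
    return (x + delta).detach().clamp(0, 1)
```

Dropping random_view and setting samples=1 recovers the ordinary single-view attack, which is exactly the fragile kind the excerpt below describes.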


Out-of-the-box adversarial examples do fail under image transformations. Below, we show the same cat picture, adversarially perturbed to be incorrectly classified as a desktop computer by Inception v3 trained on ImageNet. A zoom factor of as little as 1.002 causes the classification probability for the correct label "tabby cat" to override the adversarial label "desktop computer."
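For contrast, here's roughly what that fragility looks like in code: a plain targeted attack optimized against only the unzoomed image, then re-checked after a 1.002x zoom. The "cat.jpg" path, the attack hyperparameters, and the class index are again illustrative assumptions, and exact probabilities will differ from the figures in the post.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models
import torchvision.transforms.functional as TF
from torchvision.io import read_image

model = models.inception_v3(weights="IMAGENET1K_V1").eval()
MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
target = torch.tensor([527])  # assumed ImageNet index for "desktop computer"

# Load and attack a single fixed view of the cat ("cat.jpg" is a placeholder path).
x = TF.resize(read_image("cat.jpg").float() / 255.0, [299, 299]).unsqueeze(0)
delta = torch.zeros_like(x, requires_grad=True)
for _ in range(40):  # plain targeted attack, no transformations in the loop
    logits = model(((x + delta).clamp(0, 1) - MEAN) / STD)
    F.cross_entropy(logits, target).backward()
    with torch.no_grad():
        delta -= (1 / 255) * delta.grad.sign()
        delta.clamp_(-8 / 255, 8 / 255)
        delta.grad.zero_()
adv = (x + delta).detach().clamp(0, 1)

# A barely perceptible zoom is enough to flip the prediction back.
zoomed = TF.affine(adv, angle=0.0, translate=[0, 0], scale=1.002, shear=[0.0])
with torch.no_grad():
    print(model((adv - MEAN) / STD).argmax(1))     # expect the adversarial label
    print(model((zoomed - MEAN) / STD).argmax(1))  # often reverts to the true label
```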


Robust Adversarial Examples
[Anish Athalye/OpenAI]


(via Four Short Links)