Attack Goal

Square Attack is an evasion attack that perturbs small, randomly placed square regions of an image, changing only those regions, with the aim of making the target model misclassify the image.

Example

Consider an attacker trying to fool a model that classifies images of animals. The attacker has a clear image of a “cat” that the model identifies correctly. The attack starts by placing small squares of noise at random positions on the image, keeping only the changes that lower the model’s confidence in the correct label. Over many queries, it finds positions and sizes of squares such that the image still looks like a cat to a human, yet the model labels it as something else, such as a “dog.”
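The search loop described above can be sketched roughly as follows. This is a simplified illustration, not the published algorithm: it uses a fixed square size and starts from the clean image, whereas the original attack also uses details such as a square-size schedule. The names `square_attack`, `make_toy_model`, `predict_proba`, `eps`, and `square_size` are hypothetical, and the "model" is a random linear classifier standing in for a real image classifier that only exposes class probabilities.

```python
import numpy as np

def make_toy_model(shape, n_classes=3, seed=1):
    """Hypothetical stand-in for the target model: a random linear softmax classifier."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(n_classes, int(np.prod(shape))))

    def predict_proba(x):
        logits = weights @ x.ravel()
        logits = logits - logits.max()  # numerical stability
        e = np.exp(logits)
        return e / e.sum()

    return predict_proba

def margin(predict_proba, x, true_label):
    """Probability of the true class minus the best other class (< 0 means misclassified)."""
    p = predict_proba(x)
    return p[true_label] - np.delete(p, true_label).max()

def square_attack(predict_proba, image, true_label,
                  eps=0.05, square_size=4, n_iters=500, seed=0):
    """Simplified square-shaped random search under an L-infinity budget of eps."""
    rng = np.random.default_rng(seed)
    h, w, c = image.shape
    best = image.copy()
    best_margin = margin(predict_proba, best, true_label)

    for _ in range(n_iters):
        if best_margin < 0:  # model no longer predicts the true label
            break
        # Pick a random square location.
        top = rng.integers(0, h - square_size + 1)
        left = rng.integers(0, w - square_size + 1)
        # One random sign per channel, applied to the whole square.
        delta = rng.choice([-eps, eps], size=(1, 1, c))
        candidate = best.copy()
        rows = slice(top, top + square_size)
        cols = slice(left, left + square_size)
        candidate[rows, cols, :] = np.clip(image[rows, cols, :] + delta, 0.0, 1.0)
        # Keep the change only if it hurts the model's confidence in the true label.
        m = margin(predict_proba, candidate, true_label)
        if m < best_margin:
            best, best_margin = candidate, m

    return best, best_margin

# Usage on a synthetic grey image (assumed 16x16 RGB with values in [0, 1]).
image = np.full((16, 16, 3), 0.5)
model = make_toy_model(image.shape)
label = int(np.argmax(model(image)))
clean_margin = margin(model, image, label)
adv, adv_margin = square_attack(model, image, label)
print("clean margin:", clean_margin, "adversarial margin:", adv_margin)
```

Because the attack only calls `predict_proba`, it needs no access to gradients or model internals, and every pixel stays within `eps` of the original, which is why the altered image can still look unchanged to a human.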

Remediation