Attack Goal

The Boundary Attack is an evasion attack that crafts minimal perturbations intended to make the target classifier misclassify a given image. The goal is to produce perturbed images whose modifications are imperceptible to a human viewer but drastically reduce model performance.

Example

The attack starts from a sample that the model already misclassifies (e.g. random noise) and iteratively refines it with small steps toward the original, correctly classified image, accepting only changes that keep the sample misclassified. The result is an image that looks like the original to a human but still receives the wrong label. A sketch of this refinement loop is shown below.
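
The following is a minimal, simplified sketch of that loop, not the full algorithm: it assumes a hypothetical stand-in classifier (`toy_classifier`), and the helper name `boundary_attack` and the fixed step sizes `orth_step` / `toward_step` are illustrative. A real attack would query the deployed model for its decisions and adapt the step sizes as it converges.

```python
import numpy as np

def toy_classifier(x):
    # Hypothetical stand-in model: predicts class 1 when the mean pixel
    # intensity exceeds 0.5, otherwise class 0.
    return int(x.mean() > 0.5)

def boundary_attack(original, classifier, adversarial_label, steps=1000,
                    orth_step=0.1, toward_step=0.1, rng=None):
    """Simplified Boundary Attack sketch.

    Starts from random noise already classified as `adversarial_label`
    and repeatedly proposes small perturbations that move the sample
    closer to `original` while keeping the misclassification.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Find an initial misclassified point (random noise with the wrong label).
    adversarial = rng.uniform(0.0, 1.0, size=original.shape)
    while classifier(adversarial) != adversarial_label:
        adversarial = rng.uniform(0.0, 1.0, size=original.shape)

    for _ in range(steps):
        # 1. Random perturbation scaled to the current distance from the original.
        distance = np.linalg.norm(original - adversarial)
        noise = rng.normal(size=original.shape)
        noise *= orth_step * distance / (np.linalg.norm(noise) + 1e-12)
        candidate = np.clip(adversarial + noise, 0.0, 1.0)

        # 2. Small step toward the original image.
        candidate = np.clip(candidate + toward_step * (original - candidate), 0.0, 1.0)

        # 3. Accept the candidate only if it stays misclassified and gets closer.
        if (classifier(candidate) == adversarial_label
                and np.linalg.norm(original - candidate) < distance):
            adversarial = candidate

    return adversarial

# Usage: a dark "original" image labelled 0; the attack yields a nearby
# image that the toy classifier still labels 1.
original = np.full((8, 8), 0.2)
adv = boundary_attack(original, toy_classifier, adversarial_label=1, steps=500)
print(toy_classifier(adv), np.linalg.norm(adv - original))
```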

Remediation