Goal

Create a stolen copy of a model.

Impact

Attackers can use a stolen model to stage further attacks or to steal critical intellectual property.

How do these attacks work?

A stolen model is created using extraction techniques that collect information via access to a target model or its underlying system. These techniques extract characteristics of the model such as its architecture, parameters, and hyper-parameters.
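
As a minimal illustration, the sketch below shows one common extraction pattern: the attacker queries a black-box prediction endpoint with synthetic probe inputs, labels those probes with the target's own outputs, and fits a surrogate model on the resulting pairs. The `query_target` function, the 32-dimensional input space, and the surrogate architecture are hypothetical placeholders, not details from any specific system.

```python
# Hedged sketch of model extraction via query access.
import numpy as np
from sklearn.neural_network import MLPClassifier

def query_target(x: np.ndarray) -> np.ndarray:
    """Hypothetical black-box API: returns the target model's
    predicted class labels for a batch of inputs."""
    raise NotImplementedError("replace with calls to the real prediction endpoint")

# 1. Probe the target with synthetic inputs drawn from its input domain
#    (assumed here to be 32 features in [0, 1]).
rng = np.random.default_rng(0)
probe_inputs = rng.uniform(0.0, 1.0, size=(5000, 32))

# 2. Label the probes with the target's own predictions.
stolen_labels = query_target(probe_inputs)

# 3. Fit a surrogate on the (input, target-prediction) pairs; the surrogate
#    approximates the target's decision function without any access to its
#    parameters or training data.
surrogate = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=300)
surrogate.fit(probe_inputs, stolen_labels)
```

The more faithfully the probe distribution matches the target's real input distribution, the closer the surrogate's behaviour tends to track the stolen model's.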

Example Threat Scenario

A military contractor has developed and released an aerial drone detection system that relies on machine learning algorithms to recognize drones in aerial images. The system processes input from cameras or LiDAR to detect the shape, size, and other drone-specific features, and returns an alert or a classification with a confidence score that a drone is present.

However, an attacking nation state begins adding small, strategically crafted stickers or patches onto its drones that appear to the human eye as innocuous, random patterns. These patches are computed by staging an image evasion attack against the model to learn which perturbations degrade its ability to detect drones. An attacker with access to the underlying target model, whether through exfiltration of the technology or the software, can use image evasion attacks to compute the minimal perturbations required to fool the model. These offline-computed perturbations transfer to the real world when applied to the objects the attacker wishes to mask from the target model, thereby tricking the aerial drone detection system.
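
To make the link between model theft and the scenario concrete, the sketch below shows a basic white-box evasion step, a fast-gradient-sign-method (FGSM) perturbation, that an attacker could compute offline against a stolen copy of the detector. `stolen_model`, the image tensor, and the label are hypothetical placeholders; a real physical-patch attack would additionally constrain the perturbation to a printable patch region rather than the whole image.

```python
# Hedged sketch: offline white-box evasion against a stolen model copy.
import torch
import torch.nn.functional as F

def fgsm_perturbation(stolen_model: torch.nn.Module,
                      image: torch.Tensor,
                      true_label: torch.Tensor,
                      epsilon: float = 0.03) -> torch.Tensor:
    """Compute a small perturbation that degrades the stolen model's
    confidence in the correct class (fast gradient sign method)."""
    image = image.clone().requires_grad_(True)
    # Loss of the correct classification under the stolen model.
    loss = F.cross_entropy(stolen_model(image), true_label)
    loss.backward()
    # Step in the direction that increases the loss, i.e. away from
    # the correct classification.
    return epsilon * image.grad.sign()

# The attacker tunes such perturbations offline against the stolen model,
# then physically realises them (e.g. as printed patches) in the hope
# that they transfer to the deployed system.
```

Because the attacker holds a full copy of the model, this computation needs no queries to the deployed system and leaves no trace in its logs.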

Remediation

Further Reading

MITRE ATLAS Technique AML.T0015