ART MIFace
Goal
MIFace is a white-box model inversion attack that attempts to reconstruct representative inputs for each class of the target model, thereby exposing information about the data used to train it.
Example
Consider an attacker with access to the target model, either in a white-box scenario (full access to model parameters and gradients) or a black-box scenario (API access to predictions only).

- The attack starts from an initial random-noise image or an image template.
- The initialized image is passed through the target model, and the gradient of the output with respect to the input is computed.
- A loss is computed from the model's output and the desired target class.
- The input image is updated along the gradient of this loss to increase the model's confidence for the target class.
- This is repeated for a set number of iterations or until a stopping criterion is met.
- The final optimized image is taken as the reconstructed representation of the chosen class from the target model.
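The loop above can be sketched in a few lines. This is not ART's implementation (ART ships the attack as a ready-made class); it is a self-contained NumPy illustration using a hypothetical toy softmax classifier with fixed weights standing in for the trained target model, and gradient descent on the cross-entropy loss to drive the input toward a chosen class:

```python
import numpy as np

# Hypothetical toy "target model": a softmax classifier with fixed
# weights W and bias b standing in for a trained network.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))   # 3 classes, 8 input features
b = rng.normal(size=3)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def predict(x):
    """Class probabilities for input x under the toy model."""
    return softmax(W @ x + b)

def invert_class(target, steps=200, lr=0.5):
    """Model inversion by gradient descent on the cross-entropy loss:
    start from random noise and update the input until the model is
    confident it belongs to `target`."""
    x = rng.normal(scale=0.1, size=8)   # initial noise "image"
    onehot = np.eye(3)[target]
    for _ in range(steps):
        p = predict(x)
        # For loss L = -log p[target], the input gradient is
        # dL/dx = W^T (p - onehot).
        grad_x = W.T @ (p - onehot)
        x -= lr * grad_x                # step toward higher confidence
    return x

x_rec = invert_class(target=1)
# Confidence for the target class should now be close to 1.0
print(predict(x_rec)[1])
```

The reconstructed `x_rec` is not a recovered training sample; it is an input the model strongly associates with the target class, which is what makes the attack informative about the training data.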