Reduce the information output by a model (confidence values, input size, token limits), limiting an adversary's ability to extract information about the model and to optimize attacks against it. Recommended techniques include confidence rounding and gradient masking.

Explanation

Detailed model outputs, such as full probability vectors, high-precision confidence scores, and verbose error messages, act as a feedback channel for attackers. Query-based attacks use this feedback to reconstruct the model's decision boundary (model extraction) or to iteratively refine adversarial inputs. Coarsening or truncating those outputs shrinks the feedback channel while preserving the functionality legitimate users need.

How it works

Confidence rounding coarsens the probability scores returned with each prediction, for example by rounding to one or two decimal places or by returning only the top label, so that each query leaks far less signal about the model's internals. Gradient masking alters the model or its serving layer so that the gradients an attacker can estimate from its outputs are flat or misleading, which frustrates gradient-based adversarial optimization. Note that gradient masking is widely regarded as an incomplete defense on its own, since it can be bypassed by transfer attacks or gradient-free optimization, so it should be combined with other mitigations.

How to implement
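
As a minimal sketch of confidence rounding applied at the serving layer, the snippet below post-processes a classifier's softmax output before it is returned to the caller. The function name obfuscate_output and the policy choices (one decimal place of precision, top-1 only) are hypothetical examples for illustration, not part of any specific framework.

import numpy as np

# Hypothetical post-processing step for a model-serving endpoint.
# The precision (decimals=1) and cutoff (top_k=1) are example policy
# choices; tune them to balance usability against information leakage.
def obfuscate_output(probs: np.ndarray, decimals: int = 1, top_k: int = 1):
    """Return only the top-k labels with coarsely rounded confidences."""
    order = np.argsort(probs)[::-1][:top_k]     # indices of the top-k classes
    rounded = np.round(probs[order], decimals)  # coarsen the confidence values
    return [(int(i), float(c)) for i, c in zip(order, rounded)]

# Example: a four-class prediction collapses to a single rounded score,
# depriving query-based attacks of fine-grained per-class feedback.
probs = np.array([0.07, 0.812345, 0.09, 0.027655])
print(obfuscate_output(probs))  # [(1, 0.8)]

Applying this at the API boundary rather than inside the model keeps the change non-invasive: the model still computes full-precision probabilities internally, but callers, including adversarial ones, only ever see the truncated view.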