Running a subset of attacks
By default, mindgard test runs a range of the most appropriate attack techniques for the AI system you are testing.
To run only a subset of attacks, either exclude the attacks you do not need or include only the ones you do. You can also exclude or include an entire category of attacks.
List of attacks
The following is a list of available attacks with their associated categories.
Category | Attack name |
---|---|
jail_breaking | DevModeV2 |
jail_breaking | PersonGPT |
jail_breaking | EvilConfidant |
jail_breaking | AntiGPT |
jail_breaking | AsciiArtAttack |
jail_breaking | skeleton_key |
violation | PromptAlignment |
violation | malgen |
prompt_injection | Tag |
prompt_injection | Base64 |
prompt_injection | Base64_DecodeAndAnswer |
prompt_injection | Braille |
prompt_injection | Base32 |
prompt_injection | Base16 |
prompt_injection | Ascii85 |
prompt_injection | EcojiEmoji |
prompt_injection | MorseCode |
prompt_injection | NatoPhoneticAlphabet |
prompt_injection | Homoglyph |
prompt_injection | Diacritics |
prompt_injection | Rot13 |
prompt_injection | Hexadecimal |
prompt_injection | CaeserCipher |
prompt_injection | Cursed |
prompt_injection | PigLatin |
prompt_injection | ZeroWidthSpace |
output_handling | Ansi_Raw |
output_handling | Ansi_Escaped |
meta_prompt_extraction | DynamicTest |
Excluding attacks by name
mindgard test --exclude 'AntiGPT' --exclude 'PersonGPT'
The above command skips the AntiGPT and PersonGPT attacks and runs all the others.
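When the list of names is long, the repeated flag can be generated rather than typed. The following is a minimal sketch, not part of the Mindgard CLI: build_flags is a hypothetical helper that only prints the command line, so it can be tried without mindgard installed. It assumes attack and category names contain no quotes or spaces, which holds for every name in the table above.

```shell
# Hypothetical helper (not part of the mindgard CLI): print a mindgard test
# command with one flag per name. The first argument is the flag to repeat;
# the remaining arguments are attack or category names.
build_flags() {
  flag=$1
  shift
  out=""
  for name in "$@"; do
    # Naive quoting: assumes names contain no single quotes or spaces.
    out="$out $flag '$name'"
  done
  printf '%s\n' "mindgard test$out"
}

# Prints the same command as shown above.
build_flags --exclude AntiGPT PersonGPT
```

Passing --include as the first argument produces the corresponding include form. The printed line can be reviewed before it is run.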
Excluding categories
mindgard test --exclude 'jail_breaking'
The above command excludes all attacks in the jail_breaking category.
Including attacks
The --include option can be used to run only a specific set of attacks.
mindgard test --include 'Base32' --include 'Rot13'
The above command runs only the Base32 and Rot13 attacks and excludes all others.
The same can be done to include all attacks associated with one or more categories:
mindgard test --include 'output_handling' --include 'prompt_injection'
The above command runs only the attacks in the output_handling and prompt_injection categories.
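To preview which attacks a category name covers before passing it to --include or --exclude, the table in "List of attacks" can be filtered locally. This is a rough sketch under stated assumptions: attacks_in_category is a hypothetical helper, and the rows embedded in it are a hand-copied subset of that table, not output from the mindgard CLI.

```shell
# Hypothetical helper: list the attack names that belong to one category.
# The rows below are a subset copied by hand from the table in this document.
attacks_in_category() {
  awk -F'|' -v want="$1" '
    # Strip padding around the category and attack-name columns.
    { gsub(/[ \t]/, "", $1); gsub(/[ \t]/, "", $2) }
    $1 == want { print $2 }
  ' <<'EOF'
violation | PromptAlignment |
violation | malgen |
output_handling | Ansi_Raw |
output_handling | Ansi_Escaped |
meta_prompt_extraction | DynamicTest |
EOF
}

# Lists the two output_handling attacks from the table, one per line.
attacks_in_category output_handling
```

Because the rows are copied by hand, the list needs to be kept in sync with the table if attacks are added or renamed.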