By default, mindgard test runs a range of the attack techniques most appropriate for the AI system you are testing.

To run only a subset of attacks, either exclude the attacks you do not need or include only the ones you do. Both options also accept a category name, which excludes or includes every attack in that category.

List of attacks

The following is a list of available attacks with their associated categories.

Category                Attack name
jail_breaking           DevModeV2
jail_breaking           PERSONGPT
jail_breaking           EvilConfidant
jail_breaking           AntiGPT
jail_breaking           AsciiArtAttack
jail_breaking           skeleton_key
violation               PromptAlignment
violation               malgen
prompt_injection        Tag
prompt_injection        Base64
prompt_injection        Base64_DecodeAndAnswer
prompt_injection        Braille
prompt_injection        Base32
prompt_injection        Base16
prompt_injection        Ascii85
prompt_injection        EcojiEmoji
prompt_injection        MorseCode
prompt_injection        NatoPhoneticAlphabet
prompt_injection        Homoglyph
prompt_injection        Diacritics
prompt_injection        Rot13
prompt_injection        Hexadecimal
prompt_injection        CaeserCipher
prompt_injection        Cursed
prompt_injection        PigLatin
prompt_injection        ZeroWidthSpace
output_handling         Ansi_Raw
output_handling         Ansi_Escaped
meta_prompt_extraction  DynamicTest

Excluding attacks by name

mindgard test --exclude 'AntiGPT' --exclude 'PersonGPT'

The above command skips the AntiGPT and PersonGPT attacks and runs all of the remaining attacks.

Excluding categories

mindgard test --exclude 'jail_breaking'

The above command will exclude all attacks that are associated with the jail_breaking category.
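
When the exclusion list is long, typing one --exclude flag per entry gets tedious. The following is a minimal sketch of a helper that builds the command from a list of names. Only mindgard test and --exclude come from this page; the function name build_exclude_command and the sample names passed to it are illustrative assumptions. The helper prints the command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print a `mindgard test` command with one
# --exclude flag per argument. It only prints; it does not run mindgard.
# Quoting is naive and assumes names contain only letters, digits and
# underscores, as every name in the table above does.
build_exclude_command() {
  cmd="mindgard test"
  for name in "$@"; do
    cmd="$cmd --exclude '$name'"
  done
  printf '%s\n' "$cmd"
}

build_exclude_command AntiGPT DevModeV2
# prints: mindgard test --exclude 'AntiGPT' --exclude 'DevModeV2'
```

The printed line can be pasted into a terminal once it looks right.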

Including attacks

The --include option can be used to only run a specific set of attacks.

mindgard test --include 'Base32' --include 'Rot13'

The above command will only run the Base32 and Rot13 attacks and will exclude all other attacks.

The same can be done to include the attacks associated with one or more categories:

mindgard test --include 'output_handling' --include 'prompt_injection'

The above command will only run the attacks associated with the output_handling and prompt_injection categories.
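
This page does not say whether --include and --exclude can be combined in one command, so running "a whole category except one attack" is sketched here using --include by name only. The list of jail_breaking names is copied from the table above; the variable JAIL_BREAKING and the function build_include_except are hypothetical names introduced for illustration. As before, the command is printed rather than executed.

```shell
#!/bin/sh
# Attack names in the jail_breaking category, copied verbatim from the
# table above (including its spelling PERSONGPT; which casing the CLI
# expects is not stated on this page).
JAIL_BREAKING="DevModeV2 PERSONGPT EvilConfidant AntiGPT AsciiArtAttack skeleton_key"

# Hypothetical helper: print a `mindgard test` command that includes every
# name in $1 (a space-separated list) except the single name given in $2.
build_include_except() {
  cmd="mindgard test"
  for name in $1; do   # unquoted on purpose: split the list on spaces
    if [ "$name" != "$2" ]; then
      cmd="$cmd --include '$name'"
    fi
  done
  printf '%s\n' "$cmd"
}

build_include_except "$JAIL_BREAKING" AntiGPT
```

The design trades brevity for predictability: the resulting command relies only on the documented behaviour of --include with attack names.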